IUP Computer Science
COSC 310    Fall 07

Project #3
Data Patterns
(Due 10 October 07)

Write a program to analyze binary data files to find occurrences of a user-specified pattern in the data.  Two data files are provided for you: apple.dat and sassafras.dat.  Both files are available on the I:\ drive in I:\jlwolfe\310\f07.  Each file consists of thousands of sensor readings; each reading is a single byte holding a value in the range 0 to 9.  Your program is to prompt the person at the keyboard for a data pattern to search for.  The person might enter 2815, for example, indicating that the program is to search for the value 2 followed by 8 followed by 1 followed by 5, in consecutive file positions.  The program must report the starting position of every occurrence of this pattern in the file.  Naturally, the positions in the data file are numbered starting at 0.
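As a starting point, here is a minimal sketch of one way to load a data file into an ArrayList. It assumes each byte holds the raw binary value 0 to 9 (not the ASCII character '0' to '9'), which matches the description above. The class and method names (DataLoader, loadData, toList) are only illustrative; you are free to organize your program differently.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;

class DataLoader {
    // Reads the whole file and returns one list element per byte.
    // Assumption: the file is small enough to read into memory at once.
    static ArrayList<Integer> loadData(String fileName) throws IOException {
        byte[] raw = Files.readAllBytes(Paths.get(fileName));
        return toList(raw);
    }

    // Copies raw bytes into an ArrayList of Integer values.
    static ArrayList<Integer> toList(byte[] raw) {
        ArrayList<Integer> data = new ArrayList<>(raw.length);
        for (byte b : raw) {
            // Mask to treat the byte as unsigned; readings 0..9 are unaffected.
            data.add(b & 0xFF);
        }
        return data;
    }
}
```

A call such as DataLoader.loadData("apple.dat") would then give a list in which index i is file position i.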

Requirements:

  1. While searching for the pattern, the program must represent the file data in an ArrayList and must represent the pattern being searched for in an ArrayList.
  2. When the user enters the pattern to search for, s/he must enter it as if it were a single number.  For example, it must be 2815, not 2 8 1 5.  I recommend reading this pattern as a string, then converting it to an ArrayList of values that can match the binary data one digit at a time.
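The string-to-ArrayList conversion recommended in requirement 2 might look something like the sketch below. The names PatternParser and toDigits are hypothetical, and rejecting non-digit input with an exception is just one possible choice. Reading the input as a string rather than as an int also keeps leading zeros, so a pattern like 0101 survives intact.

```java
import java.util.ArrayList;

class PatternParser {
    // Converts a digit string such as "2815" into the list [2, 8, 1, 5].
    static ArrayList<Integer> toDigits(String text) {
        ArrayList<Integer> pattern = new ArrayList<>();
        for (char c : text.trim().toCharArray()) {
            if (c < '0' || c > '9') {
                throw new IllegalArgumentException("Pattern must contain digits only: " + text);
            }
            // Subtracting '0' turns the character into its numeric value.
            pattern.add(c - '0');
        }
        return pattern;
    }
}
```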

After you have your program working, do an algorithm analysis of the search you are using to find the patterns.  Determine the order-of (Big-oh) computing time for your algorithm.  I would expect that Big-oh would involve two variables: the number of data values in the file and the number of values in the pattern you are searching for.  Justify the Big-oh value that you get (i.e., give a written argument; you may write it on the printout).
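To make the analysis concrete, here is a sketch of one straightforward search: try every starting position and compare the pattern element by element. This is only one possible approach, not necessarily the one you should use, and the names PatternSearch and findAll are illustrative. With n data values and m pattern values, the outer loop runs at most n - m + 1 times and the inner loop makes at most m comparisons per start, which is the kind of two-variable count your written argument should account for.

```java
import java.util.ArrayList;
import java.util.List;

class PatternSearch {
    // Returns the starting index of every occurrence of pattern in data,
    // including occurrences that overlap one another.
    static ArrayList<Integer> findAll(List<Integer> data, List<Integer> pattern) {
        ArrayList<Integer> positions = new ArrayList<>();
        int n = data.size();
        int m = pattern.size();
        if (m == 0) {
            return positions; // Assumption: an empty pattern matches nothing.
        }
        for (int start = 0; start + m <= n; start++) {
            int k = 0;
            // Advance while the data agrees with the pattern.
            while (k < m && data.get(start + k).equals(pattern.get(k))) {
                k++;
            }
            if (k == m) {
                positions.add(start);
            }
        }
        return positions;
    }
}
```

Because the search restarts at every position, overlapping matches are reported separately, consistent with the sample results for 888 on apple.dat, where positions 4751 and 4752 both appear.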

Two or three days before the project is due, I will give you the patterns that I want you to search for in each file.  You are to collect the output from the console showing where your program found the patterns I specify and print that output.  If you feel the need to see what is in the binary files, I recommend opening them with Visual Studio.  It will display them in hexadecimal, but because each value is a single byte in the range 0 to 9, it should be easy to read.  Here are some sample results for a few simple patterns on the file apple.dat:

Found 888 at positions  4751  4752  4985  5871  6070  6850  9328
Found 6789 at positions  350  9761
Found 0101 at positions  2562  9975
Found 221242470141214 at positions  9469
123456 not found in data
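
If you want your console output to resemble the sample lines above, a small helper along these lines would do it. This is only a sketch: the names ReportFormatter and format are hypothetical, and the exact spacing is copied from the samples rather than required.

```java
import java.util.List;

class ReportFormatter {
    // Builds one report line for a pattern, given the positions where it was found.
    static String format(String patternText, List<Integer> positions) {
        if (positions.isEmpty()) {
            return patternText + " not found in data";
        }
        StringBuilder line = new StringBuilder("Found " + patternText + " at positions");
        for (int p : positions) {
            // Two spaces before each position, as in the sample output.
            line.append("  ").append(p);
        }
        return line.toString();
    }
}
```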

Hand in a printout of your well-documented program, and a printout of the search results (captured from the screen). Also, hand in your Big-oh analysis and result on paper.  You must create a folder named p3 under the folder named after you on the P: drive for COSC 310.  Copy into p3 all .java files that you created for this project.