IUP Computer Science
COSC 310      Fall 2006

Project #6
(Consultant's Opinion)
(Due 1 December 2006)

A dispute has arisen related to the best approach to use in checking the credit-card transactions for frequent activity at the central clearinghouse.  As we know from project #5, the in-house analysts have recommended using a radix sort with string keys (credit card numbers) and then finding adjacent duplicates in the sorted list.  But, the CIO brought in a consultant who recommends a different approach.  The consultant wants to do a tree sort based on long integer keys (credit card numbers) and check for duplicates during the sorting.  The consultant claims that even without balancing the tree, this approach will be faster.  The in-house analysts say that even with relatively random keys, the tree will be skewed enough to make it slower than using the radix sort.  So, ...

Your first task is to read the card-events.txt file and store the transactions in a binary search tree, BST (thus, performing the principal work of a tree sort).  Because the nodes of a BST should have unique keys, if a duplicate key is read from the file, it will be found in the BST.  The program can then save the information about the two transactions to create a report about the duplicate.  The program must not put the transaction with the duplicate key into the BST.

I am requiring you to use the SearchTree interface and the BinaryTree and BinarySearchTree classes that are given in the book.  These classes do not attempt to balance the tree.  You will have to add a couple of unimplemented interface methods to the BinarySearchTree class; and you will have to define a compareTo method for the transaction class.  Otherwise, everything to do the sort is already in these classes.  You can get the SearchTree interface and the two classes from the   I:\jlwolfe\310   folder.

Your program is to produce the same duplicates report as was produced for project #5.  Once you have the program working, you are to measure the actual execution time (from the start of file reading to the end of the duplicates report).  You are also required to measure the actual execution time for project #5 (doing the same actions) so the two approaches can be compared.  You may use the System.currentTimeMillis() method to get the current time in milliseconds to measure the elapsed execution time.

Your final requirement is to measure the height of the binary tree to see whether the analysts were right about the tree being skewed.  You will need to add at least one method to the BinarySearchTree class to accomplish this.  There are a little more than 100000 transactions in card-events.txt  so a perfectly balanced tree would have a height of about 17.  The tree can probably be regarded as skewed if the height is more than three times this.

Hand in a well-documented printout of your program and the captured output showing the duplicates report, the execution time and the height of the binary tree.  Also, hand in a printout of the duplicates report with the execution time for project 5.  Copy the java files for this project to a project 6 folder under you name in the COSC 310 portion of the P:\ drive.