Introduction |
This programming assignment is designed to ensure that you know how to write
code that processes trees, by implementing more efficient versions of the
collection classes that we have been studying.
First, we will write a priority queue where the add/remove methods are both
O(Log N) instead of one being O(1) and one O(N).
If we add/remove N values in such a priority queue, the complexity
is O(N Log N) instead of O(N^2): this gives us a way to sort in
time O(N Log N).
Second, we will write a map where the get/put methods are both O(Log N) (for a well balanced tree: most random BSTs are well balanced) instead of each being O(N) as in the array implementation (and worst case for pathological trees). When writing your classes, you will run each against my JUnit tests to verify (a bit too strong of a word here) that it is correct. You may also find it useful to test your classes with the DriverForOrderedCollection or DriverForMap, which you can run from collection.jar once you build a path to this library in the project. With this driver, you can individually test any methods in your classes interactively, and see their results (returned values and state changes, mostly using toString). Download and unzip the following Eclipse project Start and use it to start working on this program. For each part of this assignmnment you will update and turn in a single .java file in the project (see the Checkmate submission for this assignment for more details). Only one student should submit the assignment, but both student's names should appear in the comments at the top of each submitted program. Please turn in each program as you finish it, so that I can accurately assess the progress of the class as a whole during this assignment. Print this document and carefully read it, marking any parts that contain important detailed information that you find (for review while you work on this assignment and before you turn in the files). |
Problem #1: Priority Queue via Max-Heap |
In this program you will write a more efficient version of a priority queue, by
implementing it as a max-heap (maximum value at the root).
You can test this program with TestPriorityQueue, a
JUnit test (how we will ultimately grade it), and with a Supermarket
shopping simulation.
With the DriverForOrderedCollection program, you can individually test
any methods in your classes interactively, and see their results (returned
values and state changes, mostly using toString).
The HeapPriorityQueue class must implement the PriorityQueue interface using standard priority queue semantics (any-in/highest priority out). You have already implemented this behavior using linked lists. Here, the underlying data structure is a heap, which uses an array to represent a tree compactly (so it has more similarities to the priority queue implemented by an array than the one implemented by a linked list). In this representation, the index of a child node can be computed from the index of its parent node, and vice versa, by the formulas discussed in class and reviewed below. So, we should think "tree" but program "array". Because the main data structure is an array (even though we think "tree"), you should examine the code in the ArrayPriorityQueue class. Pay special attention to the toString, shallowCopy, ensureCapacity, and trimToSize methods. Here are some issues to think about when writing your class.
Finally, try running the supermarket simulation program with the ArrayPriorityQueue and your working HeapPriorityQueue. Choose 2 checkout lines: one with the default number of times (effectively infinite) and one with 6 or fewer items. Run the simulation on 5 shoppers with tracing set to true to help you understand what is happening; then run it with 50,000 or 100,000 shoppers. With 50,000 shoppers, my speedup using the HeapPriorityQueue over the ArrayPriorityQueue was from 24 seconds to 1.27 seconds. |
Problem #2: Map via BST |
In this program you will write a more efficient version of a map, by
implementing it as a binary search tree.
We have seen that for all but pathological trees (and randomly built trees are
much closer to being well-balanced than pathological), most operations
operations have a complexity of O(Log N); we can say O(tree-height) to
play it safe.
You can test this program with TestMap, a JUnit test (how we will ultimately grade it), and with an updated version of the WordGenerator program (included in the download) from Program #1 part 6, on the file huck.txt (also included in the download). With the DriverForMap program, you can individually test any methods in your classes interactively, and see their results (returned values and state changes, mostly using toString). I will supply a toString method that uses recursion to return a String that prints as a tree rotated 90 degress to the left, so you can more easily see the structure of the tree you are building. Before running the JUnit test, you must get the iterator's hasNext and next methods working correctly. Otherwise many tests will fail, because many tests use the iterator indirectly: e.g., they construct sets of values by iterating through maps; if the iterator works incorrectly, these sets will not be constructed correctly, and the JUnit test will fail for this reason, not because the operations really being tested are incorrect. The BSTMap class must implement the Map interface using standard map semantics. You should look at my ArrayMap class, which contains code that implements the same Map interface, some of which (or code like it) will belong in the BST implementation. Here, the underlying data structure is a binary search tree, whose node class is complete within the implementation, which also includes some helper methods like locate, add, and remove; these are either completley written or just their header is specified. Here are some issues to think about when writing your class.
You should also run the updated WordGenerator program, with the large text file (huck.txt) The update allows you to "comment" out code to select either the ArrayMap or BSTMap (and its appropriate comparator): it is that simple to change which data structure implements the data type. It also tracks and prints the amount of time needed to build the map: on my machine it took 18 minutes using ArrayMap and 1 second using BSTMap (a speed-up of a factor of over 1,000), when I used an order statistic of 3. Both ultimately producing a map that contains 90,705 keys with each key mapping to a list of between 1 and 46 words. |