Assignment 3 Binary Search Trees Mount Royal University Winter 2022 COMP 2503 Instructor: Mehdi Afsar Programming III March 11, 2022 Assignment 3 Binary Search Trees Due Date: Monday March 25, before...

1 answer below »
Simple assignment


Assignment 3 Binary Search Trees Mount Royal University Winter 2022 COMP 2503 Instructor: Mehdi Afsar Programming III March 11, 2022 Assignment 3 Binary Search Trees Due Date: Monday March 25, before midnight Weight: 7.5% This assignment should be completed individually. Description This assignment is a reworking of the task you had to complete for assignments 1 and 2. Essentially you have the same task: read English text and print out various statistics and lists on the frequency and length of words in the text. You should be able to use much of the code you wrote for assignments 1 and 2. The behavior and output of your program is very similar to A1 and A2, with the following differences: ˆ You cannot use ArrayList or Linked List to store your Token objects, you must use a Binary Search Tree of your own implementation. ˆ You need to report on the actual and the optimal height of the binary search trees (see the instruction below). The finished program will be run like this: cat input1.txt | java -jar A3.jar > myoutput1.txt or java -jar A3.jar < input1.txt=""> myoutput1.txt I will provide sample input and output files for the program. The output from your program must be exactly as shown in the examples. It can be very difficult to eyeball your code and see if it matches the required output exactly, so you need to let the computer do the comparison for you. To check your program against a given input1.txt and ouput1.txt, try running your program as above, then diff (cat output1.txt) (cat myoutput1.txt) in Powershell or Revised: March 7, 2022 Page 1 of 5 COMP 2503 diff output1.txt myoutput1.ttxt in Linux or MacOS will tell you if your progam is producing the correct result. No output from diff means all is well. The specific requirements for this assignment are as follows. 1. Instead of a linked list you must use a BST of your own implementation. Call the class BST. 2. Use the driver class called A3 provided in the starter code package. It will have the main() method and do the bulk of the output processing. Be sure that it is instantiating an A3 object. 3. Your program should operate in five general steps. (a) Read the words from the standard input, remove all the non-alphabetic characters and convert to lower-case. Create a BST of Token objects ordered by the natural (alphabetical) ordering of the Token object. Add words in the order they appear in the text. Stop words will be dealt with differently in this exercise. Rather than skip them, you will add them into the tree just like any other word. (b) Delete the stop words from the tree. Note that there is a new list of stopwords. I have provided the Java array definition for the list. This is contrived but it means you have to implement BST.delete(). (c) Create another BST of Token objects ordered by word frequency, from most to least. For this assignment, only add words that have occurred more than twice (i.e., only add Tokens with frequency at least 3). Add the remaining words after deleting the stop words, by iterating over the original tree with in-order traversal. Note that this requirement means that your BST must be able to handle different orderings, other than just the natural ordering. Think carefully about how you will implement this! (d) Create another BST of Token objects ordered by word length from longest to shortest. This will require the creation of another Comparator for the Token class. Remember that a BST does not have dupli- cate entries, so your Comparator must distinguish between different tokens that store words of the same length. E.g., peace and faith are both 5 letter words but the Comparator must see them as not equal. If two different words are of the same length, your Comparator should order them alphabetically. E.g., faith is considered smaller than peace. (e) Print out the required results. See the sample output files for what is required. Note the following: Revised: March 7, 2022 Page 2 of 5 COMP 2503 ˆ The number of stop words is the number of stop words from the list that occured in the text. ˆ Whenever you print a Token, print the word, its length and the number of times it occured, separated by colons. ˆ Instead of the list of 10 least frequent words that was required in previous assignments, print the list of 10 longest word. You must implement and use iterators to traverse the trees (see below for further instructions). For this assignment, you should only have three trees in your system. ˆ Remember that the optimal height of a BST is log2(n) where n is the number of elements in the tree. 4. When printing items from your BST, you should do it by providing an iterator object for the BST class. Please note that you are not allowed to have any print statements in the BST class. You must provide a method public Iterator iterator() that returns an iterator over the tree. Use that to do any output or processing of elements in the tree. The main class, A3, should have no knowledge of the inner workings of the BST. In particular it should have no references to BSTNode. 5. You will find that a Queue will come in handy when implementing the iter- ator for your BST. You can use a queue class of your own implementation, or lookup how to create an object that implements the Queue interface using the built in LinkedList class in Java. 6. The BST you write must be generic. That is, it could be used in another application without change. Question: when you have completed your program and run it with the test files provided, how balanced are the trees? Stack Size With larger text files you may find yourself running out of stack space. When the Java virtual machine starts it allocates two memory areas: a stack and a heap. The stack is a typical program stack and is used to store frames — method invocation information such as parameters, local variables, and return values. Each time you perform a method call, recursive or not, a frame is placed (pushed) on the stack. It is popped when the method returns. If your program is multi-threaded, a stack is created for each thread. The heap is a run-time pool of memory and it is where object instances are created as the program runs. Whenever you execute a new the memory required to store that object is allocated from the heap. The garbage collector will Revised: March 7, 2022 Page 3 of 5 COMP 2503 periodically go through the heap and deallocate any space that is being used by an object that no longer has a reference to it. All kinds of detail about the Java Virtual machine is available from https: //docs.oracle.com/javase/specs/jvms/se13/html/index.html. Usually the default sizes for the stack and heap are sufficient but if you are getting errors you can increase them with the -X options when you start the Java virtual machine. java -Xms100M -Xmx200M -Xss50M A3 The example above will set the inital heap size (Xms100M) to 100 megabytes, the maximum heap size to 200 megabytes (Xmx200M) and the stack size to 50 megabytes (Xss50M). Warning: Before messing around with stack or heap sizes be sure that you don’t have an infinite loop or a recursive method without a base case to terminate it. Running out of memory is more likely from that than a legitimate need for more stack or heap space. All of the test files ran on my ancient Ubuntu machine in less than 10 seconds each with no change to the stack or heap size. Testing I will supply some test input and output files for you to use. These files do not constitute extensive testing. You will need to do more testing to ensure all cases are covered. I will be evaluating your program with test data that you have not yet seem and I will be trying to find flaws in your code! What to Hand In Hand in a single file, A3.jar. Submit this to the Blackboard drop box provided. The jar should contain: 1. All of your .class and .java files. 2. A class called A3. to the question above. Do Not include your test input or output files or any other cruft files. Revised: March 7, 2022 Page 4 of 5 COMP 2503 https://docs.oracle.com/javase/specs/jvms/se13/html/index.html https://docs.oracle.com/javase/specs/jvms/se13/html/index.html Grading A detailed grading rubric is provided at the end of this document. I will run your program with the command line given above on three text samples and compare your output to the specifications. Outcomes Once you have completed this assignment you will have: ˆ Implemented a BST. ˆ Compared a BST to a SLL. ˆ Understood the value of a BST and the importance of balancing the BST. I will run your program with the command line given above on three text files and compare your output to the specifications. Rubric 1. Documentation (a) Java doc standards 5 (b) Format of Code 5 (c) Meets other rules/guidelines 5 2. Testing (a) Test file1 10 (b) Test file2 10 (c) Test file3 10 3. Follows implementation specifications 30 ˆ Code is compact and efficient. ˆ Has 3 classes A3, BST, and Token. ˆ Implements and uses Comparable and Comparator objects correctly. ˆ BST class is generic and encapsulated, and implements the required methods (e.g., add, delete, findMin, height, etc.) ˆ Implements and uses Iterator interface for tree traversal. ˆ Constructs BST objects with parameterized ordering. Revised: March 7, 2022 Page 5 of 5 COMP 2503
Answered 2 days AfterMar 29, 2022

Answer To: Assignment 3 Binary Search Trees Mount Royal University Winter 2022 COMP 2503 Instructor: Mehdi...

Sandeep Kumar answered on Mar 31 2022
91 Votes
SOLUTION.PDF

Answer To This Question Is Available To Download

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here