due: Thursday April 24 at 11:59pm (submit on Gradescope here).
In this assignment you'll start by using Java
's internal TreeMap
to solve a problem (Part 1), and then replace the use of the TreeMap
with your own binary search tree called DIYTreeMap
(Part 2). Finally, you'll balance your own DIYTreeMap
so that it is an AVL Tree (Part 3).
Note that solving Part 1 is worth 10 points and Part 2 is worth 8 points. Part 3 is challenging and is only worth 2 points. Be sure to test your code after each part.
The starter code can be downloaded here.
TreeMap
[10 points]Your main task is to find the words in some text
that occur more than twice. Plural words are considered distinct from their singular versions. For example, shipping ship shipping shipping ships
has a single word that occurs 3 times, and two other words that each appear once. Complete the getRepeatedWords
(static) method to return an ArrayList<String>
with the words that appear more than twice (i.e. at least three times). In this first example, the result would be [shipping]
.
Assume the input text
is all in lower-case.
You can split text
into individual words by saving the result of text.split(" ")
to an array of String
s called words
.
Start by using Java
's TreeMap<String, Integer>
to find the frequency of each word. Then find all words that occur more than twice, returned as an ArrayList<String>
of words. These words don't need to be in any particular order. Hint: recall in Lecture 3W (slide 14) we used a HashMap
to construct a frequency map. You can do something very similar here.
You can use the importText
method to load the pale-blue-dot.txt
file to test your algorithm. This file contains some text from Carl Sagan's book called "Pale Blue Dot". You can also try your method on other files! Maybe song lyrics? (though this might require generalizing how punctuation is processed).
TreeMap
with DIYTreeMap
[8 points]Replace all use of Java
's TreeMap
with your own binary search tree (BST) called DIYTreeMap
(not necessarily balanced). You'll need to implement any BST methods your algorithm from Part 1 needs. At the very least, you should provide the following methods (so your DIYTreeMap
can be tested on Gradescope):
value
(if any) similar to what Oracle says the TreeMap
put
method should be, but void
is okay for our purposes.value
at the Node
with a particular key
. Returns null
if the key
is not in the tree.One of the test cases uses the following tree (the nodes are printed as (key:value)
):
└──(10:0)
│ └──(5:1)
│ │ └──(2:3)
│ │ │ └──(3:7)
│ │ └──(9:4)
│ └──(15:2)
│ │ └──(12:5)
│ │ └──(20:6)
│ │ │ └──(17:8)
│ │ │ │ └──(18:10)
│ │ │ └──(30:9)
│ │ │ │ └──(25:11)
In this example, the value
corresponds to the order in which a key-value pair is added. This tree is not balanced (by the balancing requirements of an AVL tree), but the tree structure doesn't matter for the tests (it will only test if a key-value pair exists in the tree).
DIYTreeMap
[2 points]Balance your tree map as necessary whenever an insertion is made (via put
) to maintain an AVL tree. Review the notes from Lecture 9W for how this should be done. It would also be a good idea to review the worksheet from Lab 7, and also try the optional RotateRight
puzzle. This might help with implementing the rotateRight
and rotateLeft
methods.
Use the steps from slide 14 in the 9W notes to determine which type of rotation should be performed (if any) after inserting a key-value pair into the map (via put
).
One of the tests will add the nodes of the tree described in Part 2 in the order of each node value
. Note that this should produce the same tree structure obtained from the insertions in Lab 7 (before removing the node with key 9
).
Upload DIYTreeMap.java
to Gradescope. Pre-submission check-list:
javadoc
-style documentation for public methods you implement;import java.util.TreeMap;
at the top of the file after completing Part 2 or 3.