AVL-g tree requirements

In Part 2, you will be asked to consider replacing the treemap of city names which is used as an index to the city objects with some other structure. For this semester of 420, the ``other structure'' will be a modified AVL tree. You are all familiar with the traditional definition of an AVL tree, or if not, you can become so:

The AVL tree (named for its inventors Adelson-Velskii and Landis) should be viewed as a BST with the following additional property: For every node, the heights of its left and right subtrees differ by at most 1.

This is Shaffer's definition for an AVL tree; we modify the definition in one very simple way. Our AVL-g tree takes a parameter on creation--the so-named g parameter--which refers to the maximum height by which the subtrees differ. Apart from this, our trees are identical in implementation to the conceptual trees given by Shaffer; we rebalance the tree using identical rotations to the basic tree, both single and double. You can find ample resources online for AVL trees; the problem as we pose it is in adapting the resources you can find for our AVL-g trees.

Astute readers will note that for any $ k < g$, any AVL-k tree is a valid AVL-g tree. Thus, since a regular AVL tree is just an AVL-1 tree by our definition, you could just grab source for an AVL from the internet and point out that it's a valid AVL-g tree for any $ g$. However, we're not satisfied with such trickery. Note that given a number g and an ordered input set K, we can always find some number m which is the maximal balance factor found somewhere in the tree. For example, consider an AVL-1 tree with two nodes. It is always true that such a tree has at least one node with balance factor 1--the tree can never be balanced better than that. Consider similarly an AVL-2 tree with three nodes, where the nodes were inserted in sorted order. See a pattern? Of course it's trivial to find the pattern for $ g+1$ nodes in sorted order, but if you want a fun challenge, try and come up with a way of finding $ m$ for any arbitrary input set and $ g$. Anyway, our tests will use this property--all of our tests verify that the maximal balance factor of the tree is greater than $ m$ but less than $ g$.

However, since we're using Java and a fundamental feature of Java is its interfaces, your code will be required not only to properly implement the AVL-g specifications, but also to implement Java's SortedMap interface. This will allow you to fully replace the TreeMap with your SortedMap as you desire. Your AVL tree should be generic just like the TreeMap, and it should eventually implement every one of the SortedMap methods except headMap(), tailMap(), keySet() and values(). However, we will not test remove() until Part3. So, you can choose to delay it until after you've finished Part2.

But, since you don't implement remove in Part 2, this prevents you from properly implementing subMap().clear() and related functions. Therefore, your behavior for subMap().clear() won't be tested--but we advise that at this point you forward to AvlGTree.this.clear(). If that sounded confusing to you, don't worry about it for now, but come back to it later. If you would like guidance on how to begin implementing SortedMap, there are two resources to which you can turn: we provide you with an example, SortedMapExample.jar, which shows a SkipList that has been used to implement SortedMap. You can find SortedMapExample.jar in the top level directory of the course files. Similarly, you can easily find the source code for Java's TreeMap online; Josh Bloch's implementation strategies may prove useful and relevant to your code. As always, if you take inspiration or structure from either of these sources please cite that in your README file.

Any portion of the TreeMap that uses the red-black tree explicitly (or the skiplist version) must be modified to use the AVL-G tree instead. Failure to do this will result in a deduction of the points that you tricked the submit server into giving you. It will not be treated as an honor code violation since the red-black tree is not the most intuitive creature that you will ever meet.

Although this probably won't be a problem for AVL trees as much as some of the structures in the past, we expressly forbid you from actually using a skiplist or other complex data type inside your AVL tree code. That is, although you may be inspired (or more) by the code you see in TreeMap or SkipList, you may not actually use a TreeMap or SkipList in your implementation. The same goes for TreeSet (which, interestingly, is just a wrapper around a TreeMap--cool trivia!), HashSet or HashMap; basically, if you're using a structure that you feel probably duplicates the functionality of your SortedMap, you should not be using it.

We now proceed into a formal definition of an AVL-g tree. In order to do this, we must define the height of a subtree--we do so recursively, saying that a leaf node has height 0, and an internal node has a height equal to the height of its tallest child, plus one. We define the height of an empty tree to be -1.

Thus, an AVL-g tree is defined as a binary search tree where at any node, the height of its two subtrees (left and right) differ by at most a positive constant g. The constant is given at instantiation, as a parameter to the constructor. Although quite a bit of literature exists on maintaining the balance factor in an AVL tree, our tests only require that your tree maintain the above property, as well as obviously retaining any key-value mappings passed into it. Note: do not waste time searching for additional information regarding the AVL-G tree because there is none. This structure is unique to MeeshQuest.

These rules ARE mandatory, meaning that no credit will be given for code for which these rules are not observed. Obviously, there can be multiple AVL trees of a given order that are correct. Since grading by merely diff-ing is impossible, we will grade your tree by checking that the structure you produce satisfies the above properties.



Subsections
MM Hugue 2018-10-01