6.3 Discussion and Exercises

Binary trees have been used to model relationships for literally thousands of years. One reason for this is that binary trees naturally model (pedigree) family trees. These are the family trees in which the root is a person, the left and right children are the person's parents, and so on, recursively. In more recent centuries binary trees have also been used to model species-trees in biology, where the leaves of the tree represent extant species and the internal nodes of the tree represent speciation events in which two populations of a single species evolve into two separate species.

Binary search trees appear to have been discovered independently by several groups in the 1950s [48, Section 6.2.2]. Further references to specific kinds of binary search trees are provided in subsequent chapters.

When implementing a binary tree from scratch, there are several design decisions to be made. One of these is whether each node stores a pointer to its parent. If most of the operations simply follow a root-to-leaf path, then parent pointers are unnecessary and are a potential source of coding errors. On the other hand, the lack of parent pointers means that tree traversals must be done recursively or with the use of an explicit stack. Some other methods (like inserting into or deleting from some kinds of balanced binary search trees) are also complicated by the lack of parent pointers.
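To make this trade-off concrete, here is a minimal Python sketch of a traversal that uses an explicit stack because the nodes carry no parent pointers. The names `Node` and `traverse_with_stack` are illustrative assumptions and are not part of the BinaryTree implementation in this chapter; `None` plays the role of $\mathtt{null}$.

```python
class Node:
    """A binary tree node WITHOUT a parent pointer (hypothetical helper class)."""
    def __init__(self, x, left=None, right=None):
        self.x = x
        self.left = left
        self.right = right


def traverse_with_stack(root):
    """Return the values of all nodes, visiting each node before its children.

    With no parent pointers there is no way to walk back up the tree,
    so the nodes still to be visited are remembered on an explicit stack.
    """
    order = []
    stack = [root] if root is not None else []
    while stack:
        u = stack.pop()
        order.append(u.x)
        # Push the right child first so that the left child is popped first.
        if u.right is not None:
            stack.append(u.right)
        if u.left is not None:
            stack.append(u.left)
    return order


# A small tree: 'a' has children 'b' and 'c'; 'b' has only a right child 'd'.
tree = Node('a', Node('b', None, Node('d')), Node('c'))
visited = traverse_with_stack(tree)
```

The stack can hold as many nodes as the tree is tall, which is the price paid for omitting the parent pointers.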

Another design decision is concerned with how to store the parent, left child, and right child pointers at a node. In the implementation given here, they are stored as separate variables. Another option is to store them in an array, $\mathtt{p}$, of length 3, so that $\mathtt{u.p[0]}$ is the left child of $\mathtt{u}$, $\mathtt{u.p[1]}$ is the right child of $\mathtt{u}$, and $\mathtt{u.p[2]}$ is the parent of $\mathtt{u}$. Using an array this way means that some sequences of if statements can be simplified into algebraic expressions.

An example of such a simplification occurs during tree traversal. If a traversal arrives at a node $\mathtt{u}$ from $\mathtt{u.p[i]}$, then the next node in the traversal is $\ensuremath{\mathtt{u.p}}[(\ensuremath{\mathtt{i}}+1)\bmod 3]$. Similar examples occur when there is left-right symmetry. For example, the sibling of $\mathtt{u.p[i]}$ is $\ensuremath{\mathtt{u.p}}[(\ensuremath{\mathtt{i}}+1)\bmod 2]$. This works whether $\mathtt{u.p[i]}$ is a left child ($\ensuremath{\mathtt{i}}=0$) or a right child ($\ensuremath{\mathtt{i}}=1$) of $\mathtt{u}$. In some cases this means that some complicated code that would otherwise need to have both a left version and right version can be written only once. See the methods $\mathtt{rotateLeft(u)}$ and $\mathtt{rotateRight(u)}$ for an example.
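The array-based layout can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `Node`, `link`, `sibling`, and `traverse`, and the constants `LEFT`, `RIGHT`, and `PARENT`, are hypothetical and do not appear in the implementation given here. The sketch also assumes that missing children are stored as `None`, so the traversal keeps advancing the index until it finds a child that exists or reaches the parent slot.

```python
LEFT, RIGHT, PARENT = 0, 1, 2


class Node:
    """A node whose three pointers live in one array: p[0]=left, p[1]=right, p[2]=parent."""
    def __init__(self, x):
        self.x = x
        self.p = [None, None, None]


def link(parent, i, child):
    """Make child the i-th child (LEFT or RIGHT) of parent."""
    parent.p[i] = child
    child.p[PARENT] = parent


def sibling(u, i):
    """Return the sibling of u.p[i], for i in {LEFT, RIGHT}; no if statement needed."""
    return u.p[(i + 1) % 2]


def traverse(root):
    """Walk the whole tree without recursion or a stack.

    i records where we arrived from: i == PARENT means we came down from the
    parent, otherwise we came back up from child u.p[i].
    Returns the values in the order the nodes are first reached.
    Assumes root is the root of the entire tree (its parent is None).
    """
    first_visits = []
    u, i = root, PARENT
    while u is not None:
        if i == PARENT:
            first_visits.append(u.x)
        j = (i + 1) % 3
        # Skip over missing children; stop at the parent slot in any case.
        while j != PARENT and u.p[j] is None:
            j = (j + 1) % 3
        nxt = u.p[j]
        if j == PARENT:
            # Moving up: work out which child of nxt we are leaving.
            i = LEFT if (nxt is not None and nxt.p[LEFT] is u) else RIGHT
        else:
            # Moving down: the next node is entered from its parent.
            i = PARENT
        u = nxt
    return first_visits


# 'a' has children 'b' and 'c'; 'b' has only a right child 'd'.
a, b, c, d = Node('a'), Node('b'), Node('c'), Node('d')
link(a, LEFT, b)
link(a, RIGHT, c)
link(b, RIGHT, d)
```

Here `sibling(a, LEFT)` is `c` and `sibling(a, RIGHT)` is `b`, using the same expression for both sides.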

Exercise 6.1 Prove that a binary tree having $\ensuremath{\mathtt{n}}\ge 1$ nodes has $\ensuremath{\mathtt{n}}-1$ edges.

Exercise 6.2 Prove that a binary tree having $\ensuremath{\mathtt{n}}\ge 1$ real nodes has $\ensuremath{\mathtt{n}}+1$ external nodes.

Exercise 6.3 Prove that, if a binary tree, $T$, has at least one leaf, then either (a) $T$'s root has at most one child or (b) $T$ has more than one leaf.

Exercise 6.4 Write a non-recursive variant of the $\mathtt{size2()}$ method, $\mathtt{size(u)}$, that computes the size of the subtree rooted at node $\mathtt{u}$.

Exercise 6.5 Write a non-recursive method, $\mathtt{height2(u)}$, that computes the height of node $\mathtt{u}$ in a BinaryTree.

Exercise 6.6 A binary tree is balanced if, for every node $\mathtt{u}$, the sizes of the subtrees rooted at $\mathtt{u.left}$ and $\mathtt{u.right}$ differ by at most one. Write a recursive method, $\mathtt{isBalanced()}$, that tests if a binary tree is balanced. Your method should run in $O(\ensuremath{\mathtt{n}})$ time. (Be sure to test your code on some large trees with different shapes; it is easy to write a method that takes much longer than $O(\ensuremath{\mathtt{n}})$ time.)

A pre-order traversal of a binary tree is a traversal that visits each node, $\mathtt{u}$, before any of its children. An in-order traversal visits $\mathtt{u}$ after visiting all the nodes in $\mathtt{u}$'s left subtree but before visiting any of the nodes in $\mathtt{u}$'s right subtree. A post-order traversal visits $\mathtt{u}$ only after visiting all other nodes in $\mathtt{u}$'s subtree. The pre/in/post-order numbering of a tree labels the nodes of a tree with the integers $0,\ldots,\ensuremath{\mathtt{n}}-1$ in the order that they are encountered by a pre/in/post-order traversal. See Figure 6.10 for an example.

**Figure 6.10:** Pre-order, post-order, and in-order numberings of a binary tree.
$\includegraphics{figs/binarytree-numbering-1}$ $\includegraphics{figs/binarytree-numbering-2}$ $\includegraphics{figs/binarytree-numbering-3}$
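The three visiting orders can be illustrated with a short Python sketch. It is not the BinaryTree class from this chapter: `Node`, `visit_all`, and `numbering` are hypothetical names, node labels are assumed to be distinct so they can serve as dictionary keys, and the tree used is a small made-up example, not the one in Figure 6.10.

```python
class Node:
    """A plain binary tree node (hypothetical helper class)."""
    def __init__(self, x, left=None, right=None):
        self.x = x
        self.left = left
        self.right = right


def visit_all(u, order, out):
    """Append the labels in u's subtree to out in 'pre', 'in', or 'post' order."""
    if u is None:
        return
    if order == 'pre':
        out.append(u.x)      # before either child
    visit_all(u.left, order, out)
    if order == 'in':
        out.append(u.x)      # after the left subtree, before the right subtree
    visit_all(u.right, order, out)
    if order == 'post':
        out.append(u.x)      # after everything else in the subtree


def numbering(root, order):
    """Map each label to its position 0..n-1 in the chosen traversal."""
    out = []
    visit_all(root, order, out)
    return {x: k for k, x in enumerate(out)}


# 'a' has children 'b' and 'c'; 'b' has children 'd' and 'e'.
root = Node('a', Node('b', Node('d'), Node('e')), Node('c'))
pre = numbering(root, 'pre')
ino = numbering(root, 'in')
post = numbering(root, 'post')
```

In this example the root `'a'` receives pre-order number 0, in-order number 3, and post-order number 4.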

Exercise 6.7 Create a subclass of BinaryTree whose nodes have fields for storing pre-order, post-order, and in-order numbers. Write recursive methods $\mathtt{preOrderNumber()}$, $\mathtt{inOrderNumber()}$, and $\mathtt{postOrderNumbers()}$ that assign these numbers correctly. These methods should each run in $O(\ensuremath{\mathtt{n}})$ time.

Exercise 6.8 Write non-recursive functions $\mathtt{nextPreOrder(u)}$, $\mathtt{nextInOrder(u)}$, and $\mathtt{nextPostOrder(u)}$ that return the node that follows $\mathtt{u}$ in a pre-order, in-order, or post-order traversal, respectively. These functions should take amortized constant time; if we start at any node $\mathtt{u}$ and repeatedly call one of these functions and assign the return value to $\mathtt{u}$ until $\ensuremath{\mathtt{u}}=\ensuremath{\mathtt{null}}$, then the cost of all these calls should be $O(\ensuremath{\mathtt{n}})$.

Exercise 6.9 Suppose we are given a binary tree with pre-, post-, and in-order numbers assigned to the nodes. Show how these numbers can be used to answer each of the following questions in constant time:

1. Given a node $\mathtt{u}$, determine the size of the subtree rooted at $\mathtt{u}$.
2. Given a node $\mathtt{u}$, determine the depth of $\mathtt{u}$.
3. Given two nodes $\mathtt{u}$ and $\mathtt{w}$, determine if $\mathtt{u}$ is an ancestor of $\mathtt{w}$.

Exercise 6.10 Suppose you are given a list of nodes with pre-order and in-order numbers assigned. Prove that there is at most one possible tree with this pre-order/in-order numbering and show how to construct it.

Exercise 6.11 Show that the shape of any binary tree on $\mathtt{n}$ nodes can be represented using at most $2(\ensuremath{\mathtt{n}}-1)$ bits. (Hint: think about recording what happens during a traversal and then playing back that recording to reconstruct the tree.)

Exercise 6.12 Illustrate what happens when we add the values

and then 4.5 to the binary search tree in Figure 6.5.

Exercise 6.13 Illustrate what happens when we remove the values

and then 5 from the binary search tree in Figure 6.5.

Exercise 6.14 Design and implement a BinarySearchTree method, $\mathtt{getLE(x)}$, that returns a list of all items in the tree that are less than or equal to $\mathtt{x}$. The running time of your method should be $O(\ensuremath{\mathtt{n}}'+\ensuremath{\mathtt{h}})$, where $\ensuremath{\mathtt{n}}'$ is the number of items less than or equal to $\mathtt{x}$ and $\mathtt{h}$ is the height of the tree.

Exercise 6.15 Describe how to add the elements $\{1,\ldots,\ensuremath{\mathtt{n}}\}$ to an initially empty BinarySearchTree in such a way that the resulting tree has height $\ensuremath{\mathtt{n}}-1$. How many ways are there to do this?

Exercise 6.16 If we have some BinarySearchTree and perform the operations $\mathtt{add(x)}$ followed by $\mathtt{remove(x)}$ (with the same value of $\mathtt{x}$), do we necessarily return to the original tree?

Exercise 6.17 Can a $\mathtt{remove(x)}$ operation increase the height of any node in a BinarySearchTree? If so, by how much?

Exercise 6.18 Can an $\mathtt{add(x)}$ operation increase the height of any node in a BinarySearchTree? Can it increase the height of the tree? If so, by how much?

Exercise 6.19 Design and implement a version of BinarySearchTree in which each node, $\mathtt{u}$, maintains values $\mathtt{u.size}$ (the size of the subtree rooted at $\mathtt{u}$), $\mathtt{u.depth}$ (the depth of $\mathtt{u}$), and $\mathtt{u.height}$ (the height of the subtree rooted at $\mathtt{u}$).

These values should be maintained, even during the $\mathtt{add(x)}$ and $\mathtt{remove(x)}$ operations, but this should not increase the cost of these operations by more than a constant factor.