A BinarySearchTree is a special kind of binary tree in which each node,
u, also stores a data value, u.x, from some total order. The data values
in a binary search tree obey the binary search tree property: For a
node, u, every data value stored in the subtree rooted at u.left is less
than u.x and every data value stored in the subtree rooted at u.right is
greater than u.x. An example of a BinarySearchTree is shown in
Figure 6.5.
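The property can be checked mechanically. The following is a minimal sketch, not the book's code: the Node class, the is_bst function, and the use of None in place of an absent child are all assumptions made for illustration.

```python
class Node:
    """Hypothetical node: a value x plus left and right children (None if absent)."""
    def __init__(self, x, left=None, right=None):
        self.x = x
        self.left = left
        self.right = right


def is_bst(u, lo=None, hi=None):
    """Return True if every value in the subtree rooted at u lies strictly
    between lo and hi, where None means unbounded on that side."""
    if u is None:
        return True
    if lo is not None and u.x <= lo:
        return False
    if hi is not None and u.x >= hi:
        return False
    # Everything to the left must stay below u.x, everything to the right above it.
    return is_bst(u.left, lo, u.x) and is_bst(u.right, u.x, hi)


good = Node(7, Node(3, Node(1), Node(5)), Node(11, Node(9), Node(13)))
# 8 is a right child of 3, which looks fine locally, but it sits in the
# left subtree of 7 and so violates the property.
bad = Node(7, Node(3, Node(1), Node(8)), Node(11))
```

Passing bounds down the recursion matters: comparing each node only with its immediate children would wrongly accept the second tree.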
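The binary search tree property is extremely useful because it allows
us to quickly locate a value, x, in a binary search tree. To do this we
start searching for x at the root, r. When examining a node, u, there
are three cases:
1. If x < u.x, then the search proceeds to u.left;
2. If x > u.x, then the search proceeds to u.right;
3. If x = u.x, then we have found the node u containing x.
The search terminates when Case 3 occurs or when u = nil. In the former
case, we found x. In the latter case, we conclude that x is not in the
binary search tree.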
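The search can be sketched as follows, taking the three cases to be x < u.x (go left), x > u.x (go right), and x = u.x (found). This is an illustrative sketch only: the Node class and the name find_eq are assumptions, and None stands in for a missing child.

```python
class Node:
    """Hypothetical node: a value x plus left and right children (None if absent)."""
    def __init__(self, x, left=None, right=None):
        self.x = x
        self.left = left
        self.right = right


def find_eq(r, x):
    """Return the node containing x in the tree rooted at r, or None."""
    u = r
    while u is not None:
        if x < u.x:
            u = u.left       # Case 1: x can only be in the left subtree
        elif x > u.x:
            u = u.right      # Case 2: x can only be in the right subtree
        else:
            return u         # Case 3: found it
    return None              # fell off the tree: x is not present


root = Node(7, Node(3, Node(1), Node(5)), Node(11, Node(9), Node(13)))
```

Each iteration moves one level down, so the loop runs at most once per level of the tree.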
Two examples of searches in a binary search tree are shown in
Figure 6.6. As the second example shows, even if we don't find x in the
tree, we still gain some valuable information. If we look at the last
node, u, at which Case 1 occurred, we see that u.x is the smallest value
in the tree that is greater than x. Similarly, the last node at which
Case 2 occurred contains the largest value in the tree that is less
than x. Therefore, by keeping track of the last node, z, at which Case 1
occurs, a BinarySearchTree can implement the find(x) operation that
returns the smallest value stored in the tree that is greater than or
equal to x.
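A sketch of this successor-style find(x), under the same assumptions as before (a hypothetical Node class, None for a missing child, and None returned when no stored value is greater than or equal to x):

```python
class Node:
    """Hypothetical node: a value x plus left and right children (None if absent)."""
    def __init__(self, x, left=None, right=None):
        self.x = x
        self.left = left
        self.right = right


def find(r, x):
    """Return the smallest value >= x in the tree rooted at r, or None."""
    u, z = r, None
    while u is not None:
        if x < u.x:
            z = u            # last node at which we went left (Case 1)
            u = u.left
        elif x > u.x:
            u = u.right
        else:
            return u.x       # exact match
    return None if z is None else z.x


root = Node(7, Node(3, Node(1), Node(5)), Node(11, Node(9), Node(13)))
```

The only change from a plain search is the extra variable z, which remembers the most recent node whose value exceeded x.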
To add a new value, x, to a BinarySearchTree, we first search for x. If
we find it, then there is no need to insert it. Otherwise, we store x at
a leaf child of the last node, p, encountered during the search for x.
Whether the new node is the left or right child of p depends on the
result of comparing x and p.x.
An example is shown in Figure 6.7. The most time-consuming part of this
process is the initial search for x, which takes an amount of time
proportional to the depth of the newly added node u. In the worst case,
this is equal to the height of the BinarySearchTree.
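Insertion can be sketched as a search followed by a single pointer update. The class layout below (fields r and n, a helper named _find_last, parent pointers on nodes) is an assumption for illustration, not a definitive implementation.

```python
class Node:
    """Hypothetical node with a parent pointer."""
    def __init__(self, x):
        self.x = x
        self.left = self.right = self.parent = None


class BinarySearchTree:
    def __init__(self):
        self.r = None   # root
        self.n = 0      # number of stored values

    def _find_last(self, x):
        """Return the node containing x, or else the last node visited."""
        u, prev = self.r, None
        while u is not None:
            prev = u
            if x < u.x:
                u = u.left
            elif x > u.x:
                u = u.right
            else:
                return u
        return prev

    def add(self, x):
        """Insert x; return False if it was already present."""
        p = self._find_last(x)
        if p is not None and p.x == x:
            return False
        u = Node(x)
        if p is None:
            self.r = u          # tree was empty
        elif x < p.x:
            p.left = u
        else:
            p.right = u
        u.parent = p
        self.n += 1
        return True


def in_order(u):
    """Values of the subtree rooted at u in sorted order."""
    return [] if u is None else in_order(u.left) + [u.x] + in_order(u.right)
```

All of the work is in _find_last; attaching the new leaf takes constant time.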
Deleting a value stored in a node, u, of a BinarySearchTree is a little
more difficult. If u is a leaf, then we can just detach u from its
parent. Even better: If u has only one child, then we can splice u from
the tree by having u.parent adopt u's child (see Figure 6.8).
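The splice step might look like the sketch below. The Tree holder class and the function name splice are assumptions; the function requires that u have at most one child, and it also covers the leaf case, where the adopted child is simply None.

```python
class Node:
    """Hypothetical node with a parent pointer."""
    def __init__(self, x):
        self.x = x
        self.left = self.right = self.parent = None


class Tree:
    """Hypothetical holder for the root pointer r."""
    def __init__(self, r=None):
        self.r = r


def splice(t, u):
    """Remove u, which must have at most one child, from tree t."""
    s = u.left if u.left is not None else u.right  # u's only child, or None
    if u is t.r:
        t.r = s                # the child (if any) becomes the new root
        p = None
    else:
        p = u.parent
        if p.left is u:
            p.left = s         # parent adopts u's child on the same side
        else:
            p.right = s
    if s is not None:
        s.parent = p


# Chain 7 -> 3 -> 1 (each a left child); splicing out 3 leaves 7 -> 1.
a, b, c = Node(7), Node(3), Node(1)
a.left, b.parent = b, a
b.left, c.parent = c, b
t = Tree(a)
splice(t, b)
```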
Things get tricky, though, when u has two children. In this case, the
simplest thing to do is to find a node, w, that has fewer than two
children and such that w.x can replace u.x. To maintain the binary
search tree property, the value w.x should be close to the value of
u.x. For example, choosing w such that w.x is the smallest value greater
than u.x will work. Finding the node w is easy; it is the node with the
smallest value in the subtree rooted at u.right. This node can be easily
removed because it has no left child (see Figure 6.9).
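Putting the cases together, removal of a node can be sketched as below. The names Tree, add, splice, and remove_node are assumptions for illustration; splice handles a node with at most one child, and the two-child case copies the successor's value into u before splicing the successor out.

```python
class Node:
    """Hypothetical node with a parent pointer."""
    def __init__(self, x):
        self.x = x
        self.left = self.right = self.parent = None


class Tree:
    """Hypothetical holder for the root pointer r."""
    def __init__(self):
        self.r = None


def add(t, x):
    """Plain insertion, used here only to build an example tree."""
    u = Node(x)
    if t.r is None:
        t.r = u
        return
    p = t.r
    while True:
        if x < p.x:
            if p.left is None:
                p.left, u.parent = u, p
                return
            p = p.left
        elif x > p.x:
            if p.right is None:
                p.right, u.parent = u, p
                return
            p = p.right
        else:
            return  # already present


def splice(t, u):
    """Remove u, which must have at most one child."""
    s = u.left if u.left is not None else u.right
    if u is t.r:
        t.r, p = s, None
    else:
        p = u.parent
        if p.left is u:
            p.left = s
        else:
            p.right = s
    if s is not None:
        s.parent = p


def remove_node(t, u):
    """Remove the value stored at node u from tree t."""
    if u.left is None or u.right is None:
        splice(t, u)
    else:
        w = u.right
        while w.left is not None:   # leftmost node of the right subtree
            w = w.left
        u.x = w.x                   # w.x is the smallest value greater than u.x
        splice(t, w)                # w has no left child, so splice applies


def in_order(u):
    return [] if u is None else in_order(u.left) + [u.x] + in_order(u.right)
```

Note that in the two-child case it is the node w, not u, that physically leaves the tree; u stays in place and only its value changes.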
The find(x), add(x), and remove(x) operations in a BinarySearchTree each
involve following a path from the root of the tree to some node in the
tree. Without knowing more about the shape of the tree it is difficult
to say much about the length of this path, except that it is less than
n, the number of nodes in the tree. The following (unimpressive) theorem
summarizes the performance of the BinarySearchTree data structure:
Theorem 6.1. BinarySearchTree implements the SSet interface and supports
the operations add(x), remove(x), and find(x) in O(n) time per
operation.
Theorem 6.1 compares poorly with Theorem 4.1, which shows that the
SkiplistSSet structure can implement the SSet interface with O(log n)
expected time per operation. The problem with the BinarySearchTree
structure is that it can become unbalanced. Instead of looking like the
tree in Figure 6.5 it can look like a long chain of n nodes, all but the
last having exactly one child.
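To see how insertion order alone produces such a chain, the sketch below builds two trees from the same 15 values and measures their heights. The helper names add, build, and height are assumptions for illustration; height counts edges on the longest root-to-leaf path.

```python
class Node:
    """Hypothetical node: a value x plus left and right children."""
    def __init__(self, x):
        self.x = x
        self.left = None
        self.right = None


def add(r, x):
    """Insert x into the tree rooted at r (may be None); return the root."""
    u = Node(x)
    if r is None:
        return u
    p = r
    while True:
        if x < p.x:
            if p.left is None:
                p.left = u
                return r
            p = p.left
        elif x > p.x:
            if p.right is None:
                p.right = u
                return r
            p = p.right
        else:
            return r  # already present


def build(values):
    r = None
    for x in values:
        r = add(r, x)
    return r


def height(u):
    """Edges on the longest downward path from u; -1 for an empty tree.
    Iterative, so long chains do not hit the recursion limit."""
    best, stack = -1, [(u, 0)]
    while stack:
        v, d = stack.pop()
        if v is None:
            continue
        best = max(best, d)
        stack.append((v.left, d + 1))
        stack.append((v.right, d + 1))
    return best


chain = build(range(1, 16))  # sorted input: every node has only a right child
bushy = build([8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15])
```

With sorted input the height is n - 1, so a search for the largest value visits every node; the second insertion order gives a perfectly balanced tree on the same values.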
There are a number of ways of avoiding unbalanced binary search trees,
all of which lead to data structures that have O(log n) time operations.
In Chapter 7 we show how O(log n) expected time operations can be
achieved with randomization. In Chapter 8 we show how O(log n) amortized
time operations can be achieved with partial rebuilding operations. In
Chapter 9 we show how O(log n) worst-case time operations can be
achieved by simulating a tree that is not binary: one in which nodes can
have up to four children.
opendatastructures.org