A ScapegoatTree is a BinarySearchTree that, in addition to keeping track of the number, , of nodes in the tree also keeps a counter, , that maintains an upper-bound on the number of nodes.
int q;At all times, and obey the following inequalities:
Implementing the operation in a ScapegoatTree is done using the standard algorithm for searching in a BinarySearchTree (see ). This takes time proportional to the height of the tree which, by () is .
To implement the operation, we first increment and and then use the usual algorithm for adding to a binary search tree; we search for and then add a new leaf with . At this point, we may get lucky and the depth of might not exceed . If so, then we leave well enough alone and don't do anything else.
Unfortunately, it will sometimes happen that . In this case we need to do something to reduce the height. This isn't a big job; there is only one node, namely , whose depth exceeds . To fix , we walk from back up to the root looking for a scapegoat, . The scapegoat, , is a very unbalanced node. It has the property that
boolean add(T x) { // first do basic insertion keeping track of depth Node<T> u = newNode(x); int d = addWithDepth(u); if (d > log32(q)) { // depth exceeded, find scapegoat Node<T> w = u.parent; while (3*size(w) <= 2*size(w.parent)) w = w.parent; rebuild(w.parent); } return d >= 0; }
|
The implementation of in a ScapegoatTree is very simple. We search for and remove it using the usual algorithm for removing a node from a BinarySearchTree. (Note that this can never increase the height of the tree.) Next, we decrement but leave unchanged. Finally, we check if and, if so, we rebuild the entire tree into a perfectly balanced binary search tree and set .
boolean remove(T x) { if (super.remove(x)) { if (2*n < q) { rebuild(r); q = n; } return true; } return false; }Again, if we ignore the cost of rebuilding, the running time of the operation is proportional to the height of the tree, and is therefore .
In this section we analyze the correctness and amortized running time of operations on a ScapegoatTree. We first prove the correctness by showing that, when the operation results in a node that violates Condition (), then we can always find a scapegoat:
Next, we analyze the parts of the running time that we have not yet accounted for. There are two parts: The cost of calls to when search for scapegoat nodes, and the cost of calls to when we find a scapegoat . The cost of calls to can be related to the cost of calls to , as follows:
All that remains is to prove an upper-bound on the cost of calls to :
During an insertion or deletion, we give one credit to each node on the path to the inserted node, or deleted node, . In this way we hand out at most credits per operation. During a deletion we also store an additional 1 credit ``on the side.'' Thus, in total we give out at most credits. All that remains is to show that these credits are sufficient to pay for all calls to .
If we call during an insertion, it is because is a scapegoat. Suppose, without loss of generality, that
If we call during a deletion, it is because . In this case, we have credits stored ``on the side'' and we use these to pay for the time it takes to rebuild the root. This completes the proof.
Furthermore, beginning with an empty ScapegoatTree, any sequence of and operations results in a total of time spent during all calls to .
opendatastructures.org