The performance of the BinaryTrie structure is not that impressive. The number of elements, $n$, stored in the structure is at most $2^{w}$, so $\log n \le w$. In other words, any of the comparison-based SSet structures described in other parts of this book are at least as efficient as a BinaryTrie, and don't have the restriction of only being able to store integers.
Next we describe the XFastTrie, which is just a BinaryTrie with $w+1$ hash tables, one for each level of the trie. These hash tables are used to speed up the find(x) operation to $O(\log w)$ time.
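For illustration, the data layout can be pictured as a BinaryTrie augmented with one hash table per level. The following sketch is not the book's actual class; the field names are assumed from the BinaryTrie description, and std::unordered_set stands in for the hash tables of Chapter 5:

    #include <unordered_set>

    // Illustrative layout only (not the book's declarations): a BinaryTrie
    // node plus one set of node labels for each of the w+1 levels.
    struct Node {
        Node *left, *right;   // trie children
        Node *jump;           // jump pointer to a leaf, as in the BinaryTrie
        Node *prev, *next;    // doubly-linked list of leaves
    };

    struct XFastTrieSketch {
        static const int w = 32;                // word size (assumed)
        Node root;                              // root of the underlying BinaryTrie
        std::unordered_set<unsigned> t[w + 1];  // t[i]: labels of the nodes at level i
    };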
Recall that the find(x) operation in a BinaryTrie is almost complete once we reach a node, u, where the search path for x would like to proceed to u.right (or u.left) but u has no right (respectively, left) child. At this point, the search uses u.jump to jump to a leaf, v, of the BinaryTrie and either returns v or its successor in the linked list of leaves. An XFastTrie speeds up the search process by using binary search on the levels of the trie to locate the node u.
To use binary search, we need a way to determine if the node u we are looking for is above a particular level, i, or if u is at or below level i. This information is given by the highest-order i bits in the binary representation of x; these bits determine the search path that x takes from the root to level i. For an example, refer to Figure 13.6; in this figure the last node, u, on the search path for 14 (whose binary representation is 1110) is the node labelled 11⋆⋆ at level 2, because there is no node labelled 111⋆ at level 3. Thus, we can label each node at level i with an i-bit integer. Then the node u we are searching for is at or below level i if and only if there is a node at level i whose label matches the highest-order i bits of x.
[Figure 13.6: Since there is no node labelled 111⋆, the search path for 14 (1110) ends at the node labelled 11⋆⋆ at level 2.]
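To make the labelling rule concrete, here is a hypothetical helper (not part of the book's code) that computes the label of the level-i node on the search path for x:

    // The label of the node at level i on x's search path is simply
    // x's highest-order i bits.
    unsigned levelLabel(unsigned x, int w, int i) {
        return x >> (w - i);
    }

    // For w = 4 and x = 14 (binary 1110), as in Figure 13.6:
    //   levelLabel(14, 4, 2) == 3   (binary 11):  a node labelled 11⋆⋆ exists
    //   levelLabel(14, 4, 3) == 7   (binary 111): no node labelled 111⋆, so the
    //                                             search path ends at level 2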
In an XFastTrie, we store, for each $i\in\{0,\ldots,w\}$, all the nodes at level i in a USet, t[i], that is implemented as a hash table (Chapter 5). Using this USet allows us to check in constant expected time if there is a node at level i whose label matches the highest-order i bits of x. In fact, we can even find this node using t[i].find(x>>(w-i)).
The hash tables t[0],...,t[w] allow us to use binary search to find u. Initially, we know that u is at some level i with $0 \le i < w+1$. We therefore initialize $l=0$ and $h=w+1$ and repeatedly look at the hash table t[i], where $i=\lfloor (l+h)/2\rfloor$. If t[i] contains a node whose label matches x's highest-order i bits, then we set $l=i$ (u is at or below level i); otherwise we set $h=i$ (u is above level i). This process terminates when $h-l\le 1$, in which case we determine that u is at level l. We then complete the find(x) operation using u.jump and the doubly-linked list of leaves.
    T find(T x) {
        int l = 0, h = w+1;
        unsigned ix = intValue(x);
        Node *v, *u = &r;
        while (h-l > 1) {
            int i = (l+h)/2;
            XPair<Node> p(ix >> (w-i));
            if ((v = t[i].find(p).u) == NULL) {
                h = i;             // no node at level i matches: u is above level i
            } else {
                u = v; l = i;      // a node at level i matches: u is at or below level i
            }
        }
        if (l == w) return u->x;   // x itself is stored in the trie
        Node *pred = (((ix >> (w-l-1)) & 1) == 1) ? u->jump : u->jump->prev;
        return (pred->next == &dummy) ? NULL : pred->next->x;
    }

Each iteration of the while loop in this method decreases $h-l$ by roughly a factor of two, so the loop locates u after $O(\log w)$ iterations. Each iteration performs a constant amount of work plus one find operation on a hash table, which takes constant expected time. Therefore, the find(x) operation in an XFastTrie runs in $O(\log w)$ expected time.
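Tracing the binary search on the example of Figure 13.6 (the level contents are read off that figure, so they are assumptions about the stored set rather than part of the code above):

    // Trace for x = 14 (binary 1110) and w = 4:
    //   l = 0, h = 5:  i = 2, t[2] contains the label 11 (3)  ->  set l = 2
    //   l = 2, h = 5:  i = 3, t[3] has no label 111 (7)       ->  set h = 3
    //   l = 2, h = 3:  h - l <= 1, so the loop stops and u is the
    //                  node labelled 11⋆⋆ at level 2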
The add(x) and remove(x) methods for an XFastTrie are almost identical to the same methods in a BinaryTrie. The only modifications are for managing the hash tables t[0],...,t[w]. During the add(x) operation, when a new node is created at level i, this node is added to t[i]. During a remove(x) operation, when a node is removed from level i, this node is removed from t[i]. Since adding to and removing from a hash table take constant expected time, this does not increase the running times of add(x) and remove(x) by more than a constant factor. We omit a code listing for add(x) and remove(x) since it is almost identical to the (long) code listing already provided for the same methods in a BinaryTrie.
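For reference, the only new bookkeeping is the per-level hash-table update. A standalone sketch of that step (not the book's code; std::unordered_map replaces the book's USet, and the surrounding BinaryTrie logic is elided) might look like this:

    #include <unordered_map>

    struct Node;  // the BinaryTrie node type

    // Called from add(x) right after a new node u is created at level i;
    // ix is intValue(x) and w is the word size.
    void registerNode(std::unordered_map<unsigned, Node*> t[],
                      unsigned ix, int w, int i, Node *u) {
        t[i][ix >> (w - i)] = u;    // store the new level-i node under its label
    }

    // Called from remove(x) right after the node at level i is unlinked.
    void unregisterNode(std::unordered_map<unsigned, Node*> t[],
                        unsigned ix, int w, int i) {
        t[i].erase(ix >> (w - i));  // drop the level-i node's label
    }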
The following theorem summarizes the performance of an XFastTrie: