13.2 XFastTrie: Searching in Doubly-Logarithmic Time

The performance of the BinaryTrie structure is not very impressive. The number of elements, $\ensuremath{\ensuremath{\mathit{n}}}$ , stored in the structure is at most $2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}$ , so $\log \ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}\le \ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}$ . In other words, any of the comparison-based SSet structures described in other parts of this book are at least as efficient as a BinaryTrie, and are not restricted to only storing integers.

Next we describe the XFastTrie, which is just a BinaryTrie with $\ensuremath{\ensuremath{\mathit{w}}+1}$ hash tables--one for each level of the trie. These hash tables are used to speed up the $\ensuremath{\mathrm{find}(\ensuremath{\mathit{x}})}$ operation to $O(\log \ensuremath{\ensuremath{\ensuremath{\mathit{w}}}})$ time. Recall that the $\ensuremath{\mathrm{find}(\ensuremath{\mathit{x}})}$ operation in a BinaryTrie is almost complete once we reach a node, $\ensuremath{\ensuremath{\mathit{u}}}$ , where the search path for $\ensuremath{\ensuremath{\mathit{x}}}$ would like to proceed to $\ensuremath{\ensuremath{\mathit{u}}.\ensuremath{\mathit{right}}}$ (or $\ensuremath{\ensuremath{\mathit{u}}.\ensuremath{\mathit{left}}}$ ) but $\ensuremath{\ensuremath{\mathit{u}}}$ has no right (respectively, left) child. At this point, the search uses $\ensuremath{\ensuremath{\mathit{u}}.\ensuremath{\mathit{jump}}}$ to jump to a leaf, $\ensuremath{\ensuremath{\mathit{v}}}$ , of the BinaryTrie and either return $\ensuremath{\ensuremath{\mathit{v}}}$ or its successor in the linked list of leaves. An XFastTrie speeds up the search process by using binary search on the levels of the trie to locate the node $\ensuremath{\ensuremath{\mathit{u}}}$ .

To use binary search, we need a way to determine if the node $\ensuremath{\ensuremath{\mathit{u}}}$ we are looking for is above a particular level, $\ensuremath{\ensuremath{\mathit{i}}}$ , of if $\ensuremath{\ensuremath{\mathit{u}}}$ is at or below level $\ensuremath{\ensuremath{\mathit{i}}}$ . This information is given by the highest-order $\ensuremath{\ensuremath{\mathit{i}}}$ bits in the binary representation of $\ensuremath{\ensuremath{\mathit{x}}}$ ; these bits determine the search path that $\ensuremath{\ensuremath{\mathit{x}}}$ takes from the root to level $\ensuremath{\ensuremath{\mathit{i}}}$ . For an example, refer to Figure 13.6; in this figure the last node, $\ensuremath{\ensuremath{\mathit{u}}}$ , on search path for 14 (whose binary representation is 1110) is the node labelled $11{\star\star}$ at level 2 because there is no node labelled $111{\star}$ at level 3. Thus, we can label each node at level $\ensuremath{\ensuremath{\mathit{i}}}$ with an $\ensuremath{\ensuremath{\mathit{i}}}$ -bit integer. Then, the node $\ensuremath{\ensuremath{\mathit{u}}}$ we are searching for would be at or below level $\ensuremath{\ensuremath{\mathit{i}}}$ if and only if there is a node at level $\ensuremath{\ensuremath{\mathit{i}}}$ whose label matches the highest-order $\ensuremath{\ensuremath{\mathit{i}}}$ bits of $\ensuremath{\ensuremath{\mathit{x}}}$ .

**Figure 13.6:** Since there is no node labelled $111\star$ , the search path for 14 (1110) ends at the node labelled $11{\star\star}$ .
$\includegraphics[scale=0.90909]{figs-python/xfast-path}$

In an XFastTrie, we store, for each $\ensuremath{\ensuremath{\ensuremath{\mathit{i}}}}\in\{0,\ldots,\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}\}$ , all the nodes at level $\ensuremath{\ensuremath{\mathit{i}}}$ in a USet, $\ensuremath{\ensuremath{\mathit{t}}[\ensuremath{\mathit{i}}]}$ , that is implemented as a hash table (Chapter 5). Using this USet allows us to check in constant expected time if there is a node at level $\ensuremath{\ensuremath{\mathit{i}}}$ whose label matches the highest-order $\ensuremath{\ensuremath{\mathit{i}}}$ bits of $\ensuremath{\ensuremath{\mathit{x}}}$ . In fact, we can even find this node using $\ensuremath{\ensuremath{\mathit{t}}[\ensuremath{\mathit{i}}].\mathrm{find}(\en... ...{\mathit{x}}\ensuremath{\gg}(\ensuremath{\mathit{w}}-\ensuremath{\mathit{i}})})$

The hash tables $\ensuremath{\ensuremath{\ensuremath{\mathit{t}}[0]}},\ldots,\ensuremath{\ensuremath{\ensuremath{\mathit{t}}[\ensuremath{\mathit{w}}]}}$ allow us to use binary search to find $\ensuremath{\ensuremath{\mathit{u}}}$ . Initially, we know that $\ensuremath{\ensuremath{\mathit{u}}}$ is at some level $\ensuremath{\ensuremath{\mathit{i}}}$ with $0\le \ensuremath{\ensuremath{\ensuremath{\mathit{i}}}}< \ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}+1$ . We therefore initialize $\ensuremath{\ensuremath{\ensuremath{\mathit{\ell}}}}=0$ and $\ensuremath{\ensuremath{\ensuremath{\mathit{h}}}}=\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}+1$ and repeatedly look at the hash table $\ensuremath{\ensuremath{\mathit{t}}[\ensuremath{\mathit{i}}]}$ , where $\ensuremath{\ensuremath{\ensuremath{\mathit{i}}}}=\lfloor (\ensuremath{\ensuremath{\ensuremath{\mathit{\ell}}+\ensuremath{\mathit{h}}}})/2\rfloor$ . If $\ensuremath{\ensuremath{\ensuremath{\mathit{t}}[\ensuremath{\mathit{i}}]}}$ contains a node whose label matches $\ensuremath{\ensuremath{\mathit{x}}}$ 's highest-order $\ensuremath{\ensuremath{\mathit{i}}}$ bits then we set $\ensuremath{\ensuremath{\mathit{\ell}}\gets \ensuremath{i}}$ ( $\ensuremath{\ensuremath{\mathit{u}}}$ is at or below level $\ensuremath{\ensuremath{\mathit{i}}}$ ); otherwise we set $\ensuremath{\ensuremath{\mathit{h}}\gets \ensuremath{i}}$ ( $\ensuremath{\ensuremath{\mathit{u}}}$ is above level $\ensuremath{\ensuremath{\mathit{i}}}$ ). This process terminates when $\ensuremath{\ensuremath{\ensuremath{\mathit{h}}-\ensuremath{\mathit{\ell}}}}\le 1$ , in which case we determine that $\ensuremath{\ensuremath{\mathit{u}}}$ is at level $\ensuremath{\ensuremath{\mathit{\ell}}}$ . We then complete the $\ensuremath{\mathrm{find}(\ensuremath{\mathit{x}})}$ operation using $\ensuremath{\ensuremath{\mathit{u}}.\ensuremath{\mathit{jump}}}$ and the doubly-linked list of leaves.
$\begin{leftbar} \begin{flushleft} \hspace*{1em} \ensuremath{\mathrm{find}(\ensur... ...suremath{\mathit{next}}.\ensuremath{\mathit{x}}}\\ \end{flushleft}\end{leftbar}$
Each iteration of the ${\color{black} \textbf{while}}$ loop in the above method decreases $\ensuremath{\ensuremath{\mathit{h}}-\ensuremath{\mathit{\ell}}}$ by roughly a factor of two, so this loop finds $\ensuremath{\ensuremath{\mathit{u}}}$ after $O(\log \ensuremath{\ensuremath{\ensuremath{\mathit{w}}}})$ iterations. Each iteration performs a constant amount of work and one $\ensuremath{\mathrm{find}(\ensuremath{\mathit{x}})}$ operation in a USet, which takes a constant expected amount of time. The remaining work takes only constant time, so the $\ensuremath{\mathrm{find}(\ensuremath{\mathit{x}})}$ method in an XFastTrie takes only $O(\log\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}})$ expected time.

The $\ensuremath{\mathrm{add}(\ensuremath{\mathit{x}})}$ and $\ensuremath{\mathrm{remove}(\ensuremath{\mathit{x}})}$ methods for an XFastTrie are almost identical to the same methods in a BinaryTrie. The only modifications are for managing the hash tables $\ensuremath{\ensuremath{\mathit{t}}[0]}$ ,..., $\ensuremath{\ensuremath{\mathit{t}}[\ensuremath{\mathit{w}}]}$ . During the $\ensuremath{\mathrm{add}(\ensuremath{\mathit{x}})}$ operation, when a new node is created at level $\ensuremath{\ensuremath{\mathit{i}}}$ , this node is added to $\ensuremath{\ensuremath{\mathit{t}}[\ensuremath{\mathit{i}}]}$ . During a $\ensuremath{\mathrm{remove}(\ensuremath{\mathit{x}})}$ operation, when a node is removed form level $\ensuremath{\ensuremath{\mathit{i}}}$ , this node is removed from $\ensuremath{\ensuremath{\mathit{t}}[\ensuremath{\mathit{i}}]}$ . Since adding and removing from a hash table take constant expected time, this does not increase the running times of $\ensuremath{\mathrm{add}(\ensuremath{\mathit{x}})}$ and $\ensuremath{\mathrm{remove}(\ensuremath{\mathit{x}})}$ by more than a constant factor. We omit a code listing for $\ensuremath{\mathrm{add}(\ensuremath{\mathit{x}})}$ and $\ensuremath{\mathrm{remove}(\ensuremath{\mathit{x}})}$ since the code is almost identical to the (long) code listing already provided for the same methods in a BinaryTrie.

Theorem 13..2 An XFastTrie implements the SSet interface for $\ensuremath{\ensuremath{\mathit{w}}}$ -bit integers. An XFastTrie supports the operations

$\ensuremath{\mathrm{add}(\ensuremath{\mathit{x}})}$ and $\ensuremath{\mathrm{remove}(\ensuremath{\mathit{x}})}$ in $O(\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}})$ expected time per operation and
$\ensuremath{\mathrm{find}(\ensuremath{\mathit{x}})}$ in $O(\log \ensuremath{\ensuremath{\ensuremath{\mathit{w}}}})$ expected time per operation.

The space used by an XFastTrie that stores $\ensuremath{\ensuremath{\mathit{n}}}$ values is $O(\ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}\cdot\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}})$ .