13.1 BinaryTrie: A digital search tree

A BinaryTrie encodes a set of $ \ensuremath{\ensuremath{\mathit{w}}}$ bit integers in a binary tree. All leaves in the tree have depth $ \ensuremath{\ensuremath{\mathit{w}}}$ and each integer is encoded as a root-to-leaf path. The path for the integer $ \ensuremath{\ensuremath{\mathit{x}}}$ turns left at level $ \ensuremath{\ensuremath{\mathit{i}}}$ if the $ \ensuremath{\ensuremath{\mathit{i}}}$th most significant bit of $ \ensuremath{\ensuremath{\mathit{x}}}$ is a 0 and turns right if it is a 1. Figure 13.1 shows an example for the case $ \ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}=4$, in which the trie stores the integers 3(0011), 9(1001), 12(1100), and 13(1101).

Figure 13.1: The integers stored in a binary trie are encoded as root-to-leaf paths.
\includegraphics[width=\textwidth ]{figs-python/binarytrie-ex-1}

Because the search path for a value $ \ensuremath{\ensuremath{\mathit{x}}}$ depends on the bits of $ \ensuremath{\ensuremath{\mathit{x}}}$, it will be helpful to name the children of a node, $ \ensuremath{\ensuremath{\mathit{u}}}$, $ \ensuremath{\ensuremath{\mathit{u}}.\ensuremath{\mathit{child}}[0]}$ ( $ \ensuremath{\ensuremath{\mathit{left}}}$) and $ \ensuremath{\ensuremath{\mathit{u}}.\ensuremath{\mathit{child}}[1]}$ ( $ \ensuremath{\ensuremath{\mathit{right}}}$). These child pointers will actually serve double-duty. Since the leaves in a binary trie have no children, the pointers are used to string the leaves together into a doubly-linked list. For a leaf in the binary trie $ \ensuremath{\ensuremath{\mathit{u}}.\ensuremath{\mathit{child}}[0]}$ ( $ \ensuremath{\ensuremath{\mathit{prev}}}$) is the node that comes before $ \ensuremath{\ensuremath{\mathit{u}}}$ in the list and $ \ensuremath{\ensuremath{\mathit{u}}.\ensuremath{\mathit{child}}[1]}$ ( $ \ensuremath{\ensuremath{\mathit{next}}}$) is the node that follows $ \ensuremath{\ensuremath{\mathit{u}}}$ in the list. A special node, $ \ensuremath{\ensuremath{\mathit{dummy}}}$, is used both before the first node and after the last node in the list (see Section 3.2).

Each node, $ \ensuremath{\ensuremath{\mathit{u}}}$, also contains an additional pointer $ \ensuremath{\ensuremath{\mathit{u}}.\ensuremath{\mathit{jump}}}$. If $ \ensuremath{\ensuremath{\mathit{u}}}$'s left child is missing, then $ \ensuremath{\ensuremath{\mathit{u}}.\ensuremath{\mathit{jump}}}$ points to the smallest leaf in $ \ensuremath{\ensuremath{\mathit{u}}}$'s subtree. If $ \ensuremath{\ensuremath{\mathit{u}}}$'s right child is missing, then $ \ensuremath{\ensuremath{\mathit{u}}.\ensuremath{\mathit{jump}}}$ points to the largest leaf in $ \ensuremath{\ensuremath{\mathit{u}}}$'s subtree. An example of a BinaryTrie, showing $ \ensuremath{\ensuremath{\mathit{jump}}}$ pointers and the doubly-linked list at the leaves, is shown in Figure 13.2.

Figure 13.2: A BinaryTrie with $ \ensuremath{\ensuremath{\mathit{jump}}}$ pointers shown as curved dashed edges.
\includegraphics[width=\textwidth ]{figs-python/binarytrie-ex-2}

The $ \ensuremath{\mathrm{find}(\ensuremath{\mathit{x}})}$ operation in a BinaryTrie is fairly straightforward. We try to follow the search path for $ \ensuremath{\ensuremath{\mathit{x}}}$ in the trie. If we reach a leaf, then we have found $ \ensuremath{\ensuremath{\mathit{x}}}$. If we reach a node $ \ensuremath{\ensuremath{\mathit{u}}}$ where we cannot proceed (because $ \ensuremath{\ensuremath{\mathit{u}}}$ is missing a child), then we follow $ \ensuremath{\ensuremath{\mathit{u}}.\ensuremath{\mathit{jump}}}$, which takes us either to the smallest leaf larger than $ \ensuremath{\ensuremath{\mathit{x}}}$ or the largest leaf smaller than $ \ensuremath{\ensuremath{\mathit{x}}}$. Which of these two cases occurs depends on whether $ \ensuremath{\ensuremath{\mathit{u}}}$ is missing its left or right child, respectively. In the former case ( $ \ensuremath{\ensuremath{\mathit{u}}}$ is missing its left child), we have found the node we want. In the latter case ( $ \ensuremath{\ensuremath{\mathit{u}}}$ is missing its right child), we can use the linked list to reach the node we want. Each of these cases is illustrated in Figure 13.3.
\begin{leftbar}
\begin{flushleft}
\hspace*{1em} \ensuremath{\mathrm{find}(\ensur...
...\ensuremath{\mathit{u}}.\ensuremath{\mathit{x}}}\\
\end{flushleft}\end{leftbar}

Figure 13.3: The paths followed by $ \ensuremath{\mathrm{find}(5)}$ and $ \ensuremath{\mathrm{find}(8)}$.
\includegraphics[width=\textwidth ]{figs-python/binarytrie-ex-3}
The running-time of the $ \ensuremath{\mathrm{find}(\ensuremath{\mathit{x}})}$ method is dominated by the time it takes to follow a root-to-leaf path, so it runs in $ O(\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}})$ time.

The $ \ensuremath{\mathrm{add}(\ensuremath{\mathit{x}})}$ operation in a BinaryTrie is also fairly straightforward, but has a lot of work to do:

  1. It follows the search path for $ \ensuremath{\ensuremath{\mathit{x}}}$ until reaching a node $ \ensuremath{\ensuremath{\mathit{u}}}$ where it can no longer proceed.
  2. It creates the remainder of the search path from $ \ensuremath{\ensuremath{\mathit{u}}}$ to a leaf that contains $ \ensuremath{\ensuremath{\mathit{x}}}$.
  3. It adds the node, $ \ensuremath{u}'$, containing $ \ensuremath{\ensuremath{\mathit{x}}}$ to the linked list of leaves (it has access to the predecessor, $ \ensuremath{\ensuremath{\mathit{pred}}}$, of $ \ensuremath{u}'$ in the linked list from the $ \ensuremath{\ensuremath{\mathit{jump}}}$ pointer of the last node, $ \ensuremath{\ensuremath{\mathit{u}}}$, encountered during step 1.)
  4. It walks back up the search path for $ \ensuremath{\ensuremath{\mathit{x}}}$ adjusting $ \ensuremath{\ensuremath{\mathit{jump}}}$ pointers at the nodes whose $ \ensuremath{\ensuremath{\mathit{jump}}}$ pointer should now point to $ \ensuremath{\ensuremath{\mathit{x}}}$.
An addition is illustrated in Figure 13.4.
Figure 13.4: Adding the values 2 and 15 to the BinaryTrie in Figure 13.2.
\includegraphics[width=\textwidth ]{figs-python/binarytrie-add}

\begin{leftbar}
\begin{flushleft}
\hspace*{1em} \ensuremath{\mathrm{add}(\ensure...
...return}} \ensuremath{\ensuremath{\mathit{true}}}\\
\end{flushleft}\end{leftbar}
This method performs one walk down the search path for $ \ensuremath{\ensuremath{\mathit{x}}}$ and one walk back up. Each step of these walks takes constant time, so the $ \ensuremath{\mathrm{add}(\ensuremath{\mathit{x}})}$ method runs in $ O(\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}})$ time.

The $ \ensuremath{\mathrm{remove}(\ensuremath{\mathit{x}})}$ operation undoes the work of $ \ensuremath{\mathrm{add}(\ensuremath{\mathit{x}})}$. Like $ \ensuremath{\mathrm{add}(\ensuremath{\mathit{x}})}$, it has a lot of work to do:

  1. It follows the search path for $ \ensuremath{\ensuremath{\mathit{x}}}$ until reaching the leaf, $ \ensuremath{\ensuremath{\mathit{u}}}$, containing $ \ensuremath{\ensuremath{\mathit{x}}}$.
  2. It removes $ \ensuremath{\ensuremath{\mathit{u}}}$ from the doubly-linked list.
  3. It deletes $ \ensuremath{\ensuremath{\mathit{u}}}$ and then walks back up the search path for $ \ensuremath{\ensuremath{\mathit{x}}}$ deleting nodes until reaching a node $ \ensuremath{\ensuremath{\mathit{v}}}$ that has a child that is not on the search path for $ \ensuremath{\ensuremath{\mathit{x}}}$.
  4. It walks upwards from $ \ensuremath{\ensuremath{\mathit{v}}}$ to the root updating any $ \ensuremath{\ensuremath{\mathit{jump}}}$ pointers that point to $ \ensuremath{\ensuremath{\mathit{u}}}$.
A removal is illustrated in Figure 13.5.
Figure 13.5: Removing the value 9 from the BinaryTrie in Figure 13.2.
\includegraphics[scale=0.90909]{figs-python/binarytrie-remove}

\begin{leftbar}
\begin{flushleft}
\hspace*{1em} \ensuremath{\mathrm{remove}(\ens...
...return}} \ensuremath{\ensuremath{\mathit{true}}}\\
\end{flushleft}\end{leftbar}

Theorem 13..1   A BinaryTrie implements the SSet interface for $ \ensuremath{\ensuremath{\mathit{w}}}$-bit integers. A BinaryTrie supports the operations $ \ensuremath{\mathrm{add}(\ensuremath{\mathit{x}})}$, $ \ensuremath{\mathrm{remove}(\ensuremath{\mathit{x}})}$, and $ \ensuremath{\mathrm{find}(\ensuremath{\mathit{x}})}$ in $ O(\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}})$ time per operation. The space used by a BinaryTrie that stores $ \ensuremath{\ensuremath{\mathit{n}}}$ values is $ O(\ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}\cdot\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}})$.

opendatastructures.org