13. Data Structures for Integers

In this chapter, we return to the problem of implementing an $ \mathtt{SSet}$. The difference now is that we assume the elements stored in the $ \mathtt{SSet}$ are $ \mathtt{w}$-bit integers. That is, we want to implement $ \mathtt{add(x)}$, $ \mathtt{remove(x)}$, and $ \mathtt{find(x)}$ where $ \ensuremath{\mathtt{x}}\in\{0,\ldots,2^{\ensuremath{\mathtt{w}}}-1\}$. It is not too hard to think of plenty of applications where the data--or at least the key that we use for sorting the data--is an integer.

We will discuss three data structures, each building on the ideas of the previous. The first structure, the $ \mathtt{BinaryTrie}$ performs all three $ \mathtt{SSet}$ operations in $ O(\ensuremath{\mathtt{w}})$ time. This is not very impressive, since any subset of $ \{0,\ldots,2^{\ensuremath{\mathtt{w}}}-1\}$ has size $ \ensuremath{\mathtt{n}}\le 2^{\ensuremath{\mathtt{w}}}$, so that $ \log \ensuremath{\mathtt{n}} \le \ensuremath{\mathtt{w}}$. All the other $ \mathtt{SSet}$ implementations discussed in this book perform all operations in $ O(\log \ensuremath{\mathtt{n}})$ time so they are all at least as fast as a $ \mathtt{BinaryTrie}$.

The second structure, the $ \mathtt{XFastTrie}$, speeds up the search in a $ \mathtt{BinaryTrie}$ by using hashing. With this speedup, the $ \mathtt{find(x)}$ operation runs in $ O(\log \ensuremath{\mathtt{w}})$ time. However, $ \mathtt{add(x)}$ and $ \mathtt{remove(x)}$ operations in an $ \mathtt{XFastTrie}$ still take $ O(\ensuremath{\mathtt{w}})$ time and the space used by an $ \mathtt{XFastTrie}$ is $ O(\ensuremath{\mathtt{n}}\cdot\ensuremath{\mathtt{w}})$.

The third data structure, the $ \mathtt{YFastTrie}$, uses an $ \mathtt{XFastTrie}$ to store only a sample of roughly one out of every $ \ensuremath{\mathtt{w}}$ elements and stores the remaining elements in a standard $ \mathtt{SSet}$ structure. This trick reduces the running time of $ \mathtt{add(x)}$ and $ \mathtt{remove(x)}$ to $ O(\log \ensuremath{\mathtt{w}})$ and decreases the space to $ O(\ensuremath{\mathtt{n}})$.

The implementations used as examples in this chapter can store any type of data, as long as an integer can be associated with it. In the code samples, the variable $ \mathtt{ix}$ is always the integer value associated with $ \mathtt{x}$, and the method $ \mathtt{intValue(x)}$ converts $ \mathtt{x}$ to its associated integer. In the text, however, we will simply treat $ \mathtt{x}$ as if it is an integer.