13. Data Structures for Integers

In this chapter, we return to the problem of implementing an SSet. The difference now is that we assume the elements stored in the SSet are $ \mathtt{w}$-bit integers. That is, we want to implement $ \mathtt{add(x)}$, $ \mathtt{remove(x)}$, and $ \mathtt{find(x)}$ where $ \ensuremath{\mathtt{x}}\in\{0,\ldots,2^{\ensuremath{\mathtt{w}}}-1\}$. It is not too hard to think of plenty of applications where the data--or at least the key that we use for sorting the data--is an integer.

We will discuss three data structures, each building on the ideas of the previous. The first structure, the BinaryTrie performs all three SSet operations in $ O(\ensuremath{\mathtt{w}})$ time. This is not very impressive, since any subset of $ \{0,\ldots,2^{\ensuremath{\mathtt{w}}}-1\}$ has size $ \ensuremath{\mathtt{n}}\le 2^{\ensuremath{\mathtt{w}}}$, so that $ \log \ensuremath{\mathtt{n}} \le \ensuremath{\mathtt{w}}$. All the other SSet implementations discussed in this book perform all operations in $ O(\log \ensuremath{\mathtt{n}})$ time so they are all at least as fast as a BinaryTrie.

The second structure, the XFastTrie, speeds up the search in a BinaryTrie by using hashing. With this speedup, the $ \mathtt{find(x)}$ operation runs in $ O(\log \ensuremath{\mathtt{w}})$ time. However, $ \mathtt{add(x)}$ and $ \mathtt{remove(x)}$ operations in an XFastTrie still take $ O(\ensuremath{\mathtt{w}})$ time and the space used by an XFastTrie is $ O(\ensuremath{\mathtt{n}}\cdot\ensuremath{\mathtt{w}})$.

The third data structure, the YFastTrie, uses an XFastTrie to store only a sample of roughly one out of every $ \ensuremath{\mathtt{w}}$ elements and stores the remaining elements a standard SSet structure. This trick reduces the running time of $ \mathtt{add(x)}$ and $ \mathtt{remove(x)}$ to $ O(\log \ensuremath{\mathtt{w}})$ and decreases the space to $ O(\ensuremath{\mathtt{n}})$.

The implementations used as examples in this chapter can store any type of data, as long an integer can be associated with it. In the code samples, the variable $ \mathtt{ix}$ is always the integer value associated with $ \mathtt{x}$, and the method $ \mathtt{in.}$ $ \mathtt{intValue(x)}$ converts $ \mathtt{x}$ to its associated integer. In the text, however, we will simply treat $ \mathtt{x}$ as if it is an integer.