13.4 Discussion and Exercises

The first data structure to provide $ O(\log \ensuremath{\mathtt{w}})$ time $ \mathtt{add(x)}$, $ \mathtt{remove(x)}$, and $ \mathtt{find(x)}$ operations was proposed by van Emde Boas and has since become known as the van Emde Boas (or stratified) tree. The original van Emde Boas structure had size $ 2^{\ensuremath{\mathtt{w}}}$, so was impractical for large integers.

The XFastTrie and YFastTrie data structures were discovered by Willard [64]. The XFastTrie structure is very closely related to van Emde Boas trees. One view of this is that the hash tables in an XFastTrie replace arrays in a van Emde Boas tree. That is, instead of storing the hash table $ \mathtt{t[i]}$, a van Emde Boas tree stores an array of length $ 2^{\ensuremath{\mathtt{i}}}$.

Another structure for storing integers is Fredman and Willard's fusion trees [24]. This structure can store $ \mathtt{n}$ $ \mathtt{w}$-bit integers in $ O(\ensuremath{\mathtt{n}})$ space so that the $ \mathtt{find(x)}$ operation runs in $ O((\log \ensuremath{\mathtt{n}})/(\log
\ensuremath{\mathtt{w}}))$ time. By using a fusion tree when $ \log \ensuremath{\mathtt{w}} > \sqrt{\log \ensuremath{\mathtt{n}}}$ and a YFastTrie when $ \log \ensuremath{\mathtt{w}} \le \sqrt{\log \ensuremath{\mathtt{n}}}$ one obtains an $ O(\ensuremath{\mathtt{n}})$ space data structure that can implement the $ \mathtt{find(x)}$ operation in $ O(\sqrt{\log \ensuremath{\mathtt{n}}})$ time. Recent lower-bound results of P{\v{a\/}}\kern.05emtra{\c{s\/}}cu and Thorup [50] show that these results are optimal, at least for structures that use only $ O(\ensuremath{\mathtt{n}})$ space.

Exercise 13..1   Design and implement a simplified version of a BinaryTrie that doesn't have a linked list or jump pointers, but can still do find(x) in O(w) time.

Exercise 13..2   Design and implement a simplified implementation of an XFastTrie that doesn't use a binary trie at all. Instead, you implementation should store everything is stored in a doubly-linked list and $ \ensuremath{\mathtt{w}}$ hash tables.

Exercise 13..3   For an element $ \mathtt{x}$, let $ \mathtt{d(x)}$ be the difference between $ \mathtt{x}$ and the value returned by $ \mathtt{find(x)}$ [if $ \mathtt{find(x)}$ returns $ \mathtt{null}$, then define $ \mathtt{d(x)}$. Design and implement a modified version of the $ \mathtt{find(x)}$ operation in an XFastTrie that runs in $ O(\log \ensuremath{\mathtt{d(x)}})$ expected time.

opendatastructures.org