A encode a set of bit integers in a binary tree. All leaves in the tree have depth and each integer is encoded as a root-to-leaf path. The path for the integer turns left at level if the th most significant bit of is a 0 and turns right if it is a 1. Figure 13.1 shows an example for the case , in which the trie stores the integers 3(0011), 9(1001), 12(1100), and 13(1101).
Because the search path for a value depends on the bits of it will be helpful to name the children of a node, , ( ) and ( ). These child pointers will actually serve double-duty. Since the leaves in a binary trie have no children, the pointers are used to string the leaves together into a doubly-linked list. For a leaf in the binary trie ( ) is the node that comes before in the list and ( ) is the node that follows in the list. A special node, , is used both before the first node and after the last node in the list (see Section 3.2). In the code samples, , , and refer to the same field in the node , as do , , and .
Each node, , also contains an additional pointer . If 's left child is missing, then points to the smallest leaf in 's subtree. If 's right child is missing, then points to the largest leaf in 's subtree. An example of a , showing pointers and the doubly-linked list at the leaves, is shown in Figure 13.2
The operation in a is fairly straightforward. We try to follow the search path for in the trie. If we reach a leaf, then we have found . If, we reach a node where we cannot proceed (because is missing a child) then we follow , which takes us either to smallest leaf larger than or the largest leaf smaller than . Which of these two cases occurs depends on whether is missing its left or right child, respectively. In the former case ( is missing its left child), we have found the value we are looking for. In the latter case ( is missing its right child), we can use the linked list to reach the value we are looking for. Each of these cases is illustrated in Figure 13.3.
T find(T x) { int i, c = 0; unsigned ix = intValue(x); Node *u = &r; for (i = 0; i < w; i++) { c = (ix >> (w-i-1)) & 1; if (u->child[c] == NULL) break; u = u->child[c]; } if (i == w) return u->x; // found it u = (c == 0) ? u->jump : u->jump->next; return u == &dummy ? NULL : u->x; }The running-time of the method is dominated by the time it takes to follow a root-to-leaf path, so it runs in time.
The operation in a is fairly straightforward, but it has a lot of things to take care of:
bool add(T x) { int i, c = 0; unsigned ix = intValue(x); Node *u = &r; // 1 - search for ix until falling out of the trie for (i = 0; i < w; i++) { c = (ix >> (w-i-1)) & 1; if (u->child[c] == NULL) break; u = u->child[c]; } if (i == w) return false; // trie already contains x - abort Node *pred = (c == right) ? u->jump : u->jump->left; // save for step 3 u->jump = NULL; // u will have two children shortly // 2 - add path to ix for (; i < w; i++) { c = (ix >> (w-i-1)) & 1; u->child[c] = new Node(); u->child[c]->parent = u; u = u->child[c]; } u->x = x; // 3 - add u to linked list u->prev = pred; u->next = pred->next;; u->prev->next = u; u->next->prev = u; // 4 - walk back up, updating jump pointers Node *v = u->parent; while (v != NULL) { if ((v->left == NULL && (v->jump == NULL || intValue(v->jump->x) > ix)) || (v->right == NULL && (v->jump == NULL || intValue(v->jump->x) < ix))) v->jump = u; v = v->parent; } n++; return true; }This method performs one walk down the search path for and one walk back up. Each step of these walks takes constant time, so the runs in time.
The operation undoes the work of . Like , it has a lot of things to take care of:
bool remove(T x) { // 1 - find leaf, u, containing x int i = 0, c; unsigned ix = intValue(x); Node *u = &r; for (i = 0; i < w; i++) { c = (ix >> (w-i-1)) & 1; if (u->child[c] == NULL) return false; u = u->child[c]; } // 2 - remove u from linked list u->prev->next = u->next; u->next->prev = u->prev; Node *v = u; // 3 - delete nodes on path to u for (i = w-1; i >= 0; i--) { c = (ix >> (w-i-1)) & 1; v = v->parent; delete v->child[c]; v->child[c] = NULL; if (v->child[1-c] != NULL) break; } // 4 - update jump pointers v->jump = u; for (; i >= 0; i--) { c = (ix >> (w-i-1)) & 1; if (v->jump == u) v->jump = u->child[1-c]; v = v->parent; } n--; return true; }
opendatastructures.org