Loading web-font TeX/Math/Italic
Open Data Struc­tures
\newcommand{\E}{\mathrm{E}} \DeclareMathOperator{\ddiv}{div}
Con­tents
4 Skiplists

In this chap­ter, we dis­cuss a beau­ti­ful data struc­ture: the skiplist, which has a va­ri­ety of ap­pli­ca­tions. Using a skiplist we can im­ple­ment a List that has O(\log n) time im­ple­men­ta­tions of get(i), set(i,x), add(i,x), and remove(i). We can also im­ple­ment an SSet in which all op­er­a­tions run in O(\log \texttt{n}) ex­pected time.

The ef­fi­ciency of skiplists re­lies on their use of ran­dom­iza­tion. When a new el­e­ment is added to a skiplist, the skiplist uses ran­dom coin tosses to de­ter­mine the height of the new el­e­ment. The per­for­mance of skiplists is ex­pressed in terms of ex­pected run­ning times and path lengths. This ex­pec­ta­tion is taken over the ran­dom coin tosses used by the skiplist. In the im­ple­men­ta­tion, the ran­dom coin tosses used by a skiplist are sim­u­lated using a pseudo-ran­dom num­ber (or bit) gen­er­a­tor.

4.1 The Basic Struc­ture

Con­cep­tu­ally, a skiplist is a se­quence of singly-linked lists L_0,\ldots,L_h. Each list L_r con­tains a sub­set of the items in L_{r-1}. We start with the input list L_0 that con­tains n items and con­struct L_1 from L_0, L_2 from L_1, and so on. The items in L_r are ob­tained by toss­ing a coin for each el­e­ment, x, in L_{r-1} and in­clud­ing x in L_r if the coin turns up as heads. This process ends when we cre­ate a list L_r that is empty. An ex­am­ple of a skiplist is shown in Fig­ure 4.1.

Fig­ure 4.1 A skiplist con­tain­ing seven el­e­ments.

For an el­e­ment, x, in a skiplist, we call the height of x the largest value r such that x ap­pears in L_r. Thus, for ex­am­ple, el­e­ments that only ap­pear in L_0 have height 0. If we spend a few mo­ments think­ing about it, we no­tice that the height of x cor­re­sponds to the fol­low­ing ex­per­i­ment: Toss a coin re­peat­edly until it comes up as tails. How many times did it come up as heads? The an­swer, not sur­pris­ingly, is that the ex­pected height of a node is 1. (We ex­pect to toss the coin twice be­fore get­ting tails, but we don't count the last toss.) The height of a skiplist is the height of its tallest node.

At the head of every list is a spe­cial node, called the sen­tinel, that acts as a dummy node for the list. The key prop­erty of skiplists is that there is a short path, called the search path, from the sen­tinel in L_h to every node in L_0. Re­mem­ber­ing how to con­struct a search path for a node, u, is easy (see Fig­ure 4.2) : Start at the top left cor­ner of your skiplist (the sen­tinel in L_h) and al­ways go right un­less that would over­shoot u, in which case you should take a step down into the list below.

More pre­cisely, to con­struct the search path for the node u in L_0, we start at the sen­tinel, w, in L_h. Next, we ex­am­ine w.next. If w.next con­tains an item that ap­pears be­fore u in L_0, then we set \texttt{w}=\texttt{w.next}. Oth­er­wise, we move down and con­tinue the search at the oc­cur­rence of w in the list L_{h-1}. We con­tinue this way until we reach the pre­de­ces­sor of u in L_0.

Fig­ure 4.2 The search path for the node con­tain­ing 4 in a skiplist.

The fol­low­ing re­sult, which we will prove in Sec­tion 4.4, shows that the search path is quite short:

Lemma 4.1 The ex­pected length of the search path for any node, u, in L_0 is at most 2\log \texttt{n} + O(1) = O(\log \texttt{n}).

A space-ef­fi­cient way to im­ple­ment a skiplist is to de­fine a Node, u, as con­sist­ing of a data value, x, and an array, next, of point­ers, where u.next[i] points to u's suc­ces­sor in the list L_{\texttt{i}}. In this way, the data, x, in a node is ref­er­enced only once, even though x may ap­pear in sev­eral lists.

    class Node<T> {
        T x;
        Node<T>[] next;
        Node(T ix, int h) {
            x = ix;
            next = (Node<T>[])Array.newInstance(Node.class, h+1);
        }
        int height() {
            return next.length - 1;
        }
    }

The next two sec­tions of this chap­ter dis­cuss two dif­fer­ent ap­pli­ca­tions of skiplists. In each of these ap­pli­ca­tions, L_0 stores the main struc­ture (a list of el­e­ments or a sorted set of el­e­ments). The pri­mary dif­fer­ence be­tween these struc­tures is in how a search path is nav­i­gated; in par­tic­u­lar, they dif­fer in how they de­cide if a search path should go down into L_{r-1} or go right within L_r.

4.2 SkiplistSSet: An Ef­fi­cient SSet

A SkiplistSSet uses a skiplist struc­ture to im­ple­ment the SSet in­ter­face. When used in this way, the list L_0 stores the el­e­ments of the SSet in sorted order. The find(x) method works by fol­low­ing the search path for the small­est value y such that \texttt{y}\ge\texttt{x}:

    T find(T x) {
        Node<T> u = findPredNode(x);
        return u.next[0] == null ? null : u.next[0].x;
    }
    Node<T> findPredNode(T x) {
        Node<T> u = sentinel;
        int r = h;
        while (r >= 0) {
            while (u.next[r] != null && c.compare(u.next[r].x,x) < 0)
                u = u.next[r];   // go right in list r
            r--;               // go down into list r-1
        }
        return u;
    }

Fol­low­ing the search path for y is easy: when sit­u­ated at some node, u, in L_{\texttt{r}}, we look right to u.next[r].x. If \texttt{x}>\texttt{u.next[r].x}, then we take a step to the right in L_{\texttt{r}}; oth­er­wise, we move down into L_{\texttt{r}-1}. Each step (right or down) in this search takes only con­stant time; thus, by Lemma 4.1, the ex­pected run­ning time of find(x) is O(\log \texttt{n}).

Be­fore we can add an el­e­ment to a SkipListSSet, we need a method to sim­u­late toss­ing coins to de­ter­mine the height, k, of a new node. We do so by pick­ing a ran­dom in­te­ger, z, and count­ing the num­ber of trail­ing 1s in the bi­nary rep­re­sen­ta­tion of z:7This method does not ex­actly repli­cate the coin-toss­ing ex­per­i­ment since the value of k will al­ways be less than the num­ber of bits in an int. How­ever, this will have neg­li­gi­ble im­pact un­less the num­ber of el­e­ments in the struc­ture is much greater than 2^{32}=4294967296.

    int pickHeight() {
        int z = rand.nextInt();
        int k = 0;
        int m = 1;
        while ((z & m) != 0) {
            k++;
            m <<= 1;
        }
        return k;
    }

To im­ple­ment the add(x) method in a SkiplistSSet we search for x and then splice x into a few lists L_0,…,L_{\texttt{k}}, where k is se­lected using the pickHeight() method. The eas­i­est way to do this is to use an array, stack, that keeps track of the nodes at which the search path goes down from some list L_{\texttt{r}} into L_{\texttt{r}-1}. More pre­cisely, stack[r] is the node in L_{\texttt{r}} where the search path pro­ceeded down into L_{\texttt{r}-1}. The nodes that we mod­ify to in­sert x are pre­cisely the nodes \texttt{stack[0]},\ldots,\texttt{stack[k]}. The fol­low­ing code im­ple­ments this al­go­rithm for add(x):

    boolean add(T x) {
        Node<T> u = sentinel;
        int r = h;
        int comp = 0;
        while (r >= 0) {
            while (u.next[r] != null 
                   && (comp = c.compare(u.next[r].x,x)) < 0)
                u = u.next[r];
            if (u.next[r] != null && comp == 0) return false;
            stack[r--] = u;          // going down, store u
        }
        Node<T> w = new Node<T>(x, pickHeight());
        while (h < w.height())
            stack[++h] = sentinel;   // height increased
        for (int i = 0; i < w.next.length; i++) {
            w.next[i] = stack[i].next[i];
            stack[i].next[i] = w;
        }
        n++;
        return true;
    }

Fig­ure 4.3 Adding the node con­tain­ing 3.5 to a skiplist. The nodes stored in stack are high­lighted.

Re­mov­ing an el­e­ment, x, is done in a sim­i­lar way, ex­cept that there is no need for stack to keep track of the search path. The re­moval can be done as we are fol­low­ing the search path. We search for x and each time the search moves down­ward from a node u, we check if \texttt{u.next.x}=\texttt{x} and if so, we splice u out of the list:

    boolean remove(T x) {
        boolean removed = false;
        Node<T> u = sentinel;
        int r = h;
        int comp = 0;
        while (r >= 0) {
            while (u.next[r] != null 
                   && (comp = c.compare(u.next[r].x, x)) < 0) {
                u = u.next[r];
            }
            if (u.next[r] != null && comp == 0) {
                removed = true;
                u.next[r] = u.next[r].next[r];
                if (u == sentinel && u.next[r] == null)
                    h--;  // height has gone down
            }
            r--;
        }
        if (removed) n--;
        return removed;
    }

Fig­ure 4.4 Re­mov­ing the node con­tain­ing 3 from a skiplist.

4.2.1 Sum­mary

The fol­low­ing the­o­rem sum­ma­rizes the per­for­mance of skiplists when used to im­ple­ment sorted sets:

The­o­rem 4.1 SkiplistSSet im­ple­ments the SSet in­ter­face. A SkiplistSSet sup­ports the op­er­a­tions add(x), remove(x), and find(x) in O(\log \texttt{n}) ex­pected time per op­er­a­tion.

4.3 SkiplistList: An Ef­fi­cient Ran­dom-Ac­cess List

A SkiplistList im­ple­ments the List in­ter­face using a skiplist struc­ture. In a SkiplistList, L_0 con­tains the el­e­ments of the list in the order in which they ap­pear in the list. As in a SkiplistSSet, el­e­ments can be added, re­moved, and ac­cessed in O(\log \texttt{n}) time.

For this to be pos­si­ble, we need a way to fol­low the search path for the ith el­e­ment in L_0. The eas­i­est way to do this is to de­fine the no­tion of the length of an edge in some list, L_{\texttt{r}}. We de­fine the length of every edge in L_{0} as 1. The length of an edge, e, in L_{\texttt{r}}, \texttt{r}>0, is de­fined as the sum of the lengths of the edges below e in L_{\texttt{r}-1}. Equiv­a­lently, the length of e is the num­ber of edges in L_0 below e. See Fig­ure 4.5 for an ex­am­ple of a skiplist with the lengths of its edges shown. Since the edges of skiplists are stored in ar­rays, the lengths can be stored the same way:

Fig­ure 4.5 The lengths of the edges in a skiplist.

    class Node {
        T x;
        Node[] next;
        int[] length;
        Node(T ix, int h) {
            x = ix;
            next = (Node[])Array.newInstance(Node.class, h+1);
            length = new int[h+1];
        }
        int height() {
            return next.length - 1;
        }
    }

The use­ful prop­erty of this de­f­i­n­i­tion of length is that, if we are cur­rently at a node that is at po­si­tion j in L_0 and we fol­low an edge of length \ell, then we move to a node whose po­si­tion, in L_0, is \texttt{j}+\ell. In this way, while fol­low­ing a search path, we can keep track of the po­si­tion, j, of the cur­rent node in L_0. When at a node, u, in L_{\texttt{r}}, we go right if j plus the length of the edge u.next[r] is less than i. Oth­er­wise, we go down into L_{\texttt{r}-1}.

    Node findPred(int i) {
        Node u = sentinel;
        int r = h;
        int j = -1;   // index of the current node in list 0
        while (r >= 0) {
            while (u.next[r] != null && j + u.length[r] < i) {
                j += u.length[r];
                u = u.next[r];
            }
            r--;
        }
        return u;
    }
    T get(int i) {
        return findPred(i).next[0].x;
    }
    T set(int i, T x) {
        Node u = findPred(i).next[0];
        T y = u.x;
        u.x = x;
        return y;
    }

Since the hard­est part of the op­er­a­tions get(i) and set(i,x) is find­ing the ith node in L_0, these op­er­a­tions run in O(\log \texttt{n}) time.

Adding an el­e­ment to a SkiplistList at a po­si­tion, i, is fairly sim­ple. Un­like in a SkiplistSSet, we are sure that a new node will ac­tu­ally be added, so we can do the ad­di­tion at the same time as we search for the new node's lo­ca­tion. We first pick the height, k, of the newly in­serted node, w, and then fol­low the search path for i. Any time the search path moves down from L_{\texttt{r}} with \texttt{r}\le \texttt{k}, we splice w into L_{\texttt{r}}. The only extra care needed is to en­sure that the lengths of edges are up­dated prop­erly. See Fig­ure 4.6.

Fig­ure 4.6 Adding an el­e­ment to a SkiplistList.

Note that, each time the search path goes down at a node, u, in L_{\texttt{r}}, the length of the edge u.next[r] in­creases by one, since we are adding an el­e­ment below that edge at po­si­tion i. Splic­ing the node w be­tween two nodes, u and z, works as shown in Fig­ure 4.7. While fol­low­ing the search path we are al­ready keep­ing track of the po­si­tion, j, of u in L_0. There­fore, we know that the length of the edge from u to w is \texttt{i}-\texttt{j}. We can also de­duce the length of the edge from w to z from the length, \ell, of the edge from u to z. There­fore, we can splice in w and up­date the lengths of the edges in con­stant time.

Fig­ure 4.7 Up­dat­ing the lengths of edges while splic­ing a node w into a skiplist.

This sounds more com­pli­cated than it is, for the code is ac­tu­ally quite sim­ple:

    void add(int i, T x) {
        Node w = new Node(x, pickHeight());
        if (w.height() > h) 
            h = w.height();
        add(i, w);
    }
    Node add(int i, Node w) {
        Node u = sentinel;
        int k = w.height();
        int r = h;
        int j = -1; // index of u
        while (r >= 0) {
            while (u.next[r] != null && j+u.length[r] < i) {
                j += u.length[r];
                u = u.next[r];
            }
            u.length[r]++; // accounts for new node in list 0
            if (r <= k) {
                w.next[r] = u.next[r];
                u.next[r] = w;
                w.length[r] = u.length[r] - (i - j);
                u.length[r] = i - j;
            }
            r--;
        }
        n++;
        return u;
    }

By now, the im­ple­men­ta­tion of the remove(i) op­er­a­tion in a SkiplistList should be ob­vi­ous. We fol­low the search path for the node at po­si­tion i. Each time the search path takes a step down from a node, u, at level r we decre­ment the length of the edge leav­ing u at that level. We also check if u.next[r] is the el­e­ment of rank i and, if so, splice it out of the list at that level. An ex­am­ple is shown in Fig­ure 4.8.

Fig­ure 4.8 Re­mov­ing an el­e­ment from a SkiplistList.
    T remove(int i) {
        T x = null;
        Node u = sentinel;
        int r = h;
        int j = -1; // index of node u
        while (r >= 0) {
            while (u.next[r] != null && j+u.length[r] < i) {
                j += u.length[r];
                u = u.next[r];
            }
            u.length[r]--;  // for the node we are removing
            if (j + u.length[r] + 1 == i && u.next[r] != null) {
                x = u.next[r].x;
                u.length[r] += u.next[r].length[r];
                u.next[r] = u.next[r].next[r];
                if (u == sentinel && u.next[r] == null)
                    h--;
            }
            r--;
        }
        n--;
        return x;
    }

4.3.1 Sum­mary

The fol­low­ing the­o­rem sum­ma­rizes the per­for­mance of the SkiplistList data struc­ture:

The­o­rem 4.2 A SkiplistList im­ple­ments the List in­ter­face. A SkiplistList sup­ports the op­er­a­tions get(i), set(i,x), add(i,x), and remove(i) in O(\log \texttt{n}) ex­pected time per op­er­a­tion.

4.4 Analy­sis of Skiplists

In this sec­tion, we an­a­lyze the ex­pected height, size, and length of the search path in a skiplist. This sec­tion re­quires a back­ground in basic prob­a­bil­ity. Sev­eral proofs are based on the fol­low­ing basic ob­ser­va­tion about coin tosses.

Lemma 4.2 Let T be the num­ber of times a fair coin is tossed up to and in­clud­ing the first time the coin comes up heads. Then \E[T]=2.

Proof Sup­pose we stop toss­ing the coin the first time it comes up heads. De­fine the in­di­ca­tor vari­able \begin{equation*} I_{i} = \left\{\begin{array} 0 & \mbox{if the coin is tossed less than \(i\) times} \\ 1 & \mbox{if the coin is tossed \(i\) or more times} \end{array}\right. \end{equation*}
Note that I_i=1 if and only if the first i-1 coin tosses are tails, so \E[I_i]=\Pr\{I_i=1\}=1/2^{i-1}. Ob­serve that T, the total num­ber of coin tosses, can be writ­ten as T=\sum_{i=1}^{\infty} I_i. There­fore, \begin{align*} \E[T] & = \E\left[\sum_{i=1}^\infty I_i\right] \\ & = \sum_{i=1}^\infty \E\left[I_i\right] \\ & = \sum_{i=1}^\infty 1/2^{i-1} \\ & = 1 + 1/2 + 1/4 + 1/8 + \cdots \\ & = 2 \enspace . \end{align*}

The next two lem­mata tell us that skiplists have lin­ear size:

Lemma 4.3 The ex­pected num­ber of nodes in a skiplist con­tain­ing \texttt{n} el­e­ments, not in­clud­ing oc­cur­rences of the sen­tinel, is 2\texttt{n}.

Proof The prob­a­bil­ity that any par­tic­u­lar el­e­ment, x, is in­cluded in list L_{\texttt{r}} is 1/2^{\texttt{r}}, so the ex­pected num­ber of nodes in L_{\texttt{r}} is \texttt{n}/2^{\texttt{r}}.8See Sec­tion 1.3.4 to see how this is de­rived using in­di­ca­tor vari­ables and lin­ear­ity of ex­pec­ta­tion. There­fore, the total ex­pected num­ber of nodes in all lists is \begin{equation*} \sum_{\texttt{r}=0}^\infty \texttt{n}/2^{\texttt{r}} = \texttt{n}(1+1/2+1/4+1/8+\cdots) = 2\texttt{n} \enspace . \end{equation*}

Lemma 4.4 The ex­pected height of a skiplist con­tain­ing n el­e­ments is at most \log \texttt{n} + 2.

Proof For each \texttt{r}\in\{1,2,3,\ldots,\infty\}, de­fine the in­di­ca­tor ran­dom vari­able \begin{equation*} I_{\texttt{r}} = \left\{\begin{array} 0 & \mbox{if \(L_{\texttt{r}}\) is empty} \\ 1 & \mbox{if \(L_{\texttt{r}}\) is non-empty} \end{array}\right. \end{equation*}
The height, h, of the skiplist is then given by \begin{equation*} \texttt{h} = \sum_{r=1}^\infty I_{\texttt{r}} \enspace . \end{equation*}
Note that I_{\texttt{r}} is never more than the length, |L_{\texttt{r}}|, of L_{\texttt{r}}, so \begin{equation*} \E[I_{\texttt{r}}] \le \E[|L_{\texttt{r}}|] = \texttt{n}/2^{\texttt{r}} \enspace . \end{equation*}
There­fore, we have \begin{align*} \E[\texttt{h}] &= \E\left[\sum_{r=1}^\infty I_{\texttt{r}}\right] \\ &= \sum_{\texttt{r}=1}^{\infty} E[I_{\texttt{r}}] \\ &= \sum_{\texttt{r}=1}^{\lfloor\log \texttt{n}\rfloor} E[I_{\texttt{r}}] + \sum_{r=\lfloor\log \texttt{n}\rfloor+1}^{\infty} E[I_{\texttt{r}}] \\ &\le \sum_{\texttt{r}=1}^{\lfloor\log \texttt{n}\rfloor} 1 + \sum_{r=\lfloor\log \texttt{n}\rfloor+1}^{\infty} \texttt{n}/2^{\texttt{r}} \\ &\le \log \texttt{n} + \sum_{\texttt{r}=0}^\infty 1/2^{\texttt{r}} \\ &= \log \texttt{n} + 2 \enspace . \end{align*}

Lemma 4.5 The ex­pected num­ber of nodes in a skiplist con­tain­ing \texttt{n} el­e­ments, in­clud­ing all oc­cur­rences of the sen­tinel, is 2\texttt{n}+O(\log \texttt{n}).

Proof By Lemma 4.3, the ex­pected num­ber of nodes, not in­clud­ing the sen­tinel, is 2\texttt{n}. The num­ber of oc­cur­rences of the sen­tinel is equal to the height, \texttt{h}, of the skiplist so, by Lemma 4.4 the ex­pected num­ber of oc­cur­rences of the sen­tinel is at most \log \texttt{n}+2 = O(\log \texttt{n}).

Lemma 4.6 The ex­pected length of a search path in a skiplist is at most 2\log \texttt{n} + O(1).

Proof The eas­i­est way to see this is to con­sider the re­verse search path for a node, x. This path starts at the pre­de­ces­sor of x in L_0. At any point in time, if the path can go up a level, then it does. If it can­not go up a level then it goes left. Think­ing about this for a few mo­ments will con­vince us that the re­verse search path for x is iden­ti­cal to the search path for x, ex­cept that it is re­versed.

The num­ber of nodes that the re­verse search path vis­its at a par­tic­u­lar level, r, is re­lated to the fol­low­ing ex­per­i­ment: Toss a coin. If the coin comes up as heads, then move up and stop. Oth­er­wise, move left and re­peat the ex­per­i­ment. The num­ber of coin tosses be­fore the heads rep­re­sents the num­ber of steps to the left that a re­verse search path takes at a par­tic­u­lar level.9Note that this might over­count the num­ber of steps to the left, since the ex­per­i­ment should end ei­ther at the first heads or when the search path reaches the sen­tinel, whichever comes first. This is not a prob­lem since the lemma is only stat­ing an upper bound. Lemma 4.2 tells us that the ex­pected num­ber of coin tosses be­fore the first heads is 1.

Let S_{\texttt{r}} de­note the num­ber of steps the for­ward search path takes at level \texttt{r} that go to the right. We have just ar­gued that \E[S_{\texttt{r}}]\le 1. Fur­ther­more, S_{\texttt{r}}\le |L_{\texttt{r}}|, since we can't take more steps in L_{\texttt{r}} than the length of L_{\texttt{r}}, so \begin{equation*} \E[S_{\texttt{r}}] \le \E[|L_{\texttt{r}}|] = \texttt{n}/2^{\texttt{r}} \enspace . \end{equation*}

We can now fin­ish as in the proof of Lemma 4.4. Let S be the length of the search path for some node, u, in a skiplist, and let \texttt{h} be the height of the skiplist. Then \begin{align*} \E[S] &= \E\left[ \texttt{h} + \sum_{\texttt{r}=0}^\infty S_{\texttt{r}} \right] \\ &= \E[\texttt{h}] + \sum_{\texttt{r}=0}^\infty \E[S_{\texttt{r}}] \\ &= \E[\texttt{h}] + \sum_{\texttt{r}=0}^{\lfloor\log \texttt{n}\rfloor} \E[S_{\texttt{r}}] + \sum_{\texttt{r}=\lfloor\log \texttt{n}\rfloor+1}^\infty \E[S_{\texttt{r}}] \\ &\le \E[\texttt{h}] + \sum_{\texttt{r}=0}^{\lfloor\log \texttt{n}\rfloor} 1 + \sum_{r=\lfloor\log \texttt{n}\rfloor+1}^\infty \texttt{n}/2^{\texttt{r}} \\ &\le \E[\texttt{h}] + \sum_{\texttt{r}=0}^{\lfloor\log \texttt{n}\rfloor} 1 + \sum_{\texttt{r}=0}^{\infty} 1/2^{\texttt{r}} \\ &\le \E[\texttt{h}] + \sum_{\texttt{r}=0}^{\lfloor\log \texttt{n}\rfloor} 1 + \sum_{\texttt{r}=0}^{\infty} 1/2^{\texttt{r}} \\ &\le \E[\texttt{h}] + \log \texttt{n} + 3 \\ &\le 2\log \texttt{n} + 5 \enspace . \end{align*}

The fol­low­ing the­o­rem sum­ma­rizes the re­sults in this sec­tion:

The­o­rem 4.3 A skiplist con­tain­ing \texttt{n} el­e­ments has ex­pected size O(\texttt{n}) and the ex­pected length of the search path for any par­tic­u­lar el­e­ment is at most 2\log \texttt{n} + O(1).

4.5 Dis­cus­sion and Ex­er­cises

Skiplists were in­tro­duced by Pugh [62] who also pre­sented a num­ber of ap­pli­ca­tions and ex­ten­sions of skiplists [61]. Since then they have been stud­ied ex­ten­sively. Sev­eral re­searchers have done very pre­cise analy­ses of the ex­pected length and vari­ance of the length of the search path for the ith el­e­ment in a skiplist [45,44,58]. De­ter­min­is­tic ver­sions [53], bi­ased ver­sions [8,26], and self-ad­just­ing ver­sions [12] of skiplists have all been de­vel­oped. Skiplist im­ple­men­ta­tions have been writ­ten for var­i­ous lan­guages and frame­works and have been used in open-source data­base sys­tems [71,63]. A vari­ant of skiplists is used in the HP-UX op­er­at­ing sys­tem ker­nel's process man­age­ment struc­tures [42]. Skiplists are even part of the Java 1.6 API [55].

Ex­er­cise 4.1 Il­lus­trate the search paths for 2.5 and 5.5 on the skiplist in Fig­ure 4.1.

Ex­er­cise 4.2 Il­lus­trate the ad­di­tion of the val­ues 0.5 (with a height of 1) and then 3.5 (with a height of 2) to the skiplist in Fig­ure 4.1.

Ex­er­cise 4.3 Il­lus­trate the re­moval of the val­ues 1 and then 3 from the skiplist in Fig­ure 4.1.

Ex­er­cise 4.4 Il­lus­trate the ex­e­cu­tion of remove(2) on the SkiplistList in Fig­ure 4.5.

Ex­er­cise 4.5 Il­lus­trate the ex­e­cu­tion of add(3,x) on the SkiplistList in Fig­ure 4.5. As­sume that pickHeight() se­lects a height of 4 for the newly cre­ated node.

Ex­er­cise 4.6 Show that, dur­ing an add(x) or a remove(x) op­er­a­tion, the ex­pected num­ber of point­ers in a SkiplistSet that get changed is con­stant.

Ex­er­cise 4.7 Sup­pose that, in­stead of pro­mot­ing an el­e­ment from L_{i-1} into L_i based on a coin toss, we pro­mote it with some prob­a­bil­ity p, 0 < p < 1.
  1. Show that, with this mod­i­fi­ca­tion, the ex­pected length of a search path is at most (1/p)\log_{1/p} \texttt{n} + O(1).
  2. What is the value of p that min­i­mizes the pre­ced­ing ex­pres­sion?
  3. What is the ex­pected height of the skiplist?
  4. What is the ex­pected num­ber of nodes in the skiplist?

Ex­er­cise 4.8 The find(x) method in a SkiplistSet some­times per­forms re­dun­dant com­par­isons; these occur when x is com­pared to the same value more than once. They can occur when, for some node, u, \texttt{u.next[r]} = \texttt{u.next[r-1]}. Show how these re­dun­dant com­par­isons hap­pen and mod­ify find(x) so that they are avoided. An­a­lyze the ex­pected num­ber of com­par­isons done by your mod­i­fied find(x) method.

Ex­er­cise 4.9 De­sign and im­ple­ment a ver­sion of a skiplist that im­ple­ments the SSet in­ter­face, but also al­lows fast ac­cess to el­e­ments by rank. That is, it also sup­ports the func­tion get(i), which re­turns the el­e­ment whose rank is i in O(\log \texttt{n}) ex­pected time. (The rank of an el­e­ment x in an SSet is the num­ber of el­e­ments in the SSet that are less than x.)

Ex­er­cise 4.10 A fin­ger in a skiplist is an array that stores the se­quence of nodes on a search path at which the search path goes down. (The vari­able stack in the add(x) code on page X is a fin­ger; the shaded nodes in Fig­ure 4.3 show the con­tents of the fin­ger.) One can think of a fin­ger as point­ing out the path to a node in the low­est list, L_0.

A fin­ger search im­ple­ments the find(x) op­er­a­tion using a fin­ger, by walk­ing up the list using the fin­ger until reach­ing a node u such that \texttt{u.x} < \texttt{x} and \texttt{u.next}=\texttt{null} or \texttt{u.next.x} > \texttt{x} and then per­form­ing a nor­mal search for x start­ing from u. It is pos­si­ble to prove that the ex­pected num­ber of steps re­quired for a fin­ger search is O(1+\log r), where r is the num­ber val­ues in L_0 be­tween x and the value pointed to by the fin­ger.

Im­ple­ment a sub­class of Skiplist called SkiplistWithFinger that im­ple­ments find(x) op­er­a­tions using an in­ter­nal fin­ger. This sub­class stores a fin­ger, which is then used so that every find(x) op­er­a­tion is im­ple­mented as a fin­ger search. Dur­ing each find(x) op­er­a­tion the fin­ger is up­dated so that each find(x) op­er­a­tion uses, as a start­ing point, a fin­ger that points to the re­sult of the pre­vi­ous find(x) op­er­a­tion.

Ex­er­cise 4.11 Write a method, truncate(i), that trun­cates a SkiplistList at po­si­tion i. After the ex­e­cu­tion of this method, the size of the list is i and it con­tains only the el­e­ments at in­dices 0,\ldots,\texttt{i}-1. The re­turn value is an­other SkiplistList that con­tains the el­e­ments at in­dices \texttt{i},\ldots,\texttt{n}-1. This method should run in O(\log \texttt{n}) time.

Ex­er­cise 4.12 Write a SkiplistList method, absorb(l2), that takes as an ar­gu­ment a SkiplistList, l2, emp­ties it and ap­pends its con­tents, in order, to the re­ceiver. For ex­am­ple, if l1 con­tains a,b,c and l2 con­tains d,e,f, then after call­ing l1.absorb(l2), l1 will con­tain a,b,c,d,e,f and l2 will be empty. This method should run in O(\log \texttt{n}) time.

Ex­er­cise 4.13 Using the ideas from the space-ef­fi­cient list, SEList, de­sign and im­ple­ment a space-ef­fi­cient SSet, SESSet. To do this, store the data, in order, in an SEList, and store the blocks of this SEList in an SSet. If the orig­i­nal SSet im­ple­men­ta­tion uses O(\texttt{n}) space to store n el­e­ments, then the SESSet will use enough space for n el­e­ments plus O(\texttt{n}/\texttt{b}+\texttt{b}) wasted space.

Ex­er­cise 4.14 Using an SSet as your un­der­ly­ing struc­ture, de­sign and im­ple­ment an ap­pli­ca­tion that reads a (large) text file and al­lows you to search, in­ter­ac­tively, for any sub­string con­tained in the text. As the user types their query, a match­ing part of the text (if any) should ap­pear as a re­sult.

Hint 1: Every sub­string is a pre­fix of some suf­fix, so it suf­fices to store all suf­fixes of the text file.

Hint 2: Any suf­fix can be rep­re­sented com­pactly as a sin­gle in­te­ger in­di­cat­ing where the suf­fix be­gins in the text.

Test your ap­pli­ca­tion on some large texts, such as some of the books avail­able at Pro­ject Guten­berg [1]. If done cor­rectly, your ap­pli­ca­tions will be very re­spon­sive; there should be no no­tice­able lag be­tween typ­ing key­strokes and see­ing the re­sults.

Ex­er­cise 4.15 (This ex­er­cise should be done after read­ing about bi­nary search trees, in Sec­tion 6.2.) Com­pare skiplists with bi­nary search trees in the fol­low­ing ways:
  1. Ex­plain how re­mov­ing some edges of a skiplist leads to a struc­ture that looks like a bi­nary tree and is sim­i­lar to a bi­nary search tree.
  2. Skiplists and bi­nary search trees each use about the same num­ber of point­ers (2 per node). Skiplists make bet­ter use of those point­ers, though. Ex­plain why.