4.5 Discussion and Exercises

Skiplists were introduced by Pugh [61], who also presented a number of applications of skiplists [60]. Since then they have been studied extensively. Several researchers have carried out very precise analyses of the expected length, and the variance in length, of the search path for the $\mathtt{i}$th element in a skiplist [45,44,58]. Deterministic versions [53], biased versions [8,26], and self-adjusting versions [12] of skiplists have all been developed. Skiplist implementations have been written for various languages and frameworks and have seen use in open-source database systems [69,63]. A variant of skiplists is used in the HP-UX operating system kernel's process management structures [42]. Skiplists are even part of the Java 1.6 API [55].

Exercise 4.7 Suppose that, instead of promoting an element from $L_{i-1}$ into $L_i$ based on a coin toss, we promote it with some probability $p$, $0 < p < 1$.

Show that the expected length of a search path is at most $(1/p)\log_{1/p} \mathtt{n} + O(1)$.
What is the value of $p$ that minimizes the preceding expression?
What is the expected height of the skiplist?
What is the expected number of nodes in the skiplist?
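
Answers to the last two questions can be checked empirically before (or after) proving them. The following is a minimal simulation sketch, not part of the original text: the function names `random_height` and `simulate`, the trial count, and the seed are all illustrative assumptions. It assumes that each element is promoted independently with probability $p$ and that a "node" means one occurrence of an element in one list, not counting the sentinel.

```python
import random

def random_height(p, rng):
    """Number of promotions of one element: promote with probability p until the first failure."""
    h = 0
    while rng.random() < p:
        h += 1
    return h

def simulate(n, p, trials=200, seed=1):
    """Estimate the average height and node count of a skiplist with n elements."""
    rng = random.Random(seed)
    total_height = 0
    total_nodes = 0
    for _ in range(trials):
        heights = [random_height(p, rng) for _ in range(n)]
        # The height of the skiplist is the height of its tallest element.
        total_height += max(heights)
        # An element of height h occurs in L_0, ..., L_h, that is, in h + 1 lists.
        total_nodes += sum(h + 1 for h in heights)
    return total_height / trials, total_nodes / trials

if __name__ == "__main__":
    for p in (0.25, 0.5, 0.75):
        avg_height, avg_nodes = simulate(1000, p)
        print(f"p={p}: average height {avg_height:.2f}, average nodes {avg_nodes:.1f}")
```

Comparing the printed averages for several values of $p$ and $\mathtt{n}$ against a conjectured formula is a quick sanity check, but it is no substitute for the analysis the exercise asks for.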

Exercise 4.8 The $\mathtt{find(x)}$ method in a SkiplistSet sometimes performs redundant comparisons; these occur when $\mathtt{x}$ is compared to the same value more than once. They can occur when, for some node $\mathtt{u}$, $\mathtt{u.next[r]} = \mathtt{u.next[r-1]}$. Show how these redundant comparisons happen and modify $\mathtt{find(x)}$ so that they are avoided. Analyze the expected number of comparisons done by your modified $\mathtt{find(x)}$ method.

Exercise 4.9 Design and implement a version of a skiplist that implements the SSet interface, but also allows fast access to elements by rank. That is, it also supports the function $\mathtt{get(i)}$, which returns the element whose rank is $\mathtt{i}$ in $O(\log \mathtt{n})$ expected time. (The rank of an element $\mathtt{x}$ in an SSet is the number of elements in the SSet that are less than $\mathtt{x}$.)
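
The definition of rank can be made concrete with a small sketch. This is not a skiplist and not a solution to the exercise: it uses a sorted Python list as a stand-in for the SSet, and the helper names `rank` and `get` are chosen here only to mirror the text. A sorted array gives fast access by rank but makes insertion take linear time, which is why the exercise asks for a skiplist instead.

```python
from bisect import bisect_left

def rank(sorted_items, x):
    """Number of elements in sorted_items that are less than x."""
    return bisect_left(sorted_items, x)

def get(sorted_items, i):
    """Element whose rank is i, that is, the i-th smallest element (0-based)."""
    return sorted_items[i]

items = [3, 8, 13, 21, 34]
print(rank(items, 13))  # 2, because 3 and 8 are less than 13
print(get(items, 2))    # 13
```

A rank-indexed skiplist should agree with this model on every input, so it can serve as a reference when testing.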

Exercise 4.10 A finger in a skiplist is an array that stores the sequence of nodes on a search path at which the search path goes down. (The variable $\mathtt{stack}$ in the $\mathtt{add(x)}$ code is a finger; the shaded nodes in Figure 4.3 show the contents of the finger.) One can think of a finger as pointing out the path to a node in the lowest list, $L_0$.

A finger search implements the $\mathtt{find(x)}$ operation using a finger, by walking up the list using the finger until reaching a node $\mathtt{u}$ such that $\mathtt{u.x} < \mathtt{x}$ and either $\mathtt{u.next}=\mathtt{null}$ or $\mathtt{u.next.x} > \mathtt{x}$, and then performing a normal search for $\mathtt{x}$ starting from $\mathtt{u}$. It is possible to prove that the expected number of steps required for a finger search is $O(1+\log r)$, where $r$ is the number of values between $\mathtt{x}$ and the value pointed to by the finger.

Implement a subclass of Skiplist called SkiplistWithFinger that does all $\mathtt{find(x)}$ operations using an internal finger. This class stores a finger and does every search as a finger search. During the search it also updates the finger so that each $\mathtt{find(x)}$ operation uses, as a starting point, a finger that points to the result of the previous $\mathtt{find(x)}$ operation.

Exercise 4.11 Write a method, $\mathtt{truncate(i)}$, that truncates a SkiplistList at position $\mathtt{i}$. After the execution of this method, the size of the list is $\mathtt{i}$ and it contains only the elements at indices $0,\ldots,\mathtt{i}-1$. The return value is another SkiplistList that contains the elements at indices $\mathtt{i},\ldots,\mathtt{n}-1$. This method should run in $O(\log \mathtt{n})$ time.

Exercise 4.12 Write a SkiplistList method, $\mathtt{absorb(l2)}$, that takes as an argument a SkiplistList, $\mathtt{l2}$, empties it and appends its contents, in order, to the receiver. For example, if $\mathtt{l1}$ contains $a,b,c$ and $\mathtt{l2}$ contains $d,e,f$, then after calling $\mathtt{l1.absorb(l2)}$, $\mathtt{l1}$ will contain $a,b,c,d,e,f$ and $\mathtt{l2}$ will be empty. This method should run in $O(\log \mathtt{n})$ time.
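
The intended behaviour of $\mathtt{truncate(i)}$ and $\mathtt{absorb(l2)}$ can be pinned down with a reference model. The sketch below is an assumption-laden illustration, not a solution: it uses plain Python lists, runs in linear rather than logarithmic time, and the sample values are chosen only for illustration. Its purpose is to give something to test a SkiplistList implementation against.

```python
def truncate(lst, i):
    """Keep the elements at indices 0..i-1 in lst and return the removed tail."""
    tail = lst[i:]
    del lst[i:]
    return tail

def absorb(l1, l2):
    """Append the contents of l2 to l1, in order, and leave l2 empty."""
    l1.extend(l2)
    l2.clear()

l1 = ["a", "b", "c"]
l2 = ["d", "e", "f"]
absorb(l1, l2)
print(l1, l2)    # ['a', 'b', 'c', 'd', 'e', 'f'] []
tail = truncate(l1, 2)
print(l1, tail)  # ['a', 'b'] ['c', 'd', 'e', 'f']
```

Note that the two operations are inverses of each other: truncating a list at position $\mathtt{i}$ and then absorbing the returned tail restores the original list.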

Exercise 4.13 Using the ideas from the space-efficient linked list, SEList, design and implement a space-efficient SSet, SESSet. Do this by storing the data, in order, in an SEList and then storing the blocks of this SEList in an SSet. If the original SSet implementation uses $O(\mathtt{n})$ space to store $\mathtt{n}$ elements, then the SESSet will use enough space for $\mathtt{n}$ elements plus $O(\mathtt{n}/\mathtt{b}+\mathtt{b})$ wasted space.

Exercise 4.14 Using an SSet as your underlying structure, design and implement an application that reads a (large) text file and allows you to search, interactively, for any substring contained in the text. As the user types their query, a matching part of the text (if any) should appear as a result.

Hint 1: Every substring is a prefix of some suffix, so it suffices to store all suffixes of the text file.

Hint 2: Any suffix can be represented compactly as a single integer indicating where the suffix begins in the text.

Test your application on some large texts, such as the books available at Project Gutenberg [1]. If done correctly, your application will be very responsive; there should be no noticeable lag between typing keystrokes and the results appearing.
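
The two hints can be illustrated with a short sketch. It is only a sketch under stated assumptions: a sorted Python list of integers stands in for the SSet, the names `build_suffix_index` and `find_substring` are hypothetical, and the sort key copies every suffix, which is far too expensive for a large text. An actual solution would store the integers in an SSet and compare suffixes in place.

```python
def build_suffix_index(text):
    """Return the start positions of all suffixes of text, sorted by suffix."""
    # Hint 2: each suffix is represented by the integer at which it begins.
    # Copying text[i:] is for brevity only; it costs quadratic space overall.
    return sorted(range(len(text)), key=lambda i: text[i:])

def find_substring(text, index, query):
    """Return a position at which query occurs in text, or -1 if there is none."""
    lo, hi = 0, len(index)
    # Binary search for the first suffix whose leading len(query) characters
    # are not less than query.
    while lo < hi:
        mid = (lo + hi) // 2
        start = index[mid]
        if text[start:start + len(query)] < query:
            lo = mid + 1
        else:
            hi = mid
    # Hint 1: query occurs in text exactly when it is a prefix of some suffix.
    if lo < len(index) and text.startswith(query, index[lo]):
        return index[lo]
    return -1

text = "banana"
index = build_suffix_index(text)
print(index)                               # [5, 3, 1, 0, 4, 2]
print(find_substring(text, index, "nan"))  # 2
print(find_substring(text, index, "nab"))  # -1
```

Because each keystroke only extends the query, an interactive version can repeat this search on every keystroke; each search inspects a logarithmic number of suffixes.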

Exercise 4.15 (This exercise is to be done after reading about binary search trees, in Section 6.2.) Compare skiplists with binary search trees in the following ways:

Explain how removing some edges of a skiplist leads to a structure that looks like a binary tree and that is similar to a binary search tree.
Skiplists and binary search trees each use about the same number of pointers (2 per node). Skiplists make better use of those pointers, though. Explain why.