4.4 Analysis of Skiplists

In this section, we analyze the expected height, size, and length of the search path in a skiplist. This section requires a background in basic probability. Several proofs are based on the following basic observation about coin tosses.

Lemma 4.2. Let $T$ be the number of times a fair coin is tossed up to and including the first time the coin comes up heads. Then $\mathrm{E}[T]=2$.

Proof. Suppose we stop tossing the coin the first time it comes up heads. Define the indicator variable

$\displaystyle I_{i} = \left\{\begin{array}{ll} 0 & \mbox{if the coin is tossed fewer than $i$ times} \\ 1 & \mbox{if the coin is tossed $i$ or more times} \end{array}\right.$

Note that $I_i=1$ if and only if the first $i-1$ coin tosses are tails, so $\mathrm{E}[I_i]=\Pr\{I_i=1\}=1/2^{i-1}$. Observe that $T$, the total number of coin tosses, can be written as $T=\sum_{i=1}^{\infty} I_i$. Therefore,

$\displaystyle \mathrm{E}[T]$	$\displaystyle = \mathrm{E}\left[\sum_{i=1}^\infty I_i\right]$
	$\displaystyle = \sum_{i=1}^\infty \mathrm{E}\left[I_i\right]$
	$\displaystyle = \sum_{i=1}^\infty 1/2^{i-1}$
	$\displaystyle = 1 + 1/2 + 1/4 + 1/8 + \cdots$
	$\displaystyle = 2 \enspace .$

$\qedsymbol$
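
The coin-tossing bound can be checked empirically. The following is a minimal sketch in Python, not part of the skiplist implementation; the function names, the number of trials, and the seed are all illustrative assumptions. It repeats the experiment from the proof many times and averages the number of tosses, which should come out close to 2.

```python
import random


def tosses_until_heads(rng):
    """Toss a fair coin until it comes up heads; return the total number of tosses."""
    tosses = 1
    # rng.random() < 0.5 models "tails": keep tossing.
    while rng.random() < 0.5:
        tosses += 1
    return tosses


def estimate_expected_tosses(trials=200_000, seed=1):
    """Average the number of tosses over many independent experiments."""
    rng = random.Random(seed)  # hypothetical fixed seed, for repeatability
    return sum(tosses_until_heads(rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print(estimate_expected_tosses())  # should be close to 2
```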

Lemma 4.3. The expected number of nodes in a skiplist containing $n$ elements, not including occurrences of the sentinel, is $2n$.

Proof. The probability that any particular element, $x$, is included in list $L_r$ is $1/2^r$, so the expected number of nodes in $L_r$ is $n/2^r$.^4.2 Therefore, the total expected number of nodes in all lists is

$\displaystyle \sum_{r=0}^\infty n/2^r = n(1 + 1/2 + 1/4 + 1/8 + \cdots) = 2n \enspace .$

$\qedsymbol$
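
As a rough numerical check of the $2n$ bound, the sketch below draws a random height for each of $n$ elements and counts the nodes this would create across all lists. It assumes heights are chosen by repeated fair coin tosses, as in the analysis above; the names `pick_height`, `total_nodes`, and `average_total_nodes`, and the default parameters, are hypothetical.

```python
import random


def pick_height(rng):
    """Return k such that the element appears in lists L_0, ..., L_k.

    Each further promotion happens with probability 1/2, so the element
    is in L_r with probability 1/2^r.
    """
    k = 0
    while rng.random() < 0.5:
        k += 1
    return k


def total_nodes(n, rng):
    """Count the nodes over all lists for n elements (sentinel excluded)."""
    # An element of height k contributes one node to each of L_0..L_k.
    return sum(pick_height(rng) + 1 for _ in range(n))


def average_total_nodes(n=1000, trials=200, seed=2):
    """Average the node count over several randomly built skiplists."""
    rng = random.Random(seed)  # hypothetical fixed seed, for repeatability
    return sum(total_nodes(n, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    n = 1000
    print(average_total_nodes(n) / n)  # should be close to 2
```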

Lemma 4.4. The expected height of a skiplist containing $n$ elements is at most $\log n + 2$.

Proof. For each $r\in\{1,2,3,\ldots,\infty\}$, define the indicator random variable

$\displaystyle I_{r} = \left\{\begin{array}{ll} 0 & \mbox{if $L_{r}$ is empty} \\ 1 & \mbox{if $L_{r}$ is non-empty} \end{array}\right.$

The height, $h$, of the skiplist is then given by

$\displaystyle h = \sum_{r=1}^\infty I_{r} \enspace .$

Note that $I_{r}$ is never more than the length, $\vert L_{r}\vert$, of $L_{r}$, so

$\displaystyle \mathrm{E}[I_{r}] \le \mathrm{E}[\vert L_{r}\vert] = n/2^{r} \enspace .$

Therefore, we have

$\displaystyle \mathrm{E}[h]$	$\displaystyle = \mathrm{E}\left[\sum_{r=1}^\infty I_{r}\right]$
	$\displaystyle = \sum_{r=1}^{\infty} \mathrm{E}[I_{r}]$
	$\displaystyle = \sum_{r=1}^{\lfloor\log n\rfloor} \mathrm{E}[I_{r}] + \sum_{r=\lfloor\log n\rfloor+1}^{\infty} \mathrm{E}[I_{r}]$
	$\displaystyle \le \sum_{r=1}^{\lfloor\log n\rfloor} 1 + \sum_{r=\lfloor\log n\rfloor+1}^{\infty} n/2^{r}$
	$\displaystyle \le \log n + \sum_{r=0}^\infty 1/2^{r}$
	$\displaystyle = \log n + 2 \enspace .$

$\qedsymbol$
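
The height bound can also be compared against a simulation. The sketch below assumes the same coin-tossing rule for heights and takes the skiplist height to be the largest $r$ for which $L_r$ is non-empty; the names `pick_height`, `skiplist_height`, and `average_height`, and the default parameters, are illustrative assumptions rather than part of any implementation.

```python
import math
import random


def pick_height(rng):
    """Return k such that the element appears in lists L_0, ..., L_k."""
    k = 0
    while rng.random() < 0.5:
        k += 1
    return k


def skiplist_height(n, rng):
    """Height h = number of non-empty lists L_r with r >= 1.

    Lists are nested, so this equals the largest height of any element
    (0 for an empty skiplist).
    """
    return max((pick_height(rng) for _ in range(n)), default=0)


def average_height(n=1024, trials=500, seed=3):
    """Average the height over several randomly built skiplists."""
    rng = random.Random(seed)  # hypothetical fixed seed, for repeatability
    return sum(skiplist_height(n, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    n = 1024
    print(average_height(n), "vs bound", math.log2(n) + 2)
```

For $n=1024$ the bound is $\log n + 2 = 12$; the simulated average should fall below it.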

Lemma 4.5. The expected length of a search path in a skiplist is at most $2\log n + O(1)$.

Proof. The easiest way to see this is to consider the reverse search path for a node, $x$. This path starts at the predecessor of $x$ in $L_0$. At any point in time, if the path can go up a level, then it does. If it cannot go up a level, then it goes left. Thinking about this for a few moments will convince us that the reverse search path for $x$ is identical to the search path for $x$, except that it is reversed.

The number of nodes that the reverse search path visits at a particular level, $r$, is related to the following experiment: Toss a coin. If the coin comes up as heads, then move up and stop. Otherwise, move left and repeat the experiment. The number of coin tosses before the heads represents the number of steps to the left that a reverse search path takes at a particular level.^4.3 Lemma 4.2 tells us that the expected number of coin tosses up to and including the first heads is 2, so the expected number of coin tosses before the first heads is 1.

Let $S_{r}$ denote the number of steps the forward search path takes at level $r$ that go to the right. We have just argued that $\mathrm{E}[S_{r}]\le 1$. Furthermore, $S_{r}\le \vert L_{r}\vert$, since we can't take more steps in $L_{r}$ than the length of $L_{r}$, so

$\displaystyle \mathrm{E}[S_{r}] \le \mathrm{E}[\vert L_{r}\vert] = n/2^{r} \enspace .$

We can now finish as in the proof of Lemma 4.4. Let $S$ be the length of the search path for some node, $u$, in a skiplist, and let $h$ be the height of the skiplist. Then

$\displaystyle \mathrm{E}[S]$	$\displaystyle = \mathrm{E}\left[ h + \sum_{r=0}^\infty S_{r} \right]$
	$\displaystyle = \mathrm{E}[h] + \sum_{r=0}^\infty \mathrm{E}[S_{r}]$
	$\displaystyle = \mathrm{E}[h] + \sum_{r=0}^{\lfloor\log n\rfloor} \mathrm{E}[S_{r}] + \sum_{r=\lfloor\log n\rfloor+1}^\infty \mathrm{E}[S_{r}]$
	$\displaystyle \le \mathrm{E}[h] + \sum_{r=0}^{\lfloor\log n\rfloor} 1 + \sum_{r=\lfloor\log n\rfloor+1}^\infty n/2^{r}$
	$\displaystyle \le \mathrm{E}[h] + \sum_{r=0}^{\lfloor\log n\rfloor} 1 + \sum_{r=0}^{\infty} 1/2^{r}$
	$\displaystyle \le \mathrm{E}[h] + \log n + 3$
	$\displaystyle \le 2\log n + 5 \enspace .$

$\qedsymbol$
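
The reverse search path argument can be traced directly in code. The sketch below is a simplified model, not a skiplist implementation: it represents a skiplist only by a list of element heights, treats index $-1$ as the sentinel (assumed to be present at every level up to the height $h$), and counts the up and left steps of the reverse path. The names `pick_height`, `search_path_length`, and `average_path_length`, and the default parameters, are hypothetical.

```python
import math
import random


def pick_height(rng):
    """Return k such that the element appears in lists L_0, ..., L_k."""
    k = 0
    while rng.random() < 0.5:
        k += 1
    return k


def search_path_length(heights, j):
    """Length of the search path for the element at index j.

    Follows the reverse search path: start at the predecessor of j in
    L_0, go up whenever possible, otherwise go left within the current
    list, and stop at the sentinel at the top level.
    """
    h = max(heights, default=0)  # skiplist height
    i, r, steps = j - 1, 0, 0    # i == -1 denotes the sentinel
    while True:
        top = h if i < 0 else heights[i]
        if top > r:
            r += 1               # the current node is also in L_{r+1}: go up
            steps += 1
        elif i < 0:
            return steps         # sentinel at level h: start of the search path
        else:
            i -= 1               # go left to the previous node in L_r
            while i >= 0 and heights[i] < r:
                i -= 1
            steps += 1


def average_path_length(n=1024, trials=300, seed=4):
    """Average search path length for a random target in random skiplists."""
    rng = random.Random(seed)  # hypothetical fixed seed, for repeatability
    total = 0
    for _ in range(trials):
        heights = [pick_height(rng) for _ in range(n)]
        total += search_path_length(heights, rng.randrange(n))
    return total / trials


if __name__ == "__main__":
    n = 1024
    print(average_path_length(n), "vs bound", 2 * math.log2(n) + 5)
```

For $n=1024$ the bound from the proof is $2\log n + 5 = 25$. Modelling the lists as a flat array of heights keeps the sketch short, at the cost of a linear scan when moving left; a real skiplist follows pointers instead.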


Footnotes