Subsections


5.1 ChainedHashTable: Hashing with Chaining

A ChainedHashTable data structure uses hashing with chaining to store data as an array, $ \ensuremath{\ensuremath{\mathit{t}}}$, of lists. An integer, $ \ensuremath{\ensuremath{\mathit{n}}}$, keeps track of the total number of items in all lists (see Figure 5.1):
\begin{leftbar}
\begin{flushleft}
\hspace*{1em} \ensuremath{\mathrm{initialize}(...
...th{\ensuremath{\mathit{n}} \gets \ensuremath{0}}\\
\end{flushleft}\end{leftbar}

Figure 5.1: An example of a ChainedHashTable with $ \ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}=14$ and $ \ensuremath{\ensuremath{\mathrm{length}(\ensuremath{\mathit{t}})}}=16$. In this example $ \ensuremath{\ensuremath{\mathrm{hash}(\ensuremath{\mathit{x}})}}=6$
\includegraphics[width=\textwidth ]{figs-python/chainedhashtable}
The hash value of a data item $ \ensuremath{\ensuremath{\mathit{x}}}$, denoted $ \ensuremath{\mathrm{hash}(\ensuremath{\mathit{x}})}$ is a value in the range $ \{0,\ldots,\ensuremath{\ensuremath{\mathrm{length}(\ensuremath{\mathit{t}})}}-1\}$. All items with hash value $ \ensuremath{\ensuremath{\mathit{i}}}$ are stored in the list at $ \ensuremath{\ensuremath{\mathit{t}}[\ensuremath{\mathit{i}}]}$. To ensure that lists don't get too long, we maintain the invariant

$\displaystyle \ensuremath{\ensuremath{\ensuremath{\mathit{n}}}} \le \ensuremath{\ensuremath{\mathrm{length}(\ensuremath{\mathit{t}})}}
$

so that the average number of elements stored in one of these lists is $ \ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}/\ensuremath{\ensuremath{\mathrm{length}(\ensuremath{\mathit{t}})}} \le 1$.

To add an element, $ \ensuremath{\ensuremath{\mathit{x}}}$, to the hash table, we first check if the length of $ \ensuremath{\ensuremath{\mathit{t}}}$ needs to be increased and, if so, we grow $ \ensuremath{\ensuremath{\mathit{t}}}$. With this out of the way we hash $ \ensuremath{\ensuremath{\mathit{x}}}$ to get an integer, $ \ensuremath{\ensuremath{\mathit{i}}}$, in the range $ \{0,\ldots,\ensuremath{\ensuremath{\mathrm{length}(\ensuremath{\mathit{t}})}}-1\}$, and we append $ \ensuremath{\ensuremath{\mathit{x}}}$ to the list $ \ensuremath{\ensuremath{\mathit{t}}[\ensuremath{\mathit{i}}]}$:
\begin{leftbar}
\begin{flushleft}
\hspace*{1em} \ensuremath{\mathrm{add}(\ensure...
...return}} \ensuremath{\ensuremath{\mathit{true}}}\\
\end{flushleft}\end{leftbar}
Growing the table, if necessary, involves doubling the length of $ \ensuremath{\ensuremath{\mathit{t}}}$ and reinserting all elements into the new table. This strategy is exactly the same as the one used in the implementation of ArrayStack and the same result applies: The cost of growing is only constant when amortized over a sequence of insertions (see Lemma 2.1 on page [*]).

Besides growing, the only other work done when adding a new value $ \ensuremath{\ensuremath{\mathit{x}}}$ to a ChainedHashTable involves appending $ \ensuremath{\ensuremath{\mathit{x}}}$ to the list $ \ensuremath{\ensuremath{\mathit{t}}[\mathrm{hash}(\ensuremath{\mathit{x}})]}$. For any of the list implementations described in Chapters 2 or 3, this takes only constant time.

To remove an element, $ \ensuremath{\ensuremath{\mathit{x}}}$, from the hash table, we iterate over the list $ \ensuremath{\ensuremath{\mathit{t}}[\mathrm{hash}(\ensuremath{\mathit{x}})]}$ until we find $ \ensuremath{\ensuremath{\mathit{x}}}$ so that we can remove it:
\begin{leftbar}
\begin{flushleft}
\hspace*{1em} \ensuremath{\mathrm{remove}(\ens...
...\color{black} \textbf{return}} \ensuremath{nil} \\
\end{flushleft}\end{leftbar}
This takes $ O(\ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}_{\ensuremath{\ensuremath{\mathrm{hash}(\ensuremath{\mathit{x}})}}})$ time, where $ \ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}_{\ensuremath{\ensuremath{\ensuremath{\mathit{i}}}}}$ denotes the length of the list stored at $ \ensuremath{\ensuremath{\mathit{t}}[\ensuremath{\mathit{i}}]}$.

Searching for the element $ \ensuremath{\ensuremath{\mathit{x}}}$ in a hash table is similar. We perform a linear search on the list $ \ensuremath{\ensuremath{\mathit{t}}[\mathrm{hash}(\ensuremath{\mathit{x}})]}$:
\begin{leftbar}
\begin{flushleft}
\hspace*{1em} \ensuremath{\mathrm{find}(\ensur...
...{return}} \ensuremath{\ensuremath{\mathit{nil}}}\\
\end{flushleft}\end{leftbar}
Again, this takes time proportional to the length of the list $ \ensuremath{\ensuremath{\mathit{t}}[\mathrm{hash}(\ensuremath{\mathit{x}})]}$.

The performance of a hash table depends critically on the choice of the hash function. A good hash function will spread the elements evenly among the $ \ensuremath{\mathrm{length}(\ensuremath{\mathit{t}})}$ lists, so that the expected size of the list $ \ensuremath{\ensuremath{\mathit{t}}[\mathrm{hash}(\ensuremath{\mathit{x}})]}$ is $ O(\ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}/\ensuremath{\ensuremath{\mathrm{length}(\ensuremath{\mathit{t}})})} = O(1)$. On the other hand, a bad hash function will hash all values (including $ \ensuremath{\ensuremath{\mathit{x}}}$) to the same table location, in which case the size of the list $ \ensuremath{\ensuremath{\mathit{t}}[\mathrm{hash}(\ensuremath{\mathit{x}})]}$ will be $ \ensuremath{\ensuremath{\mathit{n}}}$. In the next section we describe a good hash function.


5.1.1 Multiplicative Hashing

Multiplicative hashing is an efficient method of generating hash values based on modular arithmetic (discussed in Section 2.3) and integer division. It uses the $ \ddiv $ operator, which calculates the integral part of a quotient, while discarding the remainder. Formally, for any integers $ a\ge 0$ and $ b\ge 1$, $ a\ddiv b = \lfloor
a/b\rfloor$.

In multiplicative hashing, we use a hash table of size $ 2^{\ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}}$ for some integer $ \ensuremath{\ensuremath{\mathit{d}}}$ (called the dimension). The formula for hashing an integer $ \ensuremath{\ensuremath{\ensuremath{\mathit{x}}}}\in\{0,\ldots,2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}-1\}$ is

$\displaystyle \ensuremath{\ensuremath{\mathrm{hash}(\ensuremath{\mathit{x}})}} ...
...th{\mathit{w}}}}-\ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}} \enspace .
$

Here, $ \ensuremath{\ensuremath{\mathit{z}}}$ is a randomly chosen odd integer in $ \{1,\ldots,2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}-1\}$. This hash function can be realized very efficiently by observing that, by default, operations on integers are already done modulo $ 2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}$ where $ \ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}$ is the number of bits in an integer.5.1 (See Figure 5.2.) Furthermore, integer division by $ 2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}-\ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}}$ is equivalent to dropping the rightmost $ \ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}-\ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}$ bits in a binary representation (which is implemented by shifting the bits right by $ \ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}-\ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}$ using the $ \ensuremath{\gg}$ operator).
\begin{leftbar}
\begin{flushleft}
\hspace*{1em} \ensuremath{\mathrm{hash}(\ensur...
...ensuremath{\mathit{w}}-\ensuremath{\mathit{d}})}\\
\end{flushleft}\end{leftbar}

Figure 5.2: The operation of the multiplicative hash function with $ \ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}=32$ and $ \ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}=8$.
\begin{figure}\begin{center}
\resizebox{.98\textwidth}{!}{
\setlength{\arrayru...
... \end{tabular}}
\setlength{\arrayrulewidth}{.4pt}
\end{center}
\end{figure}

The following lemma, whose proof is deferred until later in this section, shows that multiplicative hashing does a good job of avoiding collisions:

Lemma 5..1   Let $ \ensuremath{\ensuremath{\mathit{x}}}$ and $ \ensuremath{\ensuremath{\mathit{y}}}$ be any two values in $ \{0,\ldots,2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}-1\}$ with $ \ensuremath{\ensuremath{\ensuremath{\mathit{x}}}}\neq \ensuremath{\ensuremath{\ensuremath{\mathit{y}}}}$. Then $ \Pr\{\ensuremath{\ensuremath{\mathrm{hash}(\ensuremath{\mathit{x}})}}=\ensurem...
...th{\mathit{y}})}}\} \le 2/2^{\ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}}$.

With Lemma 5.1, the performance of $ \ensuremath{\mathrm{remove}(\ensuremath{\mathit{x}})}$, and $ \ensuremath{\mathrm{find}(\ensuremath{\mathit{x}})}$ are easy to analyze:

Lemma 5..2   For any data value $ \ensuremath{\ensuremath{\mathit{x}}}$, the expected length of the list $ \ensuremath{\ensuremath{\mathit{t}}[\mathrm{hash}(\ensuremath{\mathit{x}})]}$ is at most $ \ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}_{\ensuremath{\ensuremath{\ensuremath{\mathit{x}}}}} + 2$, where $ \ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}_{\ensuremath{\ensuremath{\ensuremath{\mathit{x}}}}}$ is the number of occurrences of $ \ensuremath{\ensuremath{\mathit{x}}}$ in the hash table.

Proof. Let $ S$ be the (multi-)set of elements stored in the hash table that are not equal to $ \ensuremath{\ensuremath{\mathit{x}}}$. For an element $ \ensuremath{\ensuremath{\ensuremath{\mathit{y}}}}\in S$, define the indicator variable

$\displaystyle I_{\ensuremath{\ensuremath{\ensuremath{\mathit{y}}}}} = \left\{\b...
...h}(\ensuremath{\mathit{y}})}}$} \\
0 & \mbox{otherwise}
\end{array}\right.
$

and notice that, by Lemma 5.1, $ \mathrm{E}[I_{\ensuremath{\ensuremath{\ensuremath{\mathit{y}}}}}] \le
2/2^{\e...
...hit{d}}}}}=2/\ensuremath{\ensuremath{\mathrm{length}(\ensuremath{\mathit{t}})}}$. The expected length of the list $ \ensuremath{\ensuremath{\mathit{t}}[\mathrm{hash}(\ensuremath{\mathit{x}})]}$ is given by
$\displaystyle \mathrm{E}\left[\ensuremath{\ensuremath{\ensuremath{\mathit{t}}[\mathrm{hash}(\ensuremath{\mathit{x}})].\mathrm{size}()}}\right]$ $\displaystyle =$ $\displaystyle \mathrm{E}\left[\ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}...
...mathit{y}}}}\in S} I_{\ensuremath{\ensuremath{\ensuremath{\mathit{y}}}}}\right]$  
  $\displaystyle =$ $\displaystyle \ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}_{\ensuremath{\e...
...{y}}}}\in S} \mathrm{E}[I_{\ensuremath{\ensuremath{\ensuremath{\mathit{y}}}}} ]$  
  $\displaystyle \le$ $\displaystyle \ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}_{\ensuremath{\e...
...}}}}\in S} 2/\ensuremath{\ensuremath{\mathrm{length}(\ensuremath{\mathit{t}})}}$  
  $\displaystyle \le$ $\displaystyle \ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}_{\ensuremath{\e...
...uremath{\mathit{y}}}}\in S} 2/\ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}$  
  $\displaystyle \le$ $\displaystyle \ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}_{\ensuremath{\e...
...{\ensuremath{\mathit{x}}}}})2/\ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}$  
  $\displaystyle \le$ $\displaystyle \ensuremath{\ensuremath{\ensuremath{\mathit{n}}}}_{\ensuremath{\ensuremath{\ensuremath{\mathit{x}}}}} + 2 \enspace ,$  

as required. $ \qedsymbol$

Now, we want to prove Lemma 5.1, but first we need a result from number theory. In the following proof, we use the notation $ (b_r,\ldots,b_0)_2$ to denote $ \sum_{i=0}^r b_i2^i$, where each $ b_i$ is a bit, either 0 or 1. In other words, $ (b_r,\ldots,b_0)_2$ is the integer whose binary representation is given by $ b_r,\ldots,b_0$. We use $ \star$ to denote a bit of unknown value.

Lemma 5..3   Let $ S$ be the set of odd integers in $ \{1,\ldots,2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}-1\}$; let $ q$ and $ i$ be any two elements in $ S$. Then there is exactly one value $ \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}\in S$ such that $ \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}q\bmod 2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}} = i$.

Proof. Since the number of choices for $ \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}$ and $ i$ is the same, it is sufficient to prove that there is at most one value $ \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}\in S$ that satisfies $ \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}q\bmod 2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}} = i$.

Suppose, for the sake of contradiction, that there are two such values $ \ensuremath{\ensuremath{\mathit{z}}}$ and $ \ensuremath{z}'$, with $ \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}>\ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}'$. Then

$\displaystyle \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}q\bmod 2^{\ensur...
...athit{z}}}}'q \bmod 2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}} = i
$

So

$\displaystyle (\ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}-\ensuremath{\e...
...thit{z}}}}')q\bmod 2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}} = 0
$

But this means that

$\displaystyle (\ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}-\ensuremath{\e...
...math{\mathit{z}}}}')q = k 2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}$ (5.1)

for some integer $ k$. Thinking in terms of binary numbers, we have

$\displaystyle (\ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}-\ensuremath{\e...
...0,\ldots,0}_{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}})_2 \enspace ,
$

so that the $ \ensuremath{\ensuremath{\mathit{w}}}$ trailing bits in the binary representation of $ (\ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}-\ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}')q$ are all 0's.

Furthermore $ k\neq 0$, since $ q\neq 0$ and $ \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}-\ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}'\neq 0$. Since $ q$ is odd, it has no trailing 0's in its binary representation:

$\displaystyle q = (\star,\ldots,\star,1)_2 \enspace .
$

Since $ \vert\ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}-\ensuremath{\ensuremath...
...ath{\mathit{z}}}}'\vert < 2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}$, $ \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}-\ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}'$ has fewer than $ \ensuremath{\ensuremath{\mathit{w}}}$ trailing 0's in its binary representation:

$\displaystyle \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}-\ensuremath{\en...
...\ldots,0}_{<\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}})_2
\enspace .
$

Therefore, the product $ (\ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}-\ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}')q$ has fewer than $ \ensuremath{\ensuremath{\mathit{w}}}$ trailing 0's in its binary representation:

$\displaystyle (\ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}-\ensuremath{\e...
...ldots,0}_{<\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}})_2
\enspace .
$

Therefore $ (\ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}-\ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}')q$ cannot satisfy (5.1), yielding a contradiction and completing the proof. $ \qedsymbol$

The utility of Lemma 5.3 comes from the following observation: If $ \ensuremath{\ensuremath{\mathit{z}}}$ is chosen uniformly at random from $ S$, then $ \ensuremath{\ensuremath{\mathit{zt}}}$ is uniformly distributed over $ S$. In the following proof, it helps to think of the binary representation of $ \ensuremath{\ensuremath{\mathit{z}}}$, which consists of $ \ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}-1$ random bits followed by a 1.

Proof. [Proof of Lemma 5.1] First we note that the condition $ \ensuremath{\ensuremath{\mathrm{hash}(\ensuremath{\mathit{x}})}}=\ensuremath{\ensuremath{\mathrm{hash}(\ensuremath{\mathit{y}})}}$ is equivalent to the statement ``the highest-order $ \ensuremath{\ensuremath{\mathit{d}}}$ bits of $ \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}} \ensuremath{\ensuremath{\ensuremath{\mathit{x}}}}\bmod2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}$ and the highest-order $ \ensuremath{\ensuremath{\mathit{d}}}$ bits of $ \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}} \ensuremath{\ensuremath{\ensuremath{\mathit{y}}}}\bmod 2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}$ are the same.'' A necessary condition of that statement is that the highest-order $ \ensuremath{\ensuremath{\mathit{d}}}$ bits in the binary representation of $ \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}(\ensuremath{\ensuremath{\ens...
...emath{\mathit{y}}}})\bmod 2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}$ are either all 0's or all 1's. That is,

$\displaystyle \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}(\ensuremath{\en...
...\ensuremath{\mathit{w}}}}-\ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}})_2$ (5.2)

when $ \ensuremath{\ensuremath{\ensuremath{\mathit{zx}}}}\bmod 2^{\ensuremath{\ensure...
...emath{\mathit{zy}}}}\bmod 2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}$ or

$\displaystyle \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}(\ensuremath{\en...
...{\mathit{w}}}}-\ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}})_2 \enspace .$ (5.3)

when $ \ensuremath{\ensuremath{\ensuremath{\mathit{zx}}}}\bmod 2^{\ensuremath{\ensure...
...emath{\mathit{zy}}}}\bmod 2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}$. Therefore, we only have to bound the probability that $ \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}(\ensuremath{\ensuremath{\ens...
...emath{\mathit{y}}}})\bmod 2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}$ looks like (5.2) or (5.3).

Let $ q$ be the unique odd integer such that $ (\ensuremath{\ensuremath{\ensuremath{\mathit{x}}}}-\ensuremath{\ensuremath{\en...
...\mathit{y}}}})\bmod
2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}=q2^r$ for some integer $ r\ge 0$. By Lemma 5.3, the binary representation of $ \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}q\bmod
2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}$ has $ \ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}-1$ random bits, followed by a 1:

$\displaystyle \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}q\bmod 2^{\ensur...
...}-1},\ldots,b_{1}}_{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}-1},1)_2
$

Therefore, the binary representation of $ \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}(\ensuremath{\ensuremath{\ens...
...th{\mathit{z}}}}q2^r\bmod 2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}$ has $ \ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}-r-1$ random bits, followed by a 1, followed by $ r$ 0's:

$\displaystyle \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}(\ensuremath{\en...
...\ensuremath{\ensuremath{\mathit{w}}}}-r-1},1,\underbrace{0,0,\ldots,0}_{r})_2
$

We can now finish the proof: If $ r > \ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}-\ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}$, then the $ \ensuremath{\ensuremath{\mathit{d}}}$ higher order bits of $ \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}(\ensuremath{\ensuremath{\ens...
...emath{\mathit{y}}}})\bmod 2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}$ contain both 0's and 1's, so the probability that $ \ensuremath{\ensuremath{\ensuremath{\mathit{z}}}}(\ensuremath{\ensuremath{\ens...
...emath{\mathit{y}}}})\bmod 2^{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}}$ looks like (5.2) or (5.3) is 0. If $ \ensuremath{\ensuremath{\ensuremath{\mathit{r}}}}=\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}-\ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}$, then the probability of looking like (5.2) is 0, but the probability of looking like (5.3) is $ 1/2^{\ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}-1}=2/2^{\ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}}$ (since we must have $ b_1,\ldots,b_{d-1}=1,\ldots,1$). If $ r < \ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}-\ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}$, then we must have $ b_{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}-r-1},\ldots,b_{\ensuremat...
...h{\mathit{w}}}}-r-\ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}}=0,\ldots,0$ or $ b_{\ensuremath{\ensuremath{\ensuremath{\mathit{w}}}}-r-1},\ldots,b_{\ensuremat...
...h{\mathit{w}}}}-r-\ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}}=1,\ldots,1$. The probability of each of these cases is $ 1/2^{\ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}}$ and they are mutually exclusive, so the probability of either of these cases is $ 2/2^{\ensuremath{\ensuremath{\ensuremath{\mathit{d}}}}}$. This completes the proof. $ \qedsymbol$

5.1.2 Summary

The following theorem summarizes the performance of a ChainedHashTable data structure:

Theorem 5..1   A ChainedHashTable implements the USet interface. Ignoring the cost of calls to $ \ensuremath{\mathrm{grow}()}$, a ChainedHashTable supports the operations $ \ensuremath{\mathrm{add}(\ensuremath{\mathit{x}})}$, $ \ensuremath{\mathrm{remove}(\ensuremath{\mathit{x}})}$, and $ \ensuremath{\mathrm{find}(\ensuremath{\mathit{x}})}$ in $ O(1)$ expected time per operation.

Furthermore, beginning with an empty ChainedHashTable, any sequence of $ m$ $ \ensuremath{\mathrm{add}(\ensuremath{\mathit{x}})}$ and $ \ensuremath{\mathrm{remove}(\ensuremath{\mathit{x}})}$ operations results in a total of $ O(m)$ time spent during all calls to $ \ensuremath{\mathrm{grow}()}$.



Footnotes

... integer.5.1
This is true for most programming languages including C, C#, C++, and Java. Notable exceptions are Python and Ruby, in which the result of a fixed-length $ \ensuremath{\ensuremath{\mathit{w}}}$-bit integer operation that overflows is upgraded to a variable-length representation.
opendatastructures.org