

2.1 ArrayStack: Fast Stack Operations Using an Array

An ArrayStack implements the list interface using an array $\mathit{a}$, called the backing array. The list element with index $\mathit{i}$ is stored in $\mathit{a}[\mathit{i}]$. Most of the time, $\mathit{a}$ is larger than strictly necessary, so an integer $\mathit{n}$ is used to keep track of the number of elements actually stored in $\mathit{a}$. In this way, the list elements are stored in $\mathit{a}[0],\ldots,\mathit{a}[\mathit{n}-1]$ and, at all times, $\mathrm{length}(\mathit{a}) \ge \mathit{n}$.


\begin{leftbar}
\begin{flushleft}
\hspace*{1em} \ensuremath{\mathrm{initialize}()}\\
\hspace*{2em} \ensuremath{\mathit{a} \gets \mathrm{new\_array}(1)}\\
\hspace*{2em} \ensuremath{\mathit{n} \gets 0}\\
\end{flushleft}\end{leftbar}
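The two fields above can be sketched in Python (our own rendering, not the book's code; the fixed-length backing array is simulated with a list of `None` slots, and the helper name `_new_array` is an assumption):

```python
def _new_array(size):
    # Simulate a fixed-length backing array with a Python list of None slots.
    return [None] * max(size, 1)

class ArrayStack:
    def __init__(self):
        self.a = _new_array(1)  # backing array; usually larger than needed
        self.n = 0              # number of list elements actually stored in a
```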

2.1.1 The Basics

Accessing and modifying the elements of an ArrayStack using $\mathrm{get}(\mathit{i})$ and $\mathrm{set}(\mathit{i},\mathit{x})$ is trivial. After performing any necessary bounds-checking, we simply return or set, respectively, $\mathit{a}[\mathit{i}]$.


\begin{leftbar}
\begin{flushleft}
\hspace*{1em} \ensuremath{\mathrm{get}(\mathit{i})}\\
\hspace*{2em} \ensuremath{\mathbf{return}\ \mathit{a}[\mathit{i}]}\\
\hspace*{1em} \ensuremath{\mathrm{set}(\mathit{i},\mathit{x})}\\
\hspace*{2em} \ensuremath{\mathit{y} \gets \mathit{a}[\mathit{i}]}\\
\hspace*{2em} \ensuremath{\mathit{a}[\mathit{i}] \gets \mathit{x}}\\
\hspace*{2em} \ensuremath{\mathbf{return}\ \mathit{y}}\\
\end{flushleft}\end{leftbar}
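Using the same representation (a backing list `a` and a counter `n`), these two operations can be sketched as plain Python functions; `set_` carries a trailing underscore only to avoid shadowing Python's built-in `set`:

```python
def get(a, n, i):
    # Bounds check, then return the element stored at index i.
    if i < 0 or i >= n:
        raise IndexError(i)
    return a[i]

def set_(a, n, i, x):
    # Bounds check, then overwrite a[i], returning the old value.
    if i < 0 or i >= n:
        raise IndexError(i)
    y = a[i]
    a[i] = x
    return y
```

Both run in $O(1)$ time, since they touch a single array slot after the bounds check.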

The operations of adding and removing elements from an ArrayStack are illustrated in Figure 2.1. To implement the $\mathrm{add}(\mathit{i},\mathit{x})$ operation, we first check if $\mathit{a}$ is already full. If so, we call the method $\mathrm{resize}()$ to increase the size of $\mathit{a}$. How $\mathrm{resize}()$ is implemented will be discussed later. For now, it is sufficient to know that, after a call to $\mathrm{resize}()$, we can be sure that $\mathrm{length}(\mathit{a}) > \mathit{n}$. With this out of the way, we now shift the elements $\mathit{a}[\mathit{i}],\ldots,\mathit{a}[\mathit{n}-1]$ right by one position to make room for $\mathit{x}$, set $\mathit{a}[\mathit{i}]$ equal to $\mathit{x}$, and increment $\mathit{n}$.

Figure 2.1: A sequence of $\mathrm{add}(\mathit{i},\mathit{x})$ and $\mathrm{remove}(\mathit{i})$ operations on an ArrayStack. Arrows denote elements being copied. Operations that result in a call to $\mathrm{resize}()$ are marked with an asterisk.
\includegraphics[scale=0.90909]{figs-python/arraystack}


\begin{leftbar}
\begin{flushleft}
\hspace*{1em} \ensuremath{\mathrm{add}(\mathit{i},\mathit{x})}\\
\hspace*{2em} \ensuremath{\mathbf{if}\ \mathit{n} = \mathrm{length}(\mathit{a})\ \mathbf{then}\ \mathrm{resize}()}\\
\hspace*{2em} \ensuremath{\mathit{a}[\mathit{i}+1,\ldots,\mathit{n}] \gets \mathit{a}[\mathit{i},\ldots,\mathit{n}-1]}\\
\hspace*{2em} \ensuremath{\mathit{a}[\mathit{i}] \gets \mathit{x}}\\
\hspace*{2em} \ensuremath{\mathit{n} \gets \mathit{n} + 1}\\
\end{flushleft}\end{leftbar}
If we ignore the cost of the potential call to $\mathrm{resize}()$, then the cost of the $\mathrm{add}(\mathit{i},\mathit{x})$ operation is proportional to the number of elements we have to shift to make room for $\mathit{x}$. Therefore the cost of this operation (ignoring the cost of resizing $\mathit{a}$) is $O(\mathit{n}-\mathit{i})$.
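This shifting step can be sketched as a Python function over the same `(a, n)` representation (our own rendering; the inline doubling with `max(2 * n, 1)` anticipates the `resize()` discussion in Section 2.1.2, and the function returns the possibly reallocated array together with the new count):

```python
def add(a, n, i, x):
    # Grow the backing array if it is already full.
    if n == len(a):
        b = [None] * max(2 * n, 1)
        b[:n] = a[:n]
        a = b
    # Shift a[i], ..., a[n-1] right by one position: O(n - i) steps.
    for j in range(n, i, -1):
        a[j] = a[j - 1]
    a[i] = x
    return a, n + 1
```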

Implementing the $\mathrm{remove}(\mathit{i})$ operation is similar. We shift the elements $\mathit{a}[\mathit{i}+1],\ldots,\mathit{a}[\mathit{n}-1]$ left by one position (overwriting $\mathit{a}[\mathit{i}]$) and decrease the value of $\mathit{n}$. After doing this, we check if $\mathit{n}$ is getting much smaller than $\mathrm{length}(\mathit{a})$ by checking if $\mathrm{length}(\mathit{a}) \ge 3\mathit{n}$. If so, then we call $\mathrm{resize}()$ to reduce the size of $\mathit{a}$.


\begin{leftbar}
\begin{flushleft}
\hspace*{1em} \ensuremath{\mathrm{remove}(\mathit{i})}\\
\hspace*{2em} \ensuremath{\mathit{x} \gets \mathit{a}[\mathit{i}]}\\
\hspace*{2em} \ensuremath{\mathit{a}[\mathit{i},\ldots,\mathit{n}-2] \gets \mathit{a}[\mathit{i}+1,\ldots,\mathit{n}-1]}\\
\hspace*{2em} \ensuremath{\mathit{n} \gets \mathit{n} - 1}\\
\hspace*{2em} \ensuremath{\mathbf{if}\ \mathrm{length}(\mathit{a}) \ge 3\mathit{n}\ \mathbf{then}\ \mathrm{resize}()}\\
\hspace*{2em} \ensuremath{\mathbf{return}\ \mathit{x}}\\
\end{flushleft}\end{leftbar}
If we ignore the cost of the $\mathrm{resize}()$ method, the cost of a $\mathrm{remove}(\mathit{i})$ operation is proportional to the number of elements we shift, which is $O(\mathit{n}-\mathit{i})$.
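A matching Python sketch (again our own rendering, with the shrinking copy inlined) shows both the left shift and the occupancy test:

```python
def remove(a, n, i):
    x = a[i]
    # Shift a[i+1], ..., a[n-1] left by one position: O(n - i) steps.
    for j in range(i, n - 1):
        a[j] = a[j + 1]
    n -= 1
    # Shrink once at most a third of the slots are in use.
    if len(a) >= 3 * n:
        b = [None] * max(2 * n, 1)
        b[:n] = a[:n]
        a = b
    return x, a, n
```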

2.1.2 Growing and Shrinking

The $\mathrm{resize}()$ method is fairly straightforward: it allocates a new array $\mathit{b}$ of size $2\mathit{n}$, copies the $\mathit{n}$ elements of $\mathit{a}$ into the first $\mathit{n}$ positions of $\mathit{b}$, and then sets $\mathit{a}$ to $\mathit{b}$. Thus, after a call to $\mathrm{resize}()$, $\mathrm{length}(\mathit{a}) = 2\mathit{n}$.


\begin{leftbar}
\begin{flushleft}
\hspace*{1em} \ensuremath{\mathrm{resize}()}\\
\hspace*{2em} \ensuremath{\mathit{b} \gets \mathrm{new\_array}(\max(2\mathit{n},1))}\\
\hspace*{2em} \ensuremath{\mathit{b}[0,\ldots,\mathit{n}-1] \gets \mathit{a}[0,\ldots,\mathit{n}-1]}\\
\hspace*{2em} \ensuremath{\mathit{a} \gets \mathit{b}}\\
\end{flushleft}\end{leftbar}

Analyzing the actual cost of the $\mathrm{resize}()$ operation is easy. It allocates an array $\mathit{b}$ of size $2\mathit{n}$ and copies the $\mathit{n}$ elements of $\mathit{a}$ into $\mathit{b}$. This takes $O(\mathit{n})$ time.
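As a runnable Python sketch (the `max(2 * n, 1)` guards the $n = 0$ case, where an array of size $2n$ would be empty):

```python
def resize(a, n):
    # Allocate a new backing array of size 2n (at least 1) and
    # copy the n stored elements into its first n positions.
    b = [None] * max(2 * n, 1)
    b[:n] = a[:n]
    return b
```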

The running time analysis from the previous section ignored the cost of calls to $\mathrm{resize}()$. In this section we analyze this cost using a technique known as amortized analysis. This technique does not try to determine the cost of resizing during each individual $\mathrm{add}(\mathit{i},\mathit{x})$ and $\mathrm{remove}(\mathit{i})$ operation. Instead, it considers the cost of all calls to $\mathrm{resize}()$ during a sequence of $m$ calls to $\mathrm{add}(\mathit{i},\mathit{x})$ or $\mathrm{remove}(\mathit{i})$. In particular, we will show:

Lemma 2.1   If an empty ArrayStack is created and any sequence of $m\ge 1$ calls to $\mathrm{add}(\mathit{i},\mathit{x})$ and $\mathrm{remove}(\mathit{i})$ is performed, then the total time spent during all calls to $\mathrm{resize}()$ is $O(m)$.

Proof. We will show that, any time $\mathrm{resize}()$ is called, the number of calls to $\mathrm{add}(\mathit{i},\mathit{x})$ or $\mathrm{remove}(\mathit{i})$ since the last call to $\mathrm{resize}()$ is at least $\mathit{n}/2-1$. Therefore, if $\mathit{n}_i$ denotes the value of $\mathit{n}$ during the $i$th call to $\mathrm{resize}()$ and $r$ denotes the number of calls to $\mathrm{resize}()$, then the total number of calls to $\mathrm{add}(\mathit{i},\mathit{x})$ or $\mathrm{remove}(\mathit{i})$ is at least $\sum_{i=1}^{r}(\mathit{n}_i/2-1)$. Since this total is at most $m$, we have

$\displaystyle \sum_{i=1}^{r} (\mathit{n}_i/2-1) \le m \enspace ,
$

which is equivalent to

$\displaystyle \sum_{i=1}^{r} \mathit{n}_i \le 2m + 2r \enspace .
$

On the other hand, the total time spent during all calls to $\mathrm{resize}()$ is

$\displaystyle \sum_{i=1}^{r} O(\mathit{n}_i) \le O(m+r) = O(m) \enspace ,
$

since $r$ is not more than $m$. All that remains is to show that the number of calls to $\mathrm{add}(\mathit{i},\mathit{x})$ or $\mathrm{remove}(\mathit{i})$ between the $(i-1)$th and the $i$th call to $\mathrm{resize}()$ is at least $\mathit{n}_i/2-1$.

There are two cases to consider. In the first case, $\mathrm{resize}()$ is being called by $\mathrm{add}(\mathit{i},\mathit{x})$ because the backing array $\mathit{a}$ is full, i.e., $\mathrm{length}(\mathit{a}) = \mathit{n} = \mathit{n}_i$. Consider the previous call to $\mathrm{resize}()$: after this previous call, the size of $\mathit{a}$ was $\mathrm{length}(\mathit{a})$, but the number of elements stored in $\mathit{a}$ was at most $\mathrm{length}(\mathit{a})/2=\mathit{n}_i/2$. But now the number of elements stored in $\mathit{a}$ is $\mathit{n}_i=\mathrm{length}(\mathit{a})$, so there must have been at least $\mathit{n}_i/2$ calls to $\mathrm{add}(\mathit{i},\mathit{x})$ since the previous call to $\mathrm{resize}()$.

The second case occurs when $\mathrm{resize}()$ is being called by $\mathrm{remove}(\mathit{i})$ because $\mathrm{length}(\mathit{a}) \ge 3\mathit{n} = 3\mathit{n}_i$. Again, after the previous call to $\mathrm{resize}()$ the number of elements stored in $\mathit{a}$ was at least $\mathrm{length}(\mathit{a})/2-1$ (see footnote 2.1). Now there are $\mathit{n}_i\le\mathrm{length}(\mathit{a})/3$ elements stored in $\mathit{a}$. Therefore, the number of $\mathrm{remove}(\mathit{i})$ operations since the last call to $\mathrm{resize}()$ is at least

\begin{align*}
\mathrm{length}(\mathit{a})/2 - 1 - \mathrm{length}(\mathit{a})/3
  &= \mathrm{length}(\mathit{a})/6 - 1 \\
  &= (\mathrm{length}(\mathit{a})/3)/2 - 1 \\
  &\ge \mathit{n}_i/2 - 1 \enspace .
\end{align*}

In either case, the number of calls to $\mathrm{add}(\mathit{i},\mathit{x})$ or $\mathrm{remove}(\mathit{i})$ that occur between the $(i-1)$th call to $\mathrm{resize}()$ and the $i$th call to $\mathrm{resize}()$ is at least $\mathit{n}_i/2-1$, as required to complete the proof. $\qedsymbol$
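The lemma can also be checked empirically. The sketch below (our own experiment, not from the book) performs $m$ pushes onto a doubling array while counting every element copied by a resize; for pure doubling the copy counts form the series $1 + 2 + 4 + \cdots$, which stays below $2m$, matching the $O(m)$ bound:

```python
def copies_during_pushes(m):
    # Push m items onto a growable array, counting elements moved by resize.
    a, n, copied = [None], 0, 0
    for k in range(m):
        if n == len(a):                # full: double the backing array
            b = [None] * max(2 * n, 1)
            b[:n] = a[:n]
            copied += n                # each resize costs n element copies
            a = b
        a[n] = k
        n += 1
    return copied
```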

2.1.3 Summary

The following theorem summarizes the performance of an ArrayStack:

Theorem 2.1   An ArrayStack implements the List interface. Ignoring the cost of calls to $\mathrm{resize}()$, an ArrayStack supports the operations $\mathrm{get}(\mathit{i})$ and $\mathrm{set}(\mathit{i},\mathit{x})$ in $O(1)$ time per operation, and $\mathrm{add}(\mathit{i},\mathit{x})$ and $\mathrm{remove}(\mathit{i})$ in $O(1+\mathit{n}-\mathit{i})$ time per operation. Furthermore, beginning with an empty ArrayStack and performing any sequence of $m$ $\mathrm{add}(\mathit{i},\mathit{x})$ and $\mathrm{remove}(\mathit{i})$ operations results in a total of $O(m)$ time spent during all calls to $\mathrm{resize}()$.

The ArrayStack is an efficient way to implement a Stack. In particular, we can implement $\mathrm{push}(\mathit{x})$ as $\mathrm{add}(\mathit{n},\mathit{x})$ and $\mathrm{pop}()$ as $\mathrm{remove}(\mathit{n}-1)$, in which case these operations will run in $O(1)$ amortized time.
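Putting the pieces together, a compact runnable version of the whole structure might look like the following (a sketch in our own Python, with the fixed-size array simulated by a list of `None` slots; it is not the book's official implementation):

```python
class ArrayStack:
    """List over a backing array a; push/pop run in O(1) amortized time."""

    def __init__(self):
        self.a = [None]   # backing array
        self.n = 0        # number of elements stored

    def _resize(self):
        # New array of size 2n (at least 1); copy the n elements over.
        b = [None] * max(2 * self.n, 1)
        b[:self.n] = self.a[:self.n]
        self.a = b

    def add(self, i, x):
        if self.n == len(self.a):
            self._resize()
        for j in range(self.n, i, -1):   # shift a[i..n-1] right
            self.a[j] = self.a[j - 1]
        self.a[i] = x
        self.n += 1

    def remove(self, i):
        x = self.a[i]
        for j in range(i, self.n - 1):   # shift a[i+1..n-1] left
            self.a[j] = self.a[j + 1]
        self.n -= 1
        if len(self.a) >= 3 * self.n:    # shrink when mostly empty
            self._resize()
        return x

    def push(self, x):
        self.add(self.n, x)

    def pop(self):
        return self.remove(self.n - 1)
```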



Footnotes

2.1  The $-1$ in this formula accounts for the special case that occurs when $\mathit{n}=0$ and $\mathrm{length}(\mathit{a}) = 1$.
opendatastructures.org