Subsections


2.5 DualArrayDeque: Building a Deque from Two Stacks

Next, we present another data structure, the DualArrayDeque that achieves the same performance bounds as an ArrayDeque by using two ArrayStacks. Although the asymptotic performance of the DualArrayDeque is no better than that of the ArrayDeque, it is still worth studying since it offers a good example of how to make a sophisticated data structure by combining two simpler data structures.

A DualArrayDeque represents a list using two ArrayStacks. Recall that an ArrayStack is fast when the operations on it modify elements near the end. A DualArrayDeque places two ArrayStacks, called $ \mathtt{front}$ and $ \mathtt{back}$, back-to-back so that operations are fast at either end.

    List<T> front;
    List<T> back;

A DualArrayDeque does not explicitly store the number, $ \mathtt{n}$, of elements it contains. It doesn't need to, since it contains $ \ensuremath{\mathtt{n}}=\ensuremath{\mathtt{front.size()}} + \ensuremath{\mathtt{back.size()}}$ elements. Nevertheless, when analyzing the DualArrayDeque we will still use $ \mathtt{n}$ to denote the number of elements it contains.

    int size() {
        return front.size() + back.size();        
    }

The $ \mathtt{front}$ ArrayStack contains list elements with indices $ 0,\ldots,\ensuremath{\mathtt{front.size()}}-1$, but stores them in reverse order. The $ \mathtt{back}$ ArrayStack contains list elements with indices $ \ensuremath{\mathtt{front.size()}},\ldots,\ensuremath{\mathtt{size()}}-1$ in the normal order. In this way, $ \mathtt{get(i)}$ and $ \mathtt{set(i,x)}$ translate into appropriate calls to $ \mathtt{get(i)}$ or $ \mathtt{set(i,x)}$ on either $ \mathtt{front}$ or $ \mathtt{back}$, which take $ O(1)$ time per operation.

    T get(int i) {
        if (i < front.size()) {
            return front.get(front.size()-i-1);
        } else {
            return back.get(i-front.size());
        }
    }
    T set(int i, T x) {
        if (i < front.size()) {
            return front.set(front.size()-i-1, x);
            
        } else {
            return back.set(i-front.size(), x);
        }
    }

Note that, if an index $ \ensuremath{\mathtt{i}}<\ensuremath{\mathtt{front.size()}}$, then it corresponds to the element of $ \mathtt{front}$ at position $ \ensuremath{\mathtt{front.size()}}-\ensuremath{\mathtt{i}}-1$, since the elements of $ \mathtt{front}$ are stored in reverse order.

Adding and removing elements from a DualArrayDeque is illustrated in Figure 2.4. The $ \mathtt{add(i,x)}$ operation manipulates either $ \mathtt{front}$ or $ \mathtt{back}$, as appropriate:

Figure 2.4: A sequence of $ \mathtt{add(i,x)}$ and $ \mathtt{remove(i)}$ operations on a DualArrayDeque. Arrows denote elements being copied. Operations that result in a rebalancing by $ \mathtt{balance()}$ are marked with an asterisk.
\includegraphics{figs/dualarraydeque}

    void add(int i, T x) {
        if (i < front.size()) { 
            front.add(front.size()-i, x);
        } else {
            back.add(i-front.size(), x);
        }
        balance();
    }

The $ \mathtt{add(i,x)}$ method performs rebalancing of the two ArrayStacks $ \mathtt{front}$ and $ \mathtt{back}$, by calling the $ \mathtt{balance()}$ method. The implementation of $ \mathtt{balance()}$ is described below, but for now it is sufficient to know that $ \mathtt{balance()}$ ensures that, unless $ \ensuremath{\mathtt{size()}}<2$, $ \mathtt{front.size()}$ and $ \mathtt{back.size()}$ do not differ by more than a factor of 3. In particular, $ 3\cdot\ensuremath{\mathtt{front.size()}} \ge \ensuremath{\mathtt{back.size()}}$ and $ 3\cdot\ensuremath{\mathtt{back.size()}} \ge \ensuremath{\mathtt{front.size()}}$.

Next we analyze the cost of $ \mathtt{add(i,x)}$, ignoring the cost of the $ \mathtt{balance()}$ operation. If $ \ensuremath{\mathtt{i}}<\ensuremath{\mathtt{front.size()}}$, then $ \mathtt{add(i,x)}$ becomes $ \ensuremath{\mathtt{front.add(front.size()-i-1,x)}}$. Since $ \mathtt{front}$ is an ArrayStack, the cost of this is

$\displaystyle O(\ensuremath{\mathtt{front.size()}}-(\ensuremath{\mathtt{front.size()}}-\ensuremath{\mathtt{i}}-1)+1) = O(\ensuremath{\mathtt{i}}+1) \enspace .$ (2.1)

On the other hand, if $ \ensuremath{\mathtt{i}}\ge\ensuremath{\mathtt{front.size()}}$, then $ \mathtt{add(i,x)}$ becomes $ \ensuremath{\mathtt{back.add(i-front.size(),x)}}$. The cost of this is

$\displaystyle O(\ensuremath{\mathtt{back.size()}}-(\ensuremath{\mathtt{i}}-\ens...
....size()}})+1) = O(\ensuremath{\mathtt{n}}-\ensuremath{\mathtt{i}}+1) \enspace .$ (2.2)

Notice that the first case (2.1) occurs when $ \ensuremath{\mathtt{i}}<\ensuremath{\mathtt{n}}/4$. The second case (2.2) occurs when $ \ensuremath{\mathtt{i}}\ge 3\ensuremath{\mathtt{n}}/4$. When $ \ensuremath{\mathtt{n}}/4\le\ensuremath{\mathtt{i}}<3\ensuremath{\mathtt{n}}/4$, we can't be sure whether the operation affects $ \mathtt{front}$ or $ \mathtt{back}$, but in either case, the operation takes $ O(\ensuremath{\mathtt{n}})=O(\ensuremath{\mathtt{i}})=O(\ensuremath{\mathtt{n}}-\ensuremath{\mathtt{i}})$ time, since $ \ensuremath{\mathtt{i}}\ge \ensuremath{\mathtt{n}}/4$ and $ \ensuremath{\mathtt{n}}-\ensuremath{\mathtt{i}}>
\ensuremath{\mathtt{n}}/4$. Summarizing the situation, we have

   Running time of $\displaystyle \ensuremath{\mathtt{add(i,x)}} \le
\left\{\begin{array}{ll}
O(...
... $\ensuremath{\mathtt{i}} \ge 3\ensuremath{\mathtt{n}}/4$}
\end{array}\right.
$

Thus, the running time of $ \mathtt{add(i,x)}$ (ignoring the cost of the call to $ \mathtt{balance()}$) is $ O(1+\min\{\ensuremath{\mathtt{i}},\ensuremath{\mathtt{n}}-\ensuremath{\mathtt{i}}\})$.

The $ \mathtt{remove(i)}$ operation, and its analysis, is similar to the $ \mathtt{add(i,x)}$ operation.

    T remove(int i) {
        T x;
        if (i < front.size()) {
            x = front.remove(front.size()-i-1);
        } else {
            x = back.remove(i-front.size());
        }
        balance();
        return x;
    }

2.5.1 Balancing

Finally, we study the $ \mathtt{balance()}$ operation performed by $ \mathtt{add(i,x)}$ and $ \mathtt{remove(i)}$. This operation is used to ensure that neither $ \mathtt{front}$ nor $ \mathtt{back}$ gets too big (or too small). It ensures that, unless there are fewer than 2 elements, each of $ \mathtt{front}$ and $ \mathtt{back}$ contain at least $ \ensuremath{\mathtt{n}}/4$ elements. If this is not the case, then it moves elements between them so that $ \mathtt{front}$ and $ \mathtt{back}$ contain exactly $ \lfloor\ensuremath{\mathtt{n}}/2\rfloor$ elements and $ \lceil\ensuremath{\mathtt{n}}/2\rceil$ elements, respectively.

    void balance() {
        int n = size();
        if (3*front.size() < back.size()) {
            int s = n/2 - front.size();
            List<T> l1 = newStack();
            List<T> l2 = newStack();
            l1.addAll(back.subList(0,s));
            Collections.reverse(l1);
            l1.addAll(front);
            l2.addAll(back.subList(s, back.size()));
            front = l1;
            back = l2;
        } else if (3*back.size() < front.size()) {
            int s = front.size() - n/2;
            List<T> l1 = newStack();
            List<T> l2 = newStack();
            l1.addAll(front.subList(s, front.size()));
            l2.addAll(front.subList(0, s));
            Collections.reverse(l2);
            l2.addAll(back);
            front = l1;
            back = l2;
        }
    }

There is not much to analyze. If the $ \mathtt{balance()}$ operation does rebalancing, then it moves $ \Theta(\ensuremath{\mathtt{n}})$ elements and this takes $ O(\ensuremath{\mathtt{n}})$ time. This is bad, since $ \mathtt{balance()}$ is called with each call to $ \mathtt{add(i,x)}$ and $ \mathtt{remove(i)}$. However, the following lemma shows that, on average, $ \mathtt{balance()}$ only spends a constant amount of time per operation.

Lemma 2..2   If an empty DualArrayDeque is created and any sequence of $ m\ge 1$ calls to $ \mathtt{add(i,x)}$ and $ \mathtt{remove(i)}$ are performed, then the total time spent during all calls to $ \mathtt{balance()}$ is $ O(m)$.

Proof. We will show that, if $ \mathtt{balance()}$ is forced to shift elements, then the number of $ \mathtt{add(i,x)}$ and $ \mathtt{remove(i)}$ operations since the last time $ \mathtt{balance()}$ shifted any elements is at least $ \ensuremath{\mathtt{n}}/2-1$. As in the proof of Lemma 2.1, this is sufficient to prove that the total time spent by $ \mathtt{balance()}$ is $ O(m)$.

We will perform our analysis using the potential method. Define the potential of the DualArrayDeque as

$\displaystyle \Phi = \vert\ensuremath{\mathtt{front.size()}} - \ensuremath{\mathtt{back.size()}}\vert \enspace . $

The interesting thing about this potential is that a call to $ \mathtt{add(i,x)}$ or $ \mathtt{remove(i)}$ that does not do any balancing can increase the potential by at most 1.

Observe that, immediately after a call to $ \mathtt{balance()}$ that shifts elements, the potential, $ \Phi_0$, is at most 1, since

$\displaystyle \Phi_0 = \left\vert\lfloor\ensuremath{\mathtt{n}}/2\rfloor-\lceil\ensuremath{\mathtt{n}}/2\rceil\right\vert\le 1 \enspace .$

Consider the situation immediately before a call to $ \mathtt{balance()}$ that shifts elements and suppose, without loss of generality, that $ \mathtt{balance()}$ is shifting elements because $ 3\ensuremath{\mathtt{front.size()}} < \ensuremath{\mathtt{back.size()}}$. Notice that, in this case,

$\displaystyle \ensuremath{\mathtt{n}}$ $\displaystyle =$ $\displaystyle \ensuremath{\mathtt{front.size()}}+\ensuremath{\mathtt{back.size()}}$  
  $\displaystyle <$ $\displaystyle \ensuremath{\mathtt{back.size()}}/3+\ensuremath{\mathtt{back.size()}}$  
  $\displaystyle =$ $\displaystyle \frac{4}{3}\ensuremath{\mathtt{back.size()}}$  

Furthermore, the potential at this point in time is
$\displaystyle \Phi_1$ $\displaystyle =$ $\displaystyle \ensuremath{\mathtt{back.size()}} - \ensuremath{\mathtt{front.size()}}$  
  $\displaystyle >$ $\displaystyle \ensuremath{\mathtt{back.size()}} - \ensuremath{\mathtt{back.size()}}/3$  
  $\displaystyle =$ $\displaystyle \frac{2}{3}\ensuremath{\mathtt{back.size()}}$  
  $\displaystyle >$ $\displaystyle \frac{2}{3}\times\frac{3}{4}\ensuremath{\mathtt{n}}$  
  $\displaystyle =$ $\displaystyle \ensuremath{\mathtt{n}}/2$  

Therefore, the number of calls to $ \mathtt{add(i,x)}$ or $ \mathtt{remove(i)}$ since the last time $ \mathtt{balance()}$ shifted elements is at least $ \Phi_1-\Phi_0
> \ensuremath{\mathtt{n}}/2-1$. This completes the proof. $ \qedsymbol$

2.5.2 Summary

The following theorem summarizes the performance of a DualArrayStack

Theorem 2..4   A DualArrayDeque implements the List interface. Ignoring the cost of calls to $ \mathtt{resize()}$ and $ \mathtt{balance()}$, a DualArrayDeque supports the operations Furthermore, beginning with an empty DualArrayDeque, any sequence of $ m$ $ \mathtt{add(i,x)}$ and $ \mathtt{remove(i)}$ operations results in a total of $ O(m)$ time spent during all calls to $ \mathtt{resize()}$ and $ \mathtt{balance()}$.

opendatastructures.org