ICS 46 Spring 2022
Notes and Examples: Stacks, Queues, and Deques
Identifying the right data structure
In previous coursework, you've no doubt used at least one programming language — C++ or something else — enough that you've been exposed to the data structures that are provided by the language or its standard library. For example, C++ provides a number of data structures: arrays, std::vector, std::list, and std::map, to name a few. Python, meanwhile, provides choices like lists, sets, tuples, and dictionaries. None of these is definitively better than any of the others; the objective is to choose the data structure that best matches the "shape" of the problem you're trying to solve. If you want to store a sequence of values in Python, a list is a good choice. If you want to keep track of information about students, each indexed uniquely by an ID, in C++, you might choose a std::map, with the keys being the IDs and the values being objects of a Student class.
Generally, then, when we learn about a new data structure we've not seen before, there are a few things we want to understand about it:
If we can get a handle on all of those issues, we'll have a good understanding of when our new data structure might solve a real problem we have, as well as how we should build it (if our preferred programming language doesn't provide one), and how well it will perform in the ways that we plan to use it.
To get our minds around this kind of analysis, let's consider a few of the classic linear data structures, which let you store and manipulate sequences of elements, with the understanding that certain ones, at any given time, are more important than the others.
Stacks
A stack is a data structure that stores a sequence of elements, with the key property being that the most important one, at any given time, is the one that was stored the most recently. If we were to draw a diagram of a stack, which focused on the key concepts, it might look something like this:
The diagram shows the most important conceptual ideas underlying stacks:
In a practical implementation, you might find other operations, like one that tells you how many elements are in the stack, or one that tells you whether the stack is empty, but push, pop, and top are the most important ones, so we'll focus on those.
If we have a stack-shaped problem and we need to implement one, we have a sequence of elements to store; our first order of business is choosing an underlying data structure in which to store them. There are two obvious choices: an array (or an array-based structure, like a std::vector) and a linked list.
Implementing a stack using an array or std::vector
Let's first consider how you might implement a stack using an array-based data structure, such as an array or a std::vector. It seems sensible that we'd want to store the elements in the order in which they appear in the stack, though an open question is whether they should be stored in top-to-bottom or bottom-to-top order. Let's think about each option.
If we store the elements in top-to-bottom order, that means the top of the stack would always be stored in cell 0 of the array, the element just beneath the top would be stored in cell 1, and so on. We'd presumably also need to keep track of how many elements were stored (necessary if we had an array, done automatically by std::vector if we use it). This approach has the upside that we always know where the top element is — it'll always be in cell 0, no matter what. To understand whether this is a good approach, though, we should do some analysis; we'll analyze the three common operations, assuming there are n elements stored in the stack already.
The fundamental problem is the requirement that the topmost element is always in cell 0; the "business end" of the stack being in a fixed position requires us to slide elements back and forth. But if we flip the elements around, instead storing the bottommost element in cell 0, we could always easily infer the position of the top element by looking at the stack's size (which we'd be tracking separately). Is this better? Let's take a look, again assuming there are n elements stored in the stack.
So, the simple act of reversing the order of the elements made a huge difference! This is why it's so important for us to do a little bit of analysis, because we quickly discover that reasonable-sounding ideas ("If the top of the stack was in cell 0, we'd always be able to find it!") sometimes lead to sub-par solutions.
Implementing a stack using a linked list
If we instead opt to implement a stack using a linked list, our first order of business is selecting what kind of linked list we should use. We saw previously that there are variants on the idea of a linked list. The key question is this: What is the simplest kind of linked list that still yields the best possible performance for the common operations?
So, let's start with the simplest kind of linked list we know: a singly-linked list with head. What we've seen is that certain operations, notably the ones that work near the front of the list, are performant; others less so. So if we could make sure that we were only ever modifying the front of the list, we'd be in good shape.
That turns out to be an easy problem to solve, by arranging the elements so that the top of the stack is in the first node of the list, with the others stored sequentially afterward. If there are n elements in our stack, we'd see the following performance characteristics.
The moral of the story is that the simplest kind of linked list would be sufficient to solve this problem reasonably well. That isn't to say that we couldn't use a more complex one — if, for example, we already had one implemented and tested — but knowing the minimum requirement tells us what we would have to build, at minimum, if we had to build it ourselves, or if we could select from many different prebuilt implementations.
Queues
A queue is a data structure that stores a sequence of elements, with the key property that the most important one, at any given time, is the one that was stored the least recently. Elements "wait in line" in the order they arrive. If we were to draw a diagram of a queue, which focused on the key concepts, it might look something like this:
The diagram shows the most important conceptual ideas underlying queues:
As with stacks, a practical queue implementation might include other operations, such as the ability to determine how many elements are stored, but enqueue, dequeue, and front are the most important ones, so we'll focus our attention on those.
If we've got a queue-shaped problem and need to implement one ourselves, we'll need to map the concept to an implementation technique; importantly, we'll need to choose an underlying data structure in which to store the elements, and we'll need to arrange the elements in that data structure to provide performant operations.
Implementing a queue using a linked list
As we did when we were considering how to implement a stack, we should decide which is the simplest kind of linked list that will lead to a favorable outcome; ideally, we could find a way to implement all three of the common operations so they could run in Θ(1) time.
Your intuition should kick in quickly when you consider whether a singly-linked list with head would provide a good solution to this problem. Queues have two "business ends"; we enqueue at the back and dequeue from the front. So, without thinking very hard about it, we can eliminate a singly-linked list with head from consideration, because it doesn't provide constant-time access to both ends of the list, which means that either enqueues or dequeues will be more expensive than they ought to be.
A singly-linked list with head and tail, on the other hand, at least seems like it has a chance at providing Θ(1) enqueue, dequeue, and front operations, because it at least gives us constant-time access to both ends of the list. But we have to think through what we can and can't do in Θ(1) time, given a singly-linked list with head and tail:
Notably absent is the ability to remove from the end of the list, which requires Θ(n) time in a list with n nodes. So we'd best avoid that operation. Nonetheless, if we choose our organization of elements in nodes carefully, we can still achieve the outcome we want. If we store the front element of the queue in the first node, the back element of the queue in the last node, and the remaining nodes (in the proper order) in between, then we're in business.
Implementing a queue using an array or std::vector
We should consider, too, how you might use an array-based data structure, like an array or a std::vector, to implement a queue. Perhaps surprisingly, this is more complicated than it sounds, so we'll need to exercise some caution and think through some of the details carefully.
First of all, we can eliminate std::vector from consideration almost immediately. Used the way it's intended to be used, a std::vector stores its s elements into an underlying array of capacity c, where c ≥ s. If you ask it for its size, you'll get back s; if you ask for its capacity, you'll get back c. And the s elements will always be stored in the first s cells of the underlying array, the first one in cell 0, the second in cell 1, and so on.
So why does this eliminate std::vector as a good choice for solving this problem? Consider how you might solve it.
Regardless of which of these you choose, you'll either have expensive enqueues or expensive dequeues:
While there are ways we could "abuse" std::vector to solve our problem, if we use it the way it's intended, the constraint that the s elements must be stored in the first s cells of the underlying array leads us to an impasse; there's simply no good solution to the problem using this tool.
If, instead, we consider arrays more generally, a somewhat different solution emerges. What if, instead of forcing the front element to be stored in cell 0, we used an array, allowing us to relax that requirement? In that case, we could keep track of the index of the front element of the queue, as well as the index where an element would be added to the back of the queue, and let those "float" over time as elements are enqueued and dequeued. For example, the array-based queue below contains the three elements x, y, and z:
0 | 1 | 2 | 3 | 4 | 5 | 6 |
x | y | z | ||||
↑ f |
↑ b |
In the diagram, f depicts the index of the front element and b depicts the index where a new element would be added to the back; you'd store these values as unsigned integers. Given this scenario, the portion of the array that contains elements actually in the queue begins with the element where f refers and ends with the element preceding where b refers.
Enqueuing an element is simple, then: Store the element in the cell where b refers, then add 1 to b. Dequeuing is similar: Clear the value in the cell where f refers, then add 1 to f. Since arrays provide constant-time access to a cell, given its index, these will be Θ(1) time operations. So far so good!
However, there are some wrinkles worth considering. Given this implementation, both f and b will move forward over time. So, eventually, you could end up in a situation like this one:
0 | 1 | 2 | 3 | 4 | 5 | 6 |
w | v | t | s | |||
↑ f |
↑ b |
Now suppose you want to enqueue another element. What do we do with b? If we add 1 to it, it'll be pointing beyond the boundaries of the array, so the next enqueue will fail. So, instead, we "wrap" it back around to the beginning of the array:
0 | 1 | 2 | 3 | 4 | 5 | 6 |
w | v | t | s | r | ||
↑ b |
↑ f |
And we slightly redefine our idea of which elements are actually part of the queue, so that if b refers to a cell with a smaller index than f, everything from f through the end of the array and anything in a cell preceding b will be considered part of the queue. Just as b can wrap around when it reaches the end and we enqueue an element, f can wrap around when it reaches the end and we dequeue one. Over time, f and b chase each other around the array forever, and it becomes possible to enqueue and dequeue elements indefinitely, as long as we never dequeue from an empty queue or enqueue into a full one.
There's one last wrinkle. How do we detect that the array is full? Let's consider what it looks like just before it fills.
0 | 1 | 2 | 3 | 4 | 5 | 6 |
q | w | v | t | s | r | |
↑ b |
↑ f |
So what would happen if we enqueued one more element? We'd now be in the following situation.
0 | 1 | 2 | 3 | 4 | 5 | 6 |
q | p | w | v | t | s | r |
↑↑ f b |
This suggests that we can detect "fullness" when f = b. But what about emptiness? Just before the queue is empty, it would look like this:
0 | 1 | 2 | 3 | 4 | 5 | 6 |
q | ||||||
↑ f |
↑ b |
If we dequeued an element, we would end up here.
0 | 1 | 2 | 3 | 4 | 5 | 6 |
↑↑ f b |
And, suddenly, it becomes evident that f = b might also mean that the queue is empty. So how do we tell the difference? The answer lies in one simple tweak: We also keep track of the size of the queue at all times (i.e., the number of elements we're currently storing in the queue), incrementing and decrementing it as we enqueue and dequeue elements. This gives us not only a simple solution to the problem of finding out how many elements we're storing, but it forms the basis of how we detect fullness and emptiness, by simply comparing size to the capacity of the array (which, in C++, we would also need to track separately).
This solution, generally, is called the circular array implementation of a queue, and is common in both software and hardware, particularly when there is a fixed amount of memory available.
Deques (Double-Ended Queues)
I expect there's a good chance you've seen stacks and queues before, but there is another, slightly lesser-known "classic" linear data structure that you may never have seen, which is called a deque, an oddly-spelled name that is intended to be shorthand for double-ended queue. The name "double-ended queue" is a pretty accurate description of what it is; it acts like a queue in which you can treat either end as the front and either end as the back, leading to the following common operations.
While deques don't themselves fit as many "real-world" problems as stacks and queues, one very useful application of deques is as an underlying data structure on top of which others (notably, stacks and queues) can be built. For example, the C++ Standard Library contains a std::deque class template for exactly this reason.
Implementing a deque using a linked list
Because a deque requires us to be able to add and remove elements at either end of the list, we'll need a doubly-linked list with head and tail to do so efficiently, because that's the only linked list variant we've seen that allows us to remove from either end in Θ(1) time.
Implementing a deque using an array or std::vector
Like queues, deques can't be effectively implemented using std::vector as it was intended to be used, because one end of the deque or the other would be required to be in cell 0 at all times.
However, we could do something very similar to the circular array implementation of a queue. The only difference is that both f and b would be able to move in either direction, because both enqueuing and dequeing are possible at both ends of the queue. Otherwise, the implementation would be very similar to the one we saw previously.