ICS 65 Fall 2011, Final Project: I Don't Want to Wait

Due date and time: Friday, December 2, 11:59pm

Introduction

All around Orange County, "hand car wash" establishments have been sprouting up in recent years. Customers pay upward of $15 to get their cars washed, waxed, vacuumed, and buffed for them. Before the advent of hand car washes, though, there were self-service car wash centers. (There are still a few of these around, but they're harder to find than they used to be.) At a self-service car wash, you drive into a covered bay filled with the necessities for washing your car yourself. After feeding some money into a machine, you're given a certain amount of time to finish washing your car, then you go on your merry way with a clean car and a sense of accomplishment.

Suppose you've decided, nostalgically, to open a self-service car wash. You've acquired the land already, and it's time to design the establishment. Since you intend to keep your car wash for a long time, you'd like it to serve as many customers as possible, even if the design that achieves this costs more to build. Sooner or later, you figure, the profit made from the car wash will outweigh the building costs, and the higher the profits, the sooner the break-even point will arrive.

According to your architectural plans, you have enough space on your land for as many as eight bays, though this wouldn't leave you with any parking for people waiting to wash their car. For each bay you leave out, you'll be able to add two parking spots for cars waiting to be washed. Unable to decide what the optimal number of bays and parking spots will be, you decide to put your programming skills to work by writing a simulation. When combined with market research, the simulation will allow you to evaluate possible designs for your car wash.

Overview of the simulation

Customers will arrive at the car wash (in their cars, of course) with one of three possible jobs in mind:

Washing their cars only.
Washing and vacuuming their cars.
Washing, vacuuming, and waxing their cars.

Not everyone will wash, vacuum, and/or wax their car at precisely the same rate; some people work more quickly than others.

Upon arrival, if there is an empty bay available, the customer will pull into that bay and perform the desired job immediately. If all of the bays are full, but there is at least one available parking spot, the customer will pull into the parking spot and wait for an empty bay; when a bay subsequently becomes available, the customer will pull into the bay and do the desired job. The customer that has been in a parking spot the longest will always be the first to claim an empty bay. (We'll assume the customers are all relatively civilized people.) If all of the bays are full and all of the parking spots are full, the customer will leave without doing anything. Naturally, it's this eventuality that you're interested in avoiding, or at least minimizing.

In addition to tracking this activity as it happens, the simulation should count the number of customers who successfully complete each kind of job, as well as the number of customers who leave without doing anything.

Configuring your simulation

Since you're not yet sure about many of the details (such as the rate at which customers will approach the car wash and the proportion of them that will want to perform each job), your simulation should take a variety of parameters, so that you can try different combinations. The easiest way to take them is to ask the user to type them into a console-mode user interface. The user interface need not be very complex or very robust, since the program is intended, in theory, for your own use as the car wash owner. It does, however, need to be descriptive enough for us to figure it out when we're grading the project.

The simulation will run at a granularity of one second, meaning that all times are measured in whole numbers of seconds. In every case below, n ± m means a uniform random distribution of values between n - m and n + m, inclusive. So, for example, 30 ± 10 means a uniform random distribution of values between 20 and 40, inclusive. In every case, m ≤ n, so the lower bound n - m is never negative.
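
As a minimal sketch of the n ± m notation, the helper below draws a whole number of seconds uniformly from the inclusive range. The name randomDuration and the use of the C++11 <random> header are assumptions made for illustration; the handout does not prescribe how random numbers are generated.

```cpp
#include <cassert>
#include <random>

// Hypothetical helper (not part of the assignment text): returns a whole
// number of seconds drawn uniformly from [n - m, n + m], inclusive.
int randomDuration(int n, int m, std::mt19937& rng)
{
    assert(0 <= m && m <= n);   // the handout states that m <= n
    std::uniform_int_distribution<int> dist(n - m, n + m);
    return dist(rng);
}
```

With a generator such as std::mt19937 rng(seed), a call like randomDuration(30, 10, rng) yields some value from 20 to 40. Seeding with a fixed value makes a run repeatable, which can help while debugging.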

There are 1 ≤ n ≤ 8 bays and 2 * (8 - n) parking spots.
The simulation will run for n seconds total.
Customers will arrive every n ± m seconds.
Washing a car will take n ± m seconds.
Vacuuming a car will take n ± m seconds.
Waxing a car will take n ± m seconds.
n% of the customers will want to wash only, m% of the customers will want to wash and vacuum, and (100 - n - m)% will want to wash, vacuum, and wax.

So, your simulation will need to ask the user to specify all the parameters above. (It should be noted that, even though I've used the variables n and m repeatedly, each rule has its own value for n and m.) For example, you might enter these parameters:

There are 6 bays and 4 parking spots.
The simulation will run for 57,600 seconds.
Customers will arrive every 40 ± 20 seconds.
Washing a car will take 240 ± 60 seconds.
Vacuuming a car will take 180 ± 30 seconds.
Waxing a car will take 300 ± 90 seconds.
30% of the customers will want to wash only, 50% will want to wash and vacuum, and 20% will want to wash, vacuum, and wax.
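
One possible way to hold these parameters is sketched below. Every name in it (Range, Job, SimulationConfig, jobForRoll, exampleConfig) is hypothetical; the sketch only shows how the parking-spot rule and the job percentages could be expressed, using the example values listed above.

```cpp
#include <cassert>

// Hypothetical names throughout; the handout does not prescribe any of them.
struct Range { int n; int m; };   // represents n ± m seconds

enum class Job { Wash, WashVacuum, WashVacuumWax };

struct SimulationConfig
{
    int bays;                // 1 <= bays <= 8
    int lengthInSeconds;
    Range arrival, wash, vacuum, wax;
    int washOnlyPercent;     // n%
    int washVacuumPercent;   // m%; the remainder wash, vacuum, and wax

    // Each bay left out makes room for two parking spots.
    int parkingSpots() const { return 2 * (8 - bays); }

    // Maps a roll in [0, 99] onto a job according to the percentages.
    Job jobForRoll(int roll) const
    {
        assert(0 <= roll && roll < 100);
        if (roll < washOnlyPercent) return Job::Wash;
        if (roll < washOnlyPercent + washVacuumPercent) return Job::WashVacuum;
        return Job::WashVacuumWax;
    }
};

// The example parameters from the list above.
SimulationConfig exampleConfig()
{
    return SimulationConfig{6, 57600, {40, 20}, {240, 60}, {180, 30}, {300, 90}, 30, 50};
}
```

Feeding jobForRoll a uniformly random integer from 0 to 99 selects each job with the configured probability.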

Event-based simulations

An obvious way to build a simulation like this is to center your implementation around a simulation loop. Each iteration of the loop corresponds to one clock tick (i.e., the minimum meaningful chunk of time in your simulation) and continues until the appropriate number of clock ticks have elapsed. In addition to being simple to implement, this is a nice approach if the simulation is dense with activity, where many events are taking place during every clock tick.

However, consider the example set of parameters above. The simulation will run for 57,600 simulated seconds, yet in very few of those seconds will anything actually happen. Nothing will happen for at least the first 20 seconds, and nothing else will happen for at least 20, and as many as 60, more. So looping over each second, when the majority of the iterations of the loop will do nothing, is tremendously wasteful. In a simulation that runs for s seconds and includes e events, we'd like it to run in much closer to O(e) time than O(s) time — in other words, we'd like the running time of our simulation to be determined by how densely packed with activity it is, rather than by the length of the simulation. Since e is much less than s in the case of our car wash simulation, we should prefer an implementation that does not involve a one-clock-tick-per-iteration simulation loop.

A better solution for sparse simulations like ours is to build an event-based simulation. Instead of looping over every second of simulation time, we'll loop over a sequence of events. This will allow us to seamlessly skip every second that doesn't contain any activity. In a very rough pseudocode form, our simulation loop will look something like this:

    set the current time to 0
    schedule the first customer arrival event

    while (the event schedule is not empty
           and the current time is less than the simulation length)
    {
        remove the next (earliest) event e from the event schedule
        set the current time to e's time
        
        if (the current time is less than the simulation length)
        {
            execute e
            schedule any subsequent events implied by e
        }
    }
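
The pseudocode above might translate into C++ roughly as follows. This is a sketch under stated assumptions: the names Schedule and runSimulation are invented here, events are represented as plain callable actions, and a std::multimap stands in for the event schedule purely to keep the example short. The project itself requires your own priority queue.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <vector>

// Stand-in schedule for illustration only: a std::multimap keeps its entries
// ordered by time. The project itself requires your own PriorityQueue.
typedef std::multimap<int, std::function<void(int)> > Schedule;

// Runs events in time order until the schedule is empty or the simulation
// length is reached. Returns the times at which events were executed.
std::vector<int> runSimulation(Schedule& schedule, int simulationLength)
{
    assert(simulationLength >= 0);
    std::vector<int> executedTimes;
    int currentTime = 0;

    while (!schedule.empty() && currentTime < simulationLength)
    {
        // Remove the earliest event and advance the clock to its time.
        Schedule::iterator next = schedule.begin();
        currentTime = next->first;
        std::function<void(int)> action = next->second;
        schedule.erase(next);

        if (currentTime < simulationLength)
        {
            action(currentTime);   // may add further events to the schedule
            executedTimes.push_back(currentTime);
        }
    }
    return executedTimes;
}
```

Note that the clock jumps directly from one event time to the next, so the number of loop iterations depends on the number of events rather than on the simulation length.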

So, the first question is what data structure we should use to store the event schedule. Initially, it may seem like a queue is a good choice. However, on closer inspection, we find that we need a slightly different approach. Consider this example, based on the set of example parameters in the previous section:

When it comes time to schedule the first customer, we pick a random number between 20 and 60 inclusive (40 ± 20). Let's say we picked 37. So, we schedule a customer arrival event for time 37.
Now we enter the loop and handle the first event. The only event currently scheduled is a customer arrival at time 37. So, we set the current time to 37 and execute the event. Since there is an open bay, the customer proceeds directly into that bay, then begins working. Suppose that the customer wanted only to wash and, choosing a random number in the appropriate range, that customer will require 275 seconds to complete the job. So, we schedule a completion event at time 37 + 275 = 312. Meanwhile, after every customer arrival, we also schedule the next customer arrival, which will take place at, say, time 37 + 41 = 78.
The next event to take place is the second customer arrival at time 78. Again, that customer pulls into a bay and begins work; let's say that this customer wants to wash, vacuum, and wax, and that the randomly-selected total time for that job is 770 seconds. We schedule this customer's completion event at time 78 + 770 = 848. Also, we schedule the next customer arrival, which will occur at time 78 + 25 = 103.
Now we need to select the next event from the schedule, which is actually the last one we added, another customer arrival at time 103, even though there are two other events, a completion event at time 312 and another at time 848, also scheduled.

The simulation proceeds in much the same fashion. What should be obvious from the example above is that we need a data structure that allows us to schedule events in any order, yet have them emerge from the schedule in the appropriate order (i.e., in ascending order in terms of time). This sounds like a job for a priority queue, where the events are the items, the scheduled times are the priorities, and the item with the lowest priority value (the earliest scheduled time) is considered the most important. The requirements for your priority queue implementation are outlined in the next section.
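
To see the ordering behavior in isolation, the sketch below inserts scheduled times in arbitrary order and removes them in ascending order. It uses std::priority_queue from the Standard C++ Library strictly as a stand-in for illustration; the function name drainInTimeOrder is hypothetical, and the project requires a priority queue you write yourself.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Illustration only: std::priority_queue stands in for the PriorityQueue
// template you are required to write yourself for the project.
std::vector<int> drainInTimeOrder(const std::vector<int>& scheduledTimes)
{
    // std::greater makes the smallest time the top (most important) item.
    std::priority_queue<int, std::vector<int>, std::greater<int> > schedule(
        scheduledTimes.begin(), scheduledTimes.end());

    std::vector<int> order;
    while (!schedule.empty())
    {
        int next = schedule.top();
        assert(order.empty() || order.back() <= next);   // time never goes backwards
        order.push_back(next);
        schedule.pop();
    }
    return order;
}
```

Given the three events pending at the end of the walkthrough above (two completions and the arrival at time 103, added last), the arrival at time 103 comes out first. A plain first-in, first-out queue would instead have produced the completion at time 312 first.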

The other thing you may have noticed from the example above is that you'll need to define a set of event types. For each event type, you'll need to decide what information needs to be stored and what actions should be taken when the event is executed. Notice how customer arrival events were designed in the previous example:

A customer arrival event, like all events, is scheduled for a particular time.
It contains a desired job for the new customer — wash, wash and vacuum, or wash, vacuum, and wax.
When executing a customer arrival event...
- ...we first see if there's an open bay. If so, the customer enters that bay and begins working. A completion event is scheduled for the completion time.
- ...if there is no open bay, we see if there's an open parking spot. If so, the customer parks and waits.
- ...if there is neither an open bay nor an open parking spot, the customer leaves.
- ...regardless of what happens with this customer, the next customer arrival is also scheduled.
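
The decision made when a customer arrives could be sketched as follows. The CarWash structure and ArrivalOutcome names are hypothetical, and scheduling the completion event and the next arrival is left to the caller so that the example stays focused on the bay and parking logic.

```cpp
#include <cassert>
#include <queue>

enum class ArrivalOutcome { EnteredBay, Parked, Left };

// Hypothetical state for the car wash; the names are illustrative only.
struct CarWash
{
    int totalBays;
    int totalParkingSpots;
    int occupiedBays;
    std::queue<int> waiting;   // arrival times of parked customers, oldest first

    CarWash(int bays, int parkingSpots)
        : totalBays(bays), totalParkingSpots(parkingSpots), occupiedBays(0)
    {
        assert(bays >= 1 && parkingSpots >= 0);
    }

    // Decides what happens to a customer arriving at currentTime. The caller
    // schedules the completion event (if a bay was entered) and, in every
    // case, the next customer arrival.
    ArrivalOutcome arrive(int currentTime)
    {
        if (occupiedBays < totalBays)
        {
            ++occupiedBays;
            return ArrivalOutcome::EnteredBay;
        }
        if (static_cast<int>(waiting.size()) < totalParkingSpots)
        {
            waiting.push(currentTime);
            return ArrivalOutcome::Parked;
        }
        return ArrivalOutcome::Left;
    }
};
```

Because the waiting customers are kept in a first-in, first-out queue, the customer who has been parked the longest is always at the front when a bay opens up.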

You'll need to decide what types of events you'll need, what parameters they'll carry with them, and what it will mean to execute them.

Implementing your priority queue as a template class

You are required to implement your priority queue as a template class. At minimum, your template class should take one type parameter, specifying the type for the items. You may assume in your template class that the specified type has a < operator and an == operator defined for it, though you should be sure to document this assumption, as well as any others you're making, about the type parameter.

Priority queues can be implemented in a straightforward way, just as queues can, though the result is much less efficient than a similarly-implemented queue. If you store the items in a sequential data structure, such as an array, vector, or linked list, at least one of the two operations will take O(n) time: keeping the items in order of arrival makes enqueue cheap but forces dequeue to search for the most important item, while keeping them sorted by priority makes dequeue cheap but forces enqueue to find the right position for the new item.

Assuming that you've taken a data structures course (e.g., ICS 23) in the past, you'll recall that there is a better technique for implementing a priority queue, which is to build it as a binary heap. By using a binary heap, you can implement O(log n) enqueues and dequeues, which is a significant improvement, especially when there will be a large number of items in the priority queue.

In the case of this project, there may not be a large number of items stored in the priority queue. However, if you're going to go to the trouble of building a template class, you'd like for it to be usable not only with many different types of items, but also in many different contexts. An implementation with O(log n) versions of enqueue and dequeue will be useful in many contexts in which an O(n) implementation simply won't be good enough. So you're required to implement your priority queue as a binary heap. Remember that a straightforward way to implement a binary heap is to store the items in an array (or, better yet, a vector, since you can't be sure ahead of time how many elements will be stored in the priority queue), where index 1 stores the root, and for each index i, index 2i stores the left child of i, index 2i + 1 stores the right child, and index floor(i / 2) stores the parent. Other details, including O(log n) algorithms for enqueue and dequeue, can be found in any introductory data structures text or in many places online.
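
The index arithmetic described above can be written down directly. The sketch below assumes a vector whose slot 0 is left unused so that the root sits at index 1; the function names are hypothetical, and the enqueue and dequeue algorithms themselves are deliberately not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// 1-based heap indexing as described above; slot 0 of the vector is unused.
inline std::size_t leftChildOf(std::size_t i) { return 2 * i; }
inline std::size_t rightChildOf(std::size_t i) { return 2 * i + 1; }
inline std::size_t parentOf(std::size_t i)
{
    assert(i > 1);   // the root (index 1) has no parent
    return i / 2;    // integer division gives floor(i / 2)
}

// Checks the min-heap property: no item is smaller than its parent.
// Assumes T provides operator<, as the handout permits.
template <typename T>
bool isMinHeap(const std::vector<T>& slots)
{
    for (std::size_t i = 2; i < slots.size(); ++i)
    {
        if (slots[i] < slots[parentOf(i)])
        {
            return false;
        }
    }
    return true;
}
```

A check like isMinHeap is not needed in the finished program, but calling it after each enqueue and dequeue while testing can catch mistakes in the heap algorithms early.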

Your priority queue template class may be hard-coded as a binary min heap, meaning that the minimum-priority item will always be considered the most important, and will thus be stored at the root. If you prefer, there are ways to generalize this behavior, so that your priority queue may consider either the highest or the lowest priority to be the most important, depending on a template parameter. I'd be happy to discuss these techniques with you if you'd like to explore them.
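
One common way to generalize this, sketched here as an assumption rather than as the required technique, is to add a second template parameter naming a comparison function object, with std::less as the default so that the smallest item is the most important. The class name PriorityPolicy and its member comesFirst are hypothetical and exist only to show the idea in isolation.

```cpp
#include <functional>

// Sketch of one way to generalize: a second template parameter decides which
// of two items is more important. Names here are hypothetical.
template <typename T, typename MoreImportant = std::less<T> >
class PriorityPolicy
{
public:
    // Returns true if a should come out of the queue before b.
    bool comesFirst(const T& a, const T& b) const
    {
        return moreImportant(a, b);
    }

private:
    MoreImportant moreImportant;
};
```

With the default, PriorityPolicy<int> treats 3 as more important than 7; PriorityPolicy<int, std::greater<int> > reverses that. A priority queue template could take the same extra parameter and use it wherever two items are compared.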

Using polymorphism to your advantage

I highly suggest the following design sketch for your simulation.

Create an Event class. Include within it a member variable for the event's scheduled time, and overload comparison operators such as < and ==, so that your priority queue can compare two events using them.
Create an abstract EventHandler class. Include a pure virtual execute( ) method within it.
Derive a class from EventHandler for each kind of event supported by your simulation. Implement the execute( ) method as appropriate.
Each Event object should contain an EventHandler*, pointing to a handler of the appropriate type. So, for example, a customer arrival event will be contained in an Event object with the event handler pointer pointing to an event handler that knows how to handle customer arrivals.
In your priority queue, store Event objects.

If you follow this approach, your simulation loop will be very simple, needing only to dequeue an event, then call execute( ) on it (which calls execute( ), polymorphically, on its EventHandler).
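
A bare-bones version of this design might look like the sketch below. The member names, the currentTime parameter passed to execute, and the ArrivalHandler class that merely records a message are all assumptions made to keep the example small; a real handler would update the simulation state and schedule further events.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch of the suggested design; member names are assumptions.
class EventHandler
{
public:
    virtual ~EventHandler() {}
    virtual void execute(int currentTime) = 0;
};

class Event
{
public:
    Event(int time, EventHandler* handler) : time_(time), handler_(handler) {}

    int time() const { return time_; }

    // Forwards to the handler, which is chosen polymorphically at run time.
    void execute() { handler_->execute(time_); }

    // Events are ordered by their scheduled time.
    bool operator<(const Event& other) const { return time_ < other.time_; }
    bool operator==(const Event& other) const { return time_ == other.time_; }

private:
    int time_;
    EventHandler* handler_;   // not owned here; ownership is a design decision
};

// One concrete handler; it only records a message so the sketch stays small.
class ArrivalHandler : public EventHandler
{
public:
    explicit ArrivalHandler(std::vector<std::string>& log) : log_(log) {}

    virtual void execute(int currentTime)
    {
        log_.push_back("arrival at " + std::to_string(currentTime));
    }

private:
    std::vector<std::string>& log_;
};
```

Who creates and destroys the handler objects is left open here; it is one of the design decisions you will need to make.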

Event and EventHandler need to be separated because we want to use operators like < and == to compare objects within the priority queue, but we also want the priority queue to store events that can be handled polymorphically, with different kinds of events being handled differently depending on their run-time type. One way to attempt this would be to store Event*'s in the priority queue, then have a class derived from Event for each kind of event; the problem with this approach is that the comparison operators on pointers compare the addresses stored in the pointers, not the objects they point to!
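
The pointer-comparison pitfall is easy to demonstrate. In the sketch below, TimedEvent is a hypothetical stand-in for an event class, and the two helper functions differ only in whether they compare the pointers or the objects behind them.

```cpp
// Minimal stand-in for an event: only the scheduled time matters here.
struct TimedEvent
{
    int time;
    bool operator<(const TimedEvent& other) const { return time < other.time; }
};

// Compares the addresses held by the pointers, ignoring the scheduled times.
inline bool pointerLess(const TimedEvent* a, const TimedEvent* b) { return a < b; }

// Compares the pointed-to objects using the overloaded operator.
inline bool objectLess(const TimedEvent* a, const TimedEvent* b) { return *a < *b; }
```

If two events are stored in one array with the later-scheduled event first, pointerLess reports the first element as "less" because its address is lower, while objectLess correctly reports that it is not. A priority queue of raw pointers would therefore order events by where they happen to live in memory.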

The design I suggest above provides you with the best of both worlds: Event objects that can be compared with overloaded operators, but polymorphic event handlers that can, at run-time, automatically handle the appropriate type of event in the appropriate way.

Output of your simulation

Your simulation program should generate the following output:

Each time something happens in the simulation (e.g., a car moves into a bay, a car moves into a parking spot), print a message to cout specifying what the current time is and what happened.
At the conclusion of your simulation, show the following overall statistics:
- The total number of customers who arrived at the car wash, including those who left without doing anything.
- The number of customers who completed each kind of job (wash; wash and vacuum; wash, vacuum, and wax).
- The number of customers still in bays when the simulation ended.
- The number of customers still in parking spots when the simulation ended.
- The number of customers who left without doing anything because all bays and parking spots were full when they arrived.
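
The overall statistics could be gathered in a small structure like the one sketched below. The names and the exact wording of the output lines are assumptions, not a required format. The consistency check reflects the fact that every customer who arrived should be accounted for exactly once: finished, still in a bay, still parked, or turned away.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical counters matching the statistics listed above.
struct Statistics
{
    int arrived;
    int washOnly;
    int washVacuum;
    int washVacuumWax;
    int stillInBays;
    int stillParked;
    int leftWithoutService;

    Statistics()
        : arrived(0), washOnly(0), washVacuum(0), washVacuumWax(0),
          stillInBays(0), stillParked(0), leftWithoutService(0) {}

    // Every arrival should end up in exactly one of the other categories.
    bool isConsistent() const
    {
        return arrived == washOnly + washVacuum + washVacuumWax
                        + stillInBays + stillParked + leftWithoutService;
    }
};

// Builds the end-of-simulation summary as text, ready to be sent to cout.
std::string formatSummary(const Statistics& s)
{
    assert(s.isConsistent());
    std::ostringstream out;
    out << "Customers arrived: " << s.arrived << '\n'
        << "Completed wash: " << s.washOnly << '\n'
        << "Completed wash and vacuum: " << s.washVacuum << '\n'
        << "Completed wash, vacuum, and wax: " << s.washVacuumWax << '\n'
        << "Still in bays: " << s.stillInBays << '\n'
        << "Still in parking spots: " << s.stillParked << '\n'
        << "Left without doing anything: " << s.leftWithoutService << '\n';
    return out.str();
}
```

At the end of the run, the result of formatSummary can simply be written to cout.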

Deliverables

You must submit all of your source and header files (.cpp and .h). Please do not submit a project file or other files generated by your development environment.

Limitations

You must build your own PriorityQueue template class — as opposed to the priority_queue adapter template from the Standard C++ Library — though you may use vector as the underlying data structure, if you'd like.