SIMULATION AND MODELLING

- to simulate a system, we must model its components.
- need to model "entities" in the system, relationships between entities,
  and "events" (state changes of the system).
- Simulation types:

      discrete                        vs   continuous
      static (time is not an issue)   vs   dynamic
      deterministic                   vs   stochastic (probabilistic, random)

- examples: Categorize the following:
    - "randomized" or "Monte-Carlo" integration
    - Simulating the trajectory of a projectile
    - Random walks on a grid
- for us: dynamic, probabilistic, discrete simulations
- we must:
    - keep track of simulation time
    - generate and execute various events as time progresses.
- two major design philosophies in simulation:
    - time-driven
    - event-driven

TIME-DRIVEN SIMULATION

- a variable holds the current time
- increment the variable by a fixed amount at every "timestep" of the
  simulation.

      [figure]

- after each step, check all possible event types
- handle all events that occur during the step
- Example: random walk on a two-dimensional grid.
    - time-driven simulation is ideal because we know that an event
      (movement) happens at every step.
- Stopping condition?  Several possibilities:
    - after a certain time (answers questions like "how far can we get
      in a given time?")
    - when a specific state is reached, e.g., position, or distance from
      origin (answers questions like "how long would it take to get as
      far as this?")

      initialize( state )
      time := startTime
      while ( time < endTime )        // or state != finalState
          collect statistics from the current state
          handle all events that occurred in the interval
              [time, time+timeStep]
          time := time + timeStep
      end while

- if events can occur at *any* time (rather than just at discrete times),
  we have a problem
    - a typical example of this is simulating a line of customers at a
      bank, cars at a gas station, packets in a network, etc.
    - if the time step is too small, the simulation can take too long
      (and nothing happens during most time steps)
    - if the time step is too big, we have lots of events to deal with
      at each step.

  Solution:

EVENT-DRIVEN SIMULATION

- here, we have a list of all events that occur at various times.
- main loop simply "jumps" to the time of the first event and handles it,
  then jumps to the second one, etc.

      [figure]

- can we simply generate *all* the events for the entire simulation at
  the beginning and then start processing?
    - no.  Some events *cause* others, so we can't know all events, or
      their times, without *doing* the actual simulation (e.g., departure
      times)
    - solution: all events except the initial one(s) are scheduled by
      previous events.
    - we need a dynamically changing list, ordered by execution time
      -> "priority queue"
- stopping criteria:
    - upon reaching a specific time or state
- pseudo-events for termination, taking statistics
    - e.g., take statistics every hour
- pseudo-code:

      initialize( system state );
      initialize( event-list );
      while ( simulation not finished )
          remove earliest event from event-list;
          set time = time of this event;
          handle event (including possibly scheduling new events);
      end while

- can even start the simulation with a pseudo-event that schedules the
  first "real" event.
- no explicit increment of time variable: time advances only when an
  event is removed from the list.

RANDOM NUMBER GENERATION (BASIC PROBABILITY THEORY)

- look at the readings (section on simulation) for lots more detail!
- "random variable" = "RV" = value associated with some random event.
    - e.g., rolling a die, random variable is value 1-6.
    - two dice, RV is the total, value 2-12
- "discrete" random variable X can only have one of finitely many
  possible values
    - there is a fixed probability p(x) that the variable will have a
      particular value x, for each x.
    - the probabilities do *not* have to be the same for different values
      of the random variable.
    - e.g., when throwing two dice, we expect a total of 6 to appear more
      frequently than a total of 2 (why?)
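The two-dice claim above can be checked by brute force. The following is a minimal sketch in C, assuming two fair, independent six-sided dice, so that all 36 (first die, second die) outcomes are equally likely. The function names `two_dice_counts` and `print_two_dice_distribution` are made up for this illustration.

```c
#include <stdio.h>

/* Count, for each total 2..12, how many of the 36 equally likely
   (die1, die2) outcomes produce it.  counts[] must have room for
   indices 0..12; indices 0 and 1 are left at zero. */
void two_dice_counts(int counts[13])
{
    for (int i = 0; i < 13; i++)
        counts[i] = 0;
    for (int d1 = 1; d1 <= 6; d1++)
        for (int d2 = 1; d2 <= 6; d2++)
            counts[d1 + d2]++;
}

/* Print the probability of each total as a fraction of 36. */
void print_two_dice_distribution(void)
{
    int counts[13];
    two_dice_counts(counts);
    for (int total = 2; total <= 12; total++)
        printf("p(%d) = %d/36\n", total, counts[total]);
}
```

A total of 6 can be made in five ways (1+5, 2+4, 3+3, 4+2, 5+1), while a total of 2 can only be made as 1+1, so p(6) = 5/36 and p(2) = 1/36.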
- "probability distribution": given RV X, a probability distribution is a
  function f_X that assigns probability f_X(x) to each value x
    - sum_{possible x} f_X(x) = 1
    - Some common distributions:
        - uniform (e.g., when throwing one die)
        - normal, approximately (e.g., the total when throwing several
          dice; two dice give a triangular shape, and the bell shape
          emerges as more dice are added)
- "continuous" random variable X can have one of infinitely many possible
  values (any real number in some range)
    - probability that it is exactly equal to one single value x is
      always 0, so we need to talk of the probability of the value lying
      in some interval.
    - want to define something similar to the probability distribution:
      this will be called a "probability density function" f_X (PDF for
      short).
    - f_X gives the probability that the value of X lies within a
      certain interval [a,b], as follows:

                               /b
          Pr[ a <= X <= b ]  = |  f_X(x) dx.
                               /a

    - the "sum to 1" constraint is:

           /oo
           |   f_X(x) dx  =  1.
           /-oo

    - common distributions
        - "uniform" distribution over an interval [a,b]:

                       { 1/(b-a)   if a <= x <= b,
              f_X(x) = {
                       { 0         otherwise.

            - represents a distribution where X is equally likely to fall
              in any two sub-intervals of [a,b] that have the same length.
        - The "exponential" distribution is given by
          f_X(t) = r e^{-rt} for t >= 0 (and 0 for t < 0), for some real
          number r > 0, and represents the times between consecutive
          events when they occur at a constant average rate r over time.
          (More on this, and on its relationship to "Poisson"
          distributions, next time.)

USING `RANDOM()' IN YOUR PROGRAMS

- The function call `random()' returns a pseudo-random long integer in
  the range 0 .. (RANDOM_MAX - 1).

  Why "pseudo-random"?  Well, computers behave in a totally predictable
  manner, so they don't really have any natural randomness to use.  They
  generate these pseudo-random numbers according to a well-defined
  deterministic computation.  What this means is that `random' keeps a
  certain amount of internal data (its "state") that it uses to determine
  the next number to return.  Every time that this data has a particular
  value, `random' will return the same number.
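The raw integers returned by `random()' usually need to be mapped to the range a simulation wants. The following is a minimal sketch in C, not the course's own code. It assumes the POSIX `random()' from <stdlib.h>, which returns values in 0 .. 2^31 - 1. The constant `RANDOM_SPAN` and the function names are made up for this illustration; `RANDOM_SPAN` stands in for the course's RANDOM_MAX, whose definition is not shown in these notes.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* asks glibc to declare random(); platform assumption */
#endif
#include <stdlib.h>

/* ASSUMPTION: random() yields values in 0 .. 2^31 - 1, as POSIX specifies.
   RANDOM_SPAN is a hypothetical name: the number of distinct raw values. */
#define RANDOM_SPAN 2147483648.0

/* Map a raw value from random() to a real number in [0, 1). */
double unit_from_raw(long raw)
{
    return (double)raw / RANDOM_SPAN;
}

/* Map a raw value from random() to a die face 1..6.
   Scaling (rather than raw % 6) avoids depending on the low-order bits. */
int die_from_raw(long raw)
{
    return 1 + (int)(6.0 * unit_from_raw(raw));
}

/* Convenience wrappers that draw a fresh raw value on each call. */
double random_unit(void) { return unit_from_raw(random()); }
int    roll_die(void)    { return die_from_raw(random()); }
```

Separating the mapping (`unit_from_raw`, `die_from_raw`) from the drawing (`random_unit`, `roll_die`) makes the mapping easy to check at the extremes: the smallest raw value gives 0.0 and face 1, and the largest gives a value just below 1.0 and face 6.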
- `srandom( unsigned )' is used to initialize the state of the data that
  `random' uses to produce these pseudo-random values.  This function
  should be called ONLY ONCE in the entire program, at the very beginning
  of the program.

  Whenever srandom is called with a particular "seed" as argument,
  `random' will generate a particular fixed sequence of numbers.  This
  means that if srandom is called with a fixed value, say `srandom( 25 )',
  then the program will behave in exactly the same way each time that it
  is run.

  To get a program to have more "random" behaviour, we need to use a seed
  that's a little bit different every time we run the program.  A
  standard trick is to use the value of the "system clock" for this:
  `srandom( time(0) )' will do the trick (the `time(...)' function is
  declared in the header <time.h>, which is *already* included in
  "random.h").

MORE PROBABILITY THEORY

- Recall the continuous probability density functions for the "uniform"
  and "exponential" distributions:

  The uniform distribution over an interval [a,b] is given by:

                   { 1/(b-a)   if a <= x <= b,
          f_X(x) = {
                   { 0         otherwise.

  It represents a distribution where X is equally likely to fall in any
  two sub-intervals of [a,b] that have the same length.

  The exponential distribution is given by

                   { r e^{-rt}   if t >= 0,
          f_X(t) = {
                   { 0           otherwise,

  for some real number r > 0, and represents the times between
  consecutive events when they occur at a constant average rate r over
  time, i.e., r is the average number of events that occur per unit time.

- Now, we want to be able to generate random numbers with these
  distributions, and in order to do this, we must talk about one more
  concept: the "cumulative distribution function" F_X(x), which
  represents the probability that the value of X is *less than or equal
  to* x, i.e.,

                                  /x
          F_X(x) = Pr[ X <= x ] = |   f_X(z) dz.
                                  /-oo
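
As a worked example, applying this definition to the two densities recalled above gives closed forms for their cumulative distribution functions. This is only the integral from the definition carried out for each case, written in LaTeX for readability; the symbols a, b, r, f_X and F_X are the ones used in the notes.

```latex
% Uniform distribution over [a,b]: the density is 1/(b-a) on [a,b], 0 elsewhere.
F_X(x) = \int_{-\infty}^{x} f_X(z)\,dz =
\begin{cases}
  0                 & \text{if } x < a, \\[4pt]
  \dfrac{x-a}{b-a}  & \text{if } a \le x \le b, \\[4pt]
  1                 & \text{if } x > b.
\end{cases}

% Exponential distribution with rate r > 0: the density is r e^{-rz} for z >= 0.
F_X(t) = \int_{0}^{t} r e^{-rz}\,dz
       = \left[ -e^{-rz} \right]_{0}^{t}
       = 1 - e^{-rt} \quad \text{for } t \ge 0,
\qquad
F_X(t) = 0 \quad \text{for } t < 0.
```

Both functions rise from 0 to 1 and never decrease, as any cumulative distribution function must. For instance, the uniform case gives F_X(b) = (b-a)/(b-a) = 1, and the exponential case tends to 1 as t grows.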