Simulating complex real-world systems is now possible with SimPy , an open source simulation package. SimPy, originally developed by the authors of this article, has been developed to production quality by a small team of enthusiastic open sourcerers around the world. As far as we know, it is the only existing discrete event Python simulation package. Actually, it is one of a very small number of fully object oriented simulation systems. This article is written with the explicit goal of whetting the appetite of simulation newbies to play with this powerful problem solving technique and to get even more users and developers for SimPy.
Why consider simulation? Simulation is a powerful tool for studying communication and computer systems among many other application areas. It is used in studying systems where events--such as the arrival of a message--can occur at random and where resources are limited giving the possibility of congestion and queues.
To demonstrate using SimPy for simulation we will study a simple scenario. The hypothetical SimPyCo software firm has a call center with a helpdesk. The call center has a PABX with a limited number of lines and the helpdesk itself has staff to talk to customers. The runaway sucess of their YpMis product has overwhelmed the help desk resources. Many customers are unhappy because it takes so long to get through to a helper. SimPyCo management must decide on measures to increase the helpdesk capacity. Should they increase the number of staff, the number of lines, both? They decide to use simulation to estimate how changing the numbers of lines or staff would change customer delays. We will use a constant call rate for this example, but the rate will vary in the real world, with busy and quiet periods.
When customers call they may have to wait for a free line. When a line is obtained, they hear the phone ringing and then may have to wait again for a staff member to answer, listening to "comforting" music. When the call is answered they explain their problem and get some sort of result. This also takes time. We model this last period as a fixed known time but in a real simulation it would be uncertain, modelled as a random variable. When the call finishes, the staff member and the line both become free for the next call.
To model "quality of service", we label a customer "happy" or "unhappy" depending on wait time. We will not attempt to model the the quality of the advice he receives. We will record the number of happy and unhappy customers along with their delays.
Setting up a simulation requires modeling the system under study,
capturing the essential entities and their interactions, and then
writing code to make the model executable. The more descriptive a
simulation language is, the easier this task can be
accomplished. SimPy builds on the expressive power of Python, with its
wonderful OO components. Python, however, lacked the capability to
simulate parallel processes efficiently and platform-independently.
Threads are just too heavy and cannot be used to implement the large
number of independent entities in complex systems. Luckily Python 2.2
introduced generators with which one can construct lightweight
coprocesses, objects which can execute in quasi-parallel. (See the
Charming Python article, Iterators
and simple generators.) SimPy uses generators to implement entity
processes. Generators are indicated by the
Simulated entities in SimPy are objects of the
class. Each contains a method which is a generator and controls its
actions. For example, an entity might be a message, with a method
that describes its dynamic behaviour, its arrival, its waiting for
service by nodes of a computer network, and its final departure.
To cause a process to suffer a delay between events, such as the time taken by a computer node to serve a message, the message method would contain a command such as:
t is the time taken for it to be served. Of
course, this can be randomly generated. An example is given later.
Entities do not always control their own processing time since they also suffer delays in waiting caused by congestion and queues. Congestion occurs when several messages need to use the same link or the same computers.
SimPy handles this by modeling resources, that is, facilities that can handle only a few entities at a time; for example, a communication line that can only handle one message at a time. Entities that arrive when the resource is busy must wait until a unit is available. Like us, they wait in a queue and, usually, the first one in the queue is served when a unit becomes free. As in real life, different entities might have different priorities and might even preempt a previous entity. To request a unit of a resource such as a line the message method would contain a command such as:
It would then either be assigned a line immediately or wait until one is available. This queueing process is handled automatically. Once the message has acquired the line, it holds it until it is ready to release it using:
and the line becomes available for use by another waiting message.
This type of simulation can apply to various situations:
Figure 1 -- a diagram of the entities in our helpdesk model
Figure 1 shows the entities in SimPyCo's real world: customers, the helpdesk, the PABX, and the staff. This can be modeled by an active customer entity, an active customer arrival process, a shared resource "PABX" with multiple lines for which customers may have to queue, and another shared resource, "Staff". The load on this system is the arrival rate, and the service rate depends on the customers' service (or time) needs, the number of PABX lines, and the number of staff. The observable system output to be established is the percentage of unhappy customers (depending on total waiting time and customers' tolerance level).
Mapping model entities to SimPy entities is easy (see the bottom of
Fig. 1): customers and customer traffic are programmed as subclasses
Process class, and the shared resources PABX
and Staff as instances of SimPy's
In the program listing, line 1 imports the generators (from the
__future__ until Python 2.3 arrives) and the statement
starting online 2 imports the necessary SimPy simulation
routines. (Usually we import
* rather than this long
list.) The following line imports the standard Python random number
routines, needed for random arrivals.
1 from __future__ import generators 2 from SimPy.Simulation import Process,Resource, \ 3 now,initialize,activate,simulate, \ 4 hold,request,release 5 from random import Random,expovariate 6
Our active elements are the
established as a SimPy
Process starting in line 7. We set
up a number of Class attributes to count the served customers and
those that are unhappy. The
Process must have an
__init__ method where we link the object to the SimPy
scheduling system (line 15). Each is given a unique identifying name
7 class Customer(Process): 8 """Represents customer requiring service""" 9 customers=0 10 customersServed=0 11 customersHappy=0 12 toomuch=0.25 13 14 def __init__(self): 15 Process.__init__(self) 16 Customer.customers += 1 17 self.id="Customer " + str(Customer.customers) 18 19 def happy(self,waitTime): 20 """Is customer happy after service?""" 21 if waitTime < Customer.toomuch: 22 return "happy" 23 else: 24 return "unhappy" 25 26 def servTime(self): 27 return 0.1 # hr 28
serviceNeed method describes the activities making
up the life of the
Customer (line 29). You can see that
this is a Python generator by the presence of the
statements. Each of these may cause a simulated delay to the
customer. There may or may not be a delay, depending on the state of
the system at the (simulated) time of the call. Note that whenever we
say "time" we mean simulation time, not real time.
Line 30 stores the current time--the arrival time--in the local
arrive, so that waiting time can be calculated
29 def serviceNeed(self): 30 arrive=now() 31 yield request,self,phone # get a helpdesk line 32 yield request,self,staff # get a staff member 33 atlast=now() 34 yield hold,self,self.servTime() # get service 35 yield release,self,staff # customer done; free staff member 36 yield release,self,phone # hang up line 37 Customer.customersServed += 1 38 waited=atlast-arrive 39 if self.happy(waited)=="happy" : Customer.customersHappy += 1 40 print "%s %s leaves. Total wait time=%s. Customer %s"\ 41 %(now(),self.id,waited,self.happy(waited)) 42
The customer requests a phone line in the first
statement (line 31). If one is available the next action is executed
without delay. Otherwise the customer is automatically queued with
any others waiting for a line. Here we assume a first-come
first-served queue. Next, the customer requests a staff member (line
32). He is queued again if none are available, though he still holds
the line. The interaction begins after the customer connects to a
staffer. This is modeled as a simple delay (line 34). Before talking
starts we record the current time in variable
When finished with the staff member, the customer then
releases and hangs up in the two
release statements in lines 35 and 36. The released staff
member can then pick up the customer waiting at the head of the
queue. Similarly, the released phone line is immediately available for
The next few lines are housekeeping. We calculate the delay experienced (line 38), determine whether the customer is "happy" or not, incrementing the count of "happy" customers and and print out a trace line. (In practice this trace would be used only in debugging or to feed a statistical program for further analysis.) Then the customer leaves the system and the corresponding object vanishes.
Where do customers come from? They must arrive at random. Experienced simulators use a trick to simulate random events. The time between random arrivals is uncertain with an exponential probability distribution. All we need to do is to generate a sample from that distribution and to allow the Process to hold for that time. Each time we go round the loop a different number is generated and a different delay occurs. A statistician testing the resulting sequence of arrival times would be unable to detect any difference from pure randomness. We use that trick here.
An object of class
Traffic generates a customer
sequence (line 43). This is another SimPy
Process and the
active generator method is called
generate (on line
43 class Traffic(Process): 44 """Generates Customer arrivals""" 45 Rate = 40.0 # (per hour) arrival rate 46 rv = Random() # a random variable 47 48 def __init__(self): 49 Process.__init__(self) 50 51 def generate(self): 52 while True: 53 cust=Customer() # new customer call 54 activate(cust,cust.serviceNeed()) 55 timeToNext = Traffic.rv.expovariate(Traffic.Rate) 56 yield hold,self,timeToNext 57
Rate of arrivals is fixed on line 45. In a
practical example this would be obtained from real data or from
estimates and would most likely be read in. It may be a function of
time of day as well. Line 46 establishes
rv, a random
We create a sequence of customers in an infinite loop. A customer
is created in line 53 and the
activated in line 54. That customer acts independently from this point
on. The time to the next arrival is generated by a sample from the
correct distribution (line 55) and the
simulates the delay. The required random number routines were imported
from the standard Python module in line 5.
Finally we get to the main script that runs the simulation. Line 58
initializes the simulation system. Among other tasks this sets up the
event scheduling system which will run the simulation clock and make
sure events occur in the correct sequence. The next two lines
establish the two Resources, the three
phone lines, and
staff. We assume that calls have no priority
scheme and that queues have a normal first-come, first-served
Line 61 establishes the
traffic object which will
generate our randomly arriving customers. The next line activates
No simulation starts before line 63. The simulation will now run until time 6*24, that is, 24 hours a day for 6 days, at which point it will stop, and the next statement, printing out the results, is executed.
58 initialize() 59 phone = Resource(capacity=3) 60 staff = Resource(capacity=3) 61 traffic = Traffic() 62 activate(traffic, traffic.generate()) 63 simulate(until=6*24) # hours 64 print "Total arrivals: %s / Total served: %s / Total unhappy: %s"\ 65 %(Customer.customers,Customer.customersServed,\ 66 Customer.customersServed - Customer.customersHappy)
So what do we do with this model? The original request was to discover the effect of changing the number of lines or the number of staff on customer happiness. To do this we must vary these numbers through a series of runs. We have to be careful about these experiments. Although using random numbers to generate the arrivals makes the model more realistic (randomness usually leads to queues), it means that a single run cannot be believed by itself. You have to make a series of replications to get a better idea of the result. Of course we can easily extend the displayed program to carry out series of runs, though such extensions are not shown here.
The SimPyCo helpdesk currently has a PABX with 3 incoming lines and 3 staff present. The helpdesk hours are Monday through Saturday, 24 hours a day. Plugging these figures into the simulation program shows a very high percentage of unhappy customers (99%) over the six day week.
To find out how many staff and lines are needed, we now run the simulation for 4 staff/4 lines and 5 staff/5 lines. The results are 86% and 0% unhappy customers, respectively. We repeat the run and now get 78% and 0%. So, as in the real world, the simulation shows variation from week to week, due to the randomness in the arrival of customer calls. The more often one runs the simulation, the better one can estimate the mean levels of unhappy customers.
This program uses the Open Source VPython plotting package (http://vpython.org). Interfacing to VPython and other packages will be explained in a future article on SimPy. If you can't wait for that, download VPython and play with the program, it is simple enough. Enjoy!
There is another real world issue we can address with the simulation: does the level of unhappy customers change over the work week? To analyse the time behavior, we extended the program with graphical output.
We ran the program for the same 3/3, 4/4 and 5/5 options, for 5 simulated weeks. The graphs produced clearly show that in all 5 weeks,
the helpdesk with 3 lines/3 staff is rapidly saturated and stays saturated for the whole week,
Figure 2 -- three lines and three staff members over five weeks
with 4 lines/4 staff, it shows great variability over time in delay times suffered by customers, with significant differences between weeks-- a highly unstable, unmanageable, unpredictable configuration; and
Figure 3 -- four lines and four staff members over five weeks
with 5 lines/5 staff, only in one week out of the five simulated is there a period with any unhappy customer.
Figure 4 -- five lines and five staff members over five weeks
The conclusion for SimPyCo management is that they must increase both the number of lines and the number of manned desks to at least 5 to cope with the current customer arrival rate of 40 calls per hour.
With its constructs
Monitor (which we have not time to explain here),
SimPy is a powerful but easy to use LEGO-like toolbox for describing
and simulating many classes of complex dynamic systems, whether or not
they yet exist. This power is greatly enhanced by the object
orientation of Python, allowing easy extension by subclassing and
inheriting. This way, generic models of frequently occurring
situations ("the production line model", "the service counter model",
"the network model", etc.) can be built and then used with little
additional code for specific systems. Python gives further power to
the SimPy user by being so richly endowed with interfaces to a wide
variety of packages (many of which are open source), such as graphics,
data bases, spreadsheets, and statistics.
We hope this article gave you a taste of the power and ease of use of SimPy. There are many facets of SimPy we have not touched on. how is SimPy using generators? How does one build OO models? How does one collect and analyze SimPy run data efficiently and elegantly? We will cover some of these in a future article. In the meantime, why not download SimPy, run its tutorial and test programs, and build your own simulation model? It is fun and might be profitable!
SimPy site: Download SimPy from here, including:
Klaus Müller is an Open Source software developer and a freelance consultant on military information systems.
Tony Vignaux does some teaching in simulation and continues his research interests.
Return to Python DevCenter.
Copyright © 2009 O'Reilly Media, Inc.