ONLamp.com    
 Published on ONLamp.com (http://www.onlamp.com/)
 See this if you're having trouble printing code examples


SimPy: Simulating Systems in Python

by Klaus Müller and Tony Vignaux
02/27/2003

Simulating complex real-world systems is now possible with SimPy , an open source simulation package. SimPy, originally developed by the authors of this article, has been developed to production quality by a small team of enthusiastic open sourcerers around the world. As far as we know, it is the only existing discrete event Python simulation package. Actually, it is one of a very small number of fully object oriented simulation systems. This article is written with the explicit goal of whetting the appetite of simulation newbies to play with this powerful problem solving technique and to get even more users and developers for SimPy.

Why consider simulation? Simulation is a powerful tool for studying communication and computer systems among many other application areas. It is used in studying systems where events--such as the arrival of a message--can occur at random and where resources are limited giving the possibility of congestion and queues.

To demonstrate using SimPy for simulation we will study a simple scenario. The hypothetical SimPyCo software firm has a call center with a helpdesk. The call center has a PABX with a limited number of lines and the helpdesk itself has staff to talk to customers. The runaway sucess of their YpMis product has overwhelmed the help desk resources. Many customers are unhappy because it takes so long to get through to a helper. SimPyCo management must decide on measures to increase the helpdesk capacity. Should they increase the number of staff, the number of lines, both? They decide to use simulation to estimate how changing the numbers of lines or staff would change customer delays. We will use a constant call rate for this example, but the rate will vary in the real world, with busy and quiet periods.

When customers call they may have to wait for a free line. When a line is obtained, they hear the phone ringing and then may have to wait again for a staff member to answer, listening to "comforting" music. When the call is answered they explain their problem and get some sort of result. This also takes time. We model this last period as a fixed known time but in a real simulation it would be uncertain, modelled as a random variable. When the call finishes, the staff member and the line both become free for the next call.

To model "quality of service", we label a customer "happy" or "unhappy" depending on wait time. We will not attempt to model the the quality of the advice he receives. We will record the number of happy and unhappy customers along with their delays.

Related Reading

Python Cookbook
By David Ascher, Matt Margolin, Alex Martelli

SimPy

Setting up a simulation requires modeling the system under study, capturing the essential entities and their interactions, and then writing code to make the model executable. The more descriptive a simulation language is, the easier this task can be accomplished. SimPy builds on the expressive power of Python, with its wonderful OO components. Python, however, lacked the capability to simulate parallel processes efficiently and platform-independently. Threads are just too heavy and cannot be used to implement the large number of independent entities in complex systems. Luckily Python 2.2 introduced generators with which one can construct lightweight coprocesses, objects which can execute in quasi-parallel. (See the Charming Python article, Iterators and simple generators.) SimPy uses generators to implement entity processes. Generators are indicated by the yield statements.

Simulated entities in SimPy are objects of the Process class. Each contains a method which is a generator and controls its actions. For example, an entity might be a message, with a method that describes its dynamic behaviour, its arrival, its waiting for service by nodes of a computer network, and its final departure.

To cause a process to suffer a delay between events, such as the time taken by a computer node to serve a message, the message method would contain a command such as:

yield hold,self,t

where t is the time taken for it to be served. Of course, this can be randomly generated. An example is given later.

Entities do not always control their own processing time since they also suffer delays in waiting caused by congestion and queues. Congestion occurs when several messages need to use the same link or the same computers.

SimPy handles this by modeling resources, that is, facilities that can handle only a few entities at a time; for example, a communication line that can only handle one message at a time. Entities that arrive when the resource is busy must wait until a unit is available. Like us, they wait in a queue and, usually, the first one in the queue is served when a unit becomes free. As in real life, different entities might have different priorities and might even preempt a previous entity. To request a unit of a resource such as a line the message method would contain a command such as:

yield request,self,line

It would then either be assigned a line immediately or wait until one is available. This queueing process is handled automatically. Once the message has acquired the line, it holds it until it is ready to release it using:

yield release,self,line

and the line becomes available for use by another waiting message.

This type of simulation can apply to various situations:

The Helpdesk Model


Figure 1 -- a diagram of the entities in our helpdesk model

Figure 1 shows the entities in SimPyCo's real world: customers, the helpdesk, the PABX, and the staff. This can be modeled by an active customer entity, an active customer arrival process, a shared resource "PABX" with multiple lines for which customers may have to queue, and another shared resource, "Staff". The load on this system is the arrival rate, and the service rate depends on the customers' service (or time) needs, the number of PABX lines, and the number of staff. The observable system output to be established is the percentage of unhappy customers (depending on total waiting time and customers' tolerance level).

Mapping model entities to SimPy entities is easy (see the bottom of Fig. 1): customers and customer traffic are programmed as subclasses of SimPy's Process class, and the shared resources PABX and Staff as instances of SimPy's Resource class.

In the program listing, line 1 imports the generators (from the __future__ until Python 2.3 arrives) and the statement starting online 2 imports the necessary SimPy simulation routines. (Usually we import * rather than this long list.) The following line imports the standard Python random number routines, needed for random arrivals.

  1 from __future__ import generators                
  2 from SimPy.Simulation import Process,Resource,  \
  3      now,initialize,activate,simulate,  \
  4      hold,request,release                        
  5 from random import Random,expovariate            
  6

Our active elements are the Customer class, established as a SimPy Process starting in line 7. We set up a number of Class attributes to count the served customers and those that are unhappy. The Process must have an __init__ method where we link the object to the SimPy scheduling system (line 15). Each is given a unique identifying name (line 17).

  7 class Customer(Process):                         
  8     """Represents customer requiring service""" 
  9     customers=0
 10     customersServed=0
 11     customersHappy=0
 12     toomuch=0.25
 13 
 14     def __init__(self):                          
 15         Process.__init__(self)                   
 16         Customer.customers += 1                  
 17         self.id="Customer " + str(Customer.customers)
 18  
 19     def happy(self,waitTime):                    
 20         """Is customer happy after service?"""
 21         if waitTime < Customer.toomuch:
 22             return "happy"
 23         else:
 24             return "unhappy"
 25 
 26     def servTime(self):
 27         return 0.1      # hr
 28

The serviceNeed method describes the activities making up the life of the Customer (line 29). You can see that this is a Python generator by the presence of the yield statements. Each of these may cause a simulated delay to the customer. There may or may not be a delay, depending on the state of the system at the (simulated) time of the call. Note that whenever we say "time" we mean simulation time, not real time.

Line 30 stores the current time--the arrival time--in the local variable arrive, so that waiting time can be calculated later.

 29     def serviceNeed(self):                        
 30         arrive=now()                              
 31         yield request,self,phone   # get a helpdesk line  
 32         yield request,self,staff   # get a staff member   
 33         atlast=now()                              
 34         yield hold,self,self.servTime()     # get service  
 35         yield release,self,staff   # customer done; free staff member 
 36         yield release,self,phone   # hang up line 
 37         Customer.customersServed += 1             
 38         waited=atlast-arrive                      
 39         if self.happy(waited)=="happy" : Customer.customersHappy += 1 
 40         print "%s %s leaves. Total wait time=%s. Customer %s"\
 41               %(now(),self.id,waited,self.happy(waited))
 42 

The customer requests a phone line in the first yield statement (line 31). If one is available the next action is executed without delay. Otherwise the customer is automatically queued with any others waiting for a line. Here we assume a first-come first-served queue. Next, the customer requests a staff member (line 32). He is queued again if none are available, though he still holds the line. The interaction begins after the customer connects to a staffer. This is modeled as a simple delay (line 34). Before talking starts we record the current time in variable atlast .

When finished with the staff member, the customer then releases and hangs up in the two yield release statements in lines 35 and 36. The released staff member can then pick up the customer waiting at the head of the queue. Similarly, the released phone line is immediately available for another customer.

The next few lines are housekeeping. We calculate the delay experienced (line 38), determine whether the customer is "happy" or not, incrementing the count of "happy" customers and and print out a trace line. (In practice this trace would be used only in debugging or to feed a statistical program for further analysis.) Then the customer leaves the system and the corresponding object vanishes.

Where do customers come from? They must arrive at random. Experienced simulators use a trick to simulate random events. The time between random arrivals is uncertain with an exponential probability distribution. All we need to do is to generate a sample from that distribution and to allow the Process to hold for that time. Each time we go round the loop a different number is generated and a different delay occurs. A statistician testing the resulting sequence of arrival times would be unable to detect any difference from pure randomness. We use that trick here.

An object of class Traffic generates a customer sequence (line 43). This is another SimPy Process and the active generator method is called generate (on line 51).

 43 class Traffic(Process):                          
 44     """Generates Customer arrivals"""
 45     Rate = 40.0   # (per hour) arrival rate      
 46     rv = Random() # a random variable            
 47 
 48     def __init__(self):
 49         Process.__init__(self)
 50 
 51     def generate(self):                          
 52         while True: 
 53             cust=Customer()  # new customer call 
 54             activate(cust,cust.serviceNeed())    
 55             timeToNext = Traffic.rv.expovariate(Traffic.Rate) 
 56             yield hold,self,timeToNext           
 57 

The Rate of arrivals is fixed on line 45. In a practical example this would be obtained from real data or from estimates and would most likely be read in. It may be a function of time of day as well. Line 46 establishes rv, a random generator.

We create a sequence of customers in an infinite loop. A customer is created in line 53 and the serviceNeed method activated in line 54. That customer acts independently from this point on. The time to the next arrival is generated by a sample from the correct distribution (line 55) and the yield hold simulates the delay. The required random number routines were imported from the standard Python module in line 5.

Finally we get to the main script that runs the simulation. Line 58 initializes the simulation system. Among other tasks this sets up the event scheduling system which will run the simulation clock and make sure events occur in the correct sequence. The next two lines establish the two Resources, the three phone lines, and the two staff. We assume that calls have no priority scheme and that queues have a normal first-come, first-served discipline.

Line 61 establishes the traffic object which will generate our randomly arriving customers. The next line activates it.

No simulation starts before line 63. The simulation will now run until time 6*24, that is, 24 hours a day for 6 days, at which point it will stop, and the next statement, printing out the results, is executed.

 58 initialize()                               
 59 phone   = Resource(capacity=3)               
 60 staff   = Resource(capacity=3)               
 61 traffic = Traffic()                          
 62 activate(traffic, traffic.generate())      
 63 simulate(until=6*24)     # hours           
 64 print "Total arrivals: %s / Total served: %s / Total unhappy: %s"\
 65       %(Customer.customers,Customer.customersServed,\
 66         Customer.customersServed - Customer.customersHappy)

So what do we do with this model? The original request was to discover the effect of changing the number of lines or the number of staff on customer happiness. To do this we must vary these numbers through a series of runs. We have to be careful about these experiments. Although using random numbers to generate the arrivals makes the model more realistic (randomness usually leads to queues), it means that a single run cannot be believed by itself. You have to make a series of replications to get a better idea of the result. Of course we can easily extend the displayed program to carry out series of runs, though such extensions are not shown here.

The SimPyCo helpdesk currently has a PABX with 3 incoming lines and 3 staff present. The helpdesk hours are Monday through Saturday, 24 hours a day. Plugging these figures into the simulation program shows a very high percentage of unhappy customers (99%) over the six day week.

To find out how many staff and lines are needed, we now run the simulation for 4 staff/4 lines and 5 staff/5 lines. The results are 86% and 0% unhappy customers, respectively. We repeat the run and now get 78% and 0%. So, as in the real world, the simulation shows variation from week to week, due to the randomness in the arrival of customer calls. The more often one runs the simulation, the better one can estimate the mean levels of unhappy customers.

This program uses the Open Source VPython plotting package (http://vpython.org). Interfacing to VPython and other packages will be explained in a future article on SimPy. If you can't wait for that, download VPython and play with the program, it is simple enough. Enjoy!

There is another real world issue we can address with the simulation: does the level of unhappy customers change over the work week? To analyse the time behavior, we extended the program with graphical output.

We ran the program for the same 3/3, 4/4 and 5/5 options, for 5 simulated weeks. The graphs produced clearly show that in all 5 weeks,

The conclusion for SimPyCo management is that they must increase both the number of lines and the number of manned desks to at least 5 to cope with the current customer arrival rate of 40 calls per hour.

Looking Ahead

With its constructs Process, Resource, and Monitor (which we have not time to explain here), SimPy is a powerful but easy to use LEGO-like toolbox for describing and simulating many classes of complex dynamic systems, whether or not they yet exist. This power is greatly enhanced by the object orientation of Python, allowing easy extension by subclassing and inheriting. This way, generic models of frequently occurring situations ("the production line model", "the service counter model", "the network model", etc.) can be built and then used with little additional code for specific systems. Python gives further power to the SimPy user by being so richly endowed with interfaces to a wide variety of packages (many of which are open source), such as graphics, data bases, spreadsheets, and statistics.

We hope this article gave you a taste of the power and ease of use of SimPy. There are many facets of SimPy we have not touched on. how is SimPy using generators? How does one build OO models? How does one collect and analyze SimPy run data efficiently and elegantly? We will cover some of these in a future article. In the meantime, why not download SimPy, run its tutorial and test programs, and build your own simulation model? It is fun and might be profitable!

Related links

SimPy site: Download SimPy from here, including:

Simscript
Simula
VPython
Fundamentals Of Object-Oriented Simulation

Klaus Müller is an Open Source software developer and a freelance consultant on military information systems.

Tony Vignaux does some teaching in simulation and continues his research interests.


Return to Python DevCenter.

Copyright © 2009 O'Reilly Media, Inc.