Published on ONLamp.com (http://www.onlamp.com/)


Understanding Network I/O, Part 2

by George Belotsky
02/12/2004

With today's technology, creating your own Internet services can be a relatively easy, one-person project. You may not produce the next Google, but helping your business, not-for-profit organization, school, or friends with a useful Internet application is thoroughly feasible — even on a part-time basis.

In fact, simple Internet clients should take less than a day to write, as described in the previous article on Python network programming. In this second article of a two-part series, we discuss more advanced networking topics, including a set of guidelines for choosing the most suitable approach for your situation.

As in the first article, examples are provided in the Python programming language. Python's clean, elegant syntax is highly suitable for creating compact, easy-to-understand programs. If you require more information about Python, the previous article includes many Python-related links and an installation notes section. Reading the first article will also help you understand the material presented here, but it is not strictly necessary.

Doing Several Things at Once

As discussed in the previous article, network I/O is unpredictable. Sometimes requests will fail outright for some (possibly transient) reason, but often the fault is much more subtle. For example, data might start flowing only after a lengthy delay (high latency), or flow very slowly (low bandwidth). On the Internet, such faults may be caused by a malfunctioning system on another continent — far out of the reach of your application.

Example Files

Download examples and other files related to this article:
python_nio_2.zip or python_nio_2.tar.gz

This unpredictability presents a challenge if your program must perform multiple network operations. For example, servers typically process requests from many clients. It is rarely acceptable to make everyone wait just because one client has trouble. Fortunately, there are powerful, well-tested techniques to deal with such situations.

The key to all of these techniques is that several network I/O operations can be performed concurrently. Thus, we continue to process most requests quickly, even if some requests are delayed due to network-related problems.

Today, there are two basic strategies for concurrency: multitasking and asynchronous I/O. Both techniques are widely applicable to servers, clients, and peer-to-peer systems. To help get you started quickly, the examples presented here extend the ones given in the first article. First, however, we will briefly cover the most basic, fundamental approach: synchronous I/O.

Synchronous I/O

Synchronous I/O is the simplest method for your networked application. Basic synchronous I/O provides no concurrency at all; the program stops at each operation, waiting for it to complete. This technique is sufficient for simple clients. All of the examples in the previous article, except for the one based on Twisted, used synchronous I/O.

Synchronous I/O is easy to test. Other methods introduce many complex subtleties, so initial verification of an application's logic benefits from the use of synchronous I/O. The following program implements a web client that fetches the current outdoor temperature in New York, London, and Tokyo. It is a straightforward modification of example 8 in the previous article.

Example 1. A synchronous I/O client

import urllib  # Library for retrieving files using a URL.
import re      # Library for finding patterns in text.
import sys     # Library for system-specific functionality.

# Three NOAA web pages, showing current conditions in New York,
# London and Tokyo, respectively.
citydata = (('New York','http://weather.noaa.gov/weather/current/KNYC.html'),
            ('London',  'http://weather.noaa.gov/weather/current/EGLC.html'),
            ('Tokyo',   'http://weather.noaa.gov/weather/current/RJTT.html'))

# The maximum amount of data we are prepared to read, from any single page.
MAX_PAGE_LEN = 20000

for name,url in citydata:
     # Open and read each web page; catch any I/O errors.
     try:
          webpage = urllib.urlopen(url).read(MAX_PAGE_LEN)
     except IOError, e:
          # An I/O error occurred; print the error message and exit.
          print 'I/O Error when reading URL',url,':\n',e.strerror
          sys.exit()  

     # Pattern which matches text like '66.9 F'.  The last
     # argument ('re.S') is a flag, which effectively causes
     # newlines to be treated as ordinary characters.
     match = re.search(r'(-?\d+(?:\.\d+)?) F',webpage,re.S)

     # Print out the matched text and a descriptive message;
     # if there is no match, print an error message.
     if match == None:
          print 'No temperature reading at URL:',url
     else:
          print 'In '+name+', it is now',match.group(1),'degrees.'

Here is the output produced by the client.

Example 2. Synchronous I/O client output

In New York, it is now 37.9 degrees.
In London, it is now 46 degrees.
In Tokyo, it is now 48 degrees.

Multitasking

Multitasking is an operating system feature that allows several jobs to be done concurrently. All modern, mainstream operating systems such as Linux, Solaris, Windows, and Mac OS X support multitasking.

Multitasking comes in two basic varieties, processes and threads. The former provide greater isolation between the tasks. In particular, one process cannot overwrite the memory space of another, unless both processes explicitly share a portion of their memory spaces. Thus, damage from a faulty program is (usually) limited to the process in which that program executes.

Threads sacrifice the safety of processes for increased performance. Multiple threads can be started inside a single process; all of these threads then share the process's memory space. The operating system usually does much less work when switching between threads (inside the same process) than when switching between processes.
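The isolation that makes processes safer is easy to observe with a short sketch (POSIX-only, since it relies on os.fork): a variable modified in the child process remains unchanged in the parent. With threads, which share memory, the change would be visible everywhere.

```python
import os

counter = 0              # A plain module-level variable.

pid = os.fork()          # Create a child process (POSIX only).
if pid == 0:
    # Child process: it received a *copy* of the parent's memory.
    counter = 100        # Modifies only the child's copy.
    os._exit(0)          # Exit the child immediately.
else:
    os.waitpid(pid, 0)   # Parent waits for the child to finish.
    # The child's change is invisible here.
    print(counter)       # Prints 0, not 100.
```

This is exactly the "damage containment" property described above: even a child that scribbles over its entire memory space cannot corrupt the parent.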

Many networked applications rely heavily on threads to handle I/O requests in a timely manner. Threads are therefore the focus of the rest of this section. Although programming with threads is difficult, there are plenty of resources available to assist you. Several of them are listed in the Further Reading appendix. If you are not familiar with using threads (or multitasking in general) be sure to read some of the background material before doing any serious work with this technology.

If you are new to multithreaded designs, keep in mind that your goal is actually to use fewer threads, and especially fewer locks, in your program. This may seem like a contradiction; after all, corrupting unlocked shared data is quite similar in effect to a dangling reference in a language such as C++. Nevertheless, every extra thread you add increases the system overhead, and every extra lock decreases concurrency.

Ultimately, you are trying to improve concurrency — doing things in parallel — when using threads. Learn to think in terms of adding concurrency rather than adding threads. This will help put you in the right frame of mind for creating effective threaded designs.

Thread-per-Request

The allure of multitasking in networked applications is its similarity to plain synchronous I/O covered earlier. In fact, our threaded examples even use the same urllib library to carry out the I/O, except that now urllib is being called from multiple threads.

One way to add concurrency to the previous example is to launch a separate thread for every city on our list. Each thread will retrieve, process, and display the data for only one city, and then exit.

Servers often employ this method of using threads (or threading model) to process requests. It is therefore commonly referred to as the thread-per-request model. The following example shows an implementation of thread-per-request.

Example 3. A thread-per-request client

import urllib     # Library for retrieving files using a URL.
import re         # Library for finding patterns in text.
import threading  # High-level thread library.

# Three NOAA web pages, showing current conditions in New York,
# London and Tokyo, respectively.
citydata = (('New York','http://weather.noaa.gov/weather/current/KNYC.html'),
            ('London',  'http://weather.noaa.gov/weather/current/EGLC.html'),
            ('Tokyo',   'http://weather.noaa.gov/weather/current/RJTT.html'))

# The maximum amount of data we are prepared to read, from any single page.
MAX_PAGE_LEN = 20000

# Function to be run by each thread.
def read_temperature(name,url,max):
     # Open and read the web page; catch any I/O errors.
     try:
          webpage = urllib.urlopen(url).read(max)
     except IOError, e:
          # An I/O error occurred; print the error message and end the thread.
          print 'I/O Error when reading URL',url,':\n',e.strerror
          return

     # Pattern which matches text like '66.9 F'.  The last
     # argument ('re.S') is a flag, which effectively causes
     # newlines to be treated as ordinary characters.
     match = re.search(r'(-?\d+(?:\.\d+)?) F',webpage,re.S)

     # Print out the matched text and a descriptive message;
     # if there is no match, print an error message.
     if match == None:
          print 'No temperature reading at URL:',url
     else:
          print 'In '+name+', it is now',match.group(1),'degrees.'

# END of function 'read_temperature'

# Launch a separate thread for each request.
for name,url in citydata:
     # Only keyword arguments (of the form 'name=value') may
     # be used with the 'Thread' constructor.  The 'args'
     # specifies arguments to be passed to the 'target'.
     thread = threading.Thread(target=read_temperature,
                               args=(name,url,MAX_PAGE_LEN))
     thread.start()

# The 'threading' package will wait until all the child threads
# (except any that are explicitly labelled as 'daemon threads')
# have shut down before exiting the program.

Here is the output produced by the client.

Example 4. Thread-per-request client output

In London, it is now 46 degrees.
In New York, it is now 37.9 degrees.
In Tokyo, it is now 48 degrees.

The output of this simple example already illustrates the subtleties of adding concurrency to your program. The replies are not in the same order as the requests. Each thread in the example runs independently of the others, so a thread started later might well finish earlier. After all, network I/O is unpredictable, so a slow request can finish long after a fast one, even if the slow request had a head start.
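The effect is easy to reproduce without any network at all. In the following sketch, the thread started first deliberately finishes last; the sleep calls stand in for slow and fast network replies:

```python
import threading
import time

results = []                      # Threads append here as they finish.
lock = threading.Lock()

def worker(name, delay):
    time.sleep(delay)             # Pretend this is network I/O.
    lock.acquire()                # Protect the shared list.
    try:
        results.append(name)
    finally:
        lock.release()

# The 'slow' thread is started first, but finishes last.
threads = [threading.Thread(target=worker, args=('slow', 0.2)),
           threading.Thread(target=worker, args=('fast', 0.0))]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)   # ['fast', 'slow']: completion order, not start order.
```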

Thread Pool

Another common threading model is the thread pool. A basic thread pool starts a fixed number of threads during initialization and does not shut them down until the program exits. This technique is often more efficient for servers than the thread-per-request model. Thread pools eliminate the overhead of creating and destroying threads in long-running applications which must continuously process requests. In addition, thread pools prevent the situation where a sudden burst of activity causes too many threads to be started, thus exhausting the operating system's resources.

Here is the client from the thread-per-request example, modified to use a thread pool.

Example 5. A thread pool client

import urllib     # Library for retrieving files using a URL.
import re         # Library for finding patterns in text.
import threading  # High-level thread library.
import Queue      # A thread-safe queue implementation.

# Three NOAA web pages, showing current conditions in New York,
# London and Tokyo, respectively.
citydata = (('New York','http://weather.noaa.gov/weather/current/KNYC.html'),
            ('London',  'http://weather.noaa.gov/weather/current/EGLC.html'),
            ('Tokyo',   'http://weather.noaa.gov/weather/current/RJTT.html'))

# The maximum amount of data we are prepared to read, from any single page.
MAX_PAGE_LEN = 20000

# The total number of threads that we will launch for our thread pool.
NTHREADS = 2

# Function to be run by each thread in the pool.  When
# the function returns, the corresponding thread terminates.
def read_temperature(max,inpque,outqueue):
     # Get a city name and URL from the input queue.
     # The thread will wait until input is available.
     name,url = inpque.get()

     # The thread continues to run, until an empty string for the city
     # name is received.  This allows the thread pool to be shut down
     # cleanly.  In addition, Python does not support the killing of
     # threads from outside, so the only way to terminate a thread
     # is to somehow signal it to stop.
     while not (name == ''):
          # Open and read the web page; catch any I/O errors.
          try:
               webpage = urllib.urlopen(url).read(max)
          except IOError, e:
               # An I/O error occurred; place the error message in
               # the output queue.
               outqueue.put('I/O Error when reading URL '+url
                            +' :\n'+str(e.strerror))
          else:
               # Pattern which matches text like '66.9 F'.  The last
               # argument ('re.S') is a flag, which effectively causes
               # newlines to be treated as ordinary characters.
               match = re.search(r'(-?\d+(?:\.\d+)?) F',webpage,re.S)

               # Output the matched text and a descriptive message;
               # if there is no match, output an error message instead.
               if match == None:
                    outqueue.put('No temperature reading at URL: '+url)
               else:
                    outqueue.put('In '+name+', it is now '
                                 +match.group(1)+' degrees.')

          # Get the next name and URL pair.  Will wait if necessary.
          name,url = inpque.get()

     # If we get here, an empty city name has been received.  The last
     # action of the thread is to place the 'None' object in the output
     # queue, to indicate that it has stopped.
     outqueue.put(None)

# END of function 'read_temperature'

# Create the input and output queues.
# Their size is not limited in this example.
inputs  = Queue.Queue(0)
results = Queue.Queue(0)
thread_pool = []         # No threads are currently in the pool.

# Start the thread pool.
for ii in range(NTHREADS):
     # Only keyword arguments (of the form 'name=value') may
     # be used with the 'Thread' constructor.  The 'args'
     # specifies arguments to be passed to the 'target'.
     thread = threading.Thread(target=read_temperature,
                               args=(MAX_PAGE_LEN,inputs,results))
     thread.start()               # Start the thread.
     thread_pool.append(thread)   # Add it to our list of threads.

# Issue requests, by placing them in the input queue.
for item in citydata:
     inputs.put(item)

# Read results from the output queue.  Because requests are processed
# concurrently, the results may come back in a different order from the
# requests.
for ii in range(len(citydata)):
     print results.get()

# Request shut down of the thread pool, by issuing as many empty city
# name requests as there are threads.
for thread in thread_pool:
     inputs.put(('',''))

# The 'threading' package will wait until all the child threads
# (except any that are explicitly labelled as 'daemon threads')
# have shut down before exiting the program.

The output is the same as before. Of course, the order in which the results are returned can change in every run, as previously noted.

A pair of queues is used to exchange data with the thread pool. Just like a lineup at the bank, the most common type of queue operates on a first come, first served basis. The customers (or pending operations) wait in the queue until a teller (or thread) becomes free. Then, the teller services the first customer in line.

In the example, all threads wait on the input queue right after being started. The get method of the Queue class is atomic, or indivisible. This ensures that any request will be assigned to only one thread. The assignment of requests to threads continues until there are no more requests, or all the threads in the pool are busy. If there are no more requests, then the remaining threads will continue to wait on the input queue. On the other hand, if there are more requests than threads, then the requests will accumulate in the input queue.
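That atomicity is easy to verify in isolation: when several threads drain one queue, every item is handed to exactly one thread, with none duplicated and none lost. A small sketch (the module is spelled Queue or queue, depending on the Python version; get_nowait is atomic in the same way as get):

```python
import threading
try:
    import queue             # Newer Pythons.
except ImportError:
    import Queue as queue    # Older Pythons.

work = queue.Queue()
for ii in range(100):
    work.put(ii)

seen = []                    # Every consumed item ends up here.
seen_lock = threading.Lock()

def consumer():
    while True:
        try:
            item = work.get_nowait()   # Atomic: no item handed out twice.
        except queue.Empty:
            return                     # Queue drained; thread exits.
        seen_lock.acquire()
        try:
            seen.append(item)
        finally:
            seen_lock.release()

threads = [threading.Thread(target=consumer) for ii in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(seen), len(set(seen)))   # 100 100: no duplicates, none lost.
```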

Unless a catastrophic error permanently blocks a thread from running, it will eventually complete its task, and will return to waiting on the input queue. In the example, the last step of handling a request is to place the result in the output queue. We also deliberately start only two threads for our thread pool, so that one of them will have to handle an additional request (since there are three cities for which we need to obtain data). This illustrates the "recycling" of threads in the thread pool model.

In addition to the threads that we start explicitly in our program, the application always has one additional thread, called the main thread. In the example, we use the main thread to read the output queue and print the results. Only one thread in the program ever uses the print statement.

Using a queue and a dedicated task to manage a resource is an alternative technique to sharing that resource with a lock. If the Python print were not thread safe, the current example would still work, but the previous one (which called print without locking from multiple threads) would not.
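A sketch of that pattern: one dedicated thread owns the output resource, and every other thread merely enqueues messages for it. No lock around the resource itself is ever needed. (Here a list stands in for the managed resource, so the sketch is self-contained.)

```python
import threading
try:
    import queue             # Newer Pythons.
except ImportError:
    import Queue as queue    # Older Pythons.

messages = queue.Queue()
printed = []                 # Stands in for the managed resource.

def printer():
    # The only task that ever touches the resource.
    while True:
        msg = messages.get()
        if msg is None:      # Sentinel: shut down cleanly.
            return
        printed.append(msg)  # In a real program: print msg.

owner = threading.Thread(target=printer)
owner.start()

# Any number of worker threads can safely produce output.
def worker(n):
    messages.put('message from worker %d' % n)

workers = [threading.Thread(target=worker, args=(n,)) for n in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()

messages.put(None)           # Ask the printer thread to stop.
owner.join()
print(len(printed))          # 3
```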

One important aspect of queues is worth mentioning. As you have probably experienced, lineups at the bank can get very long at busy times, when there are too few tellers working, or when several slow customers arrive at once. In fact, queue lengths grow without bound as the rate at which new items arrive approaches the rate at which the system can remove and process them.

For example, if you use queues in a server to process requests, you may find that the number of items waiting for service suddenly explodes with little warning. In a production system, it may be worthwhile to limit the maximum queue length (a feature that the standard Python Queue module supports) in order to guard against an uncontrolled crash as resources are exhausted. It is also highly useful to keep statistics on the number of waiting requests. These statistics, combined with the results from stress testing your application, can provide an early warning of future trouble. The Further Reading appendix refers to additional material regarding queues.
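The length limit is directly supported by the Queue constructor: pass a maximum size, and put will block once the limit is reached, while put_nowait raises Full instead. A small sketch of load shedding with a bounded queue:

```python
try:
    import queue             # Newer Pythons.
except ImportError:
    import Queue as queue    # Older Pythons.

bounded = queue.Queue(5)     # At most 5 pending items.

dropped = 0
for request in range(8):
    try:
        bounded.put_nowait(request)   # Never block the producer.
    except queue.Full:
        dropped += 1                  # Shed load instead of growing forever.

print(bounded.qsize(), dropped)       # 5 3
```

In a real server, the "dropped" branch might instead return a "server busy" error to the client, which is far better than an uncontrolled crash.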

Asynchronous I/O

Asynchronous I/O is a technique specifically targeted at handling multiple I/O requests efficiently. In contrast, threads are a general concurrency mechanism that can be used in situations not related to I/O. Most modern operating systems, such as Linux and Windows, support asynchronous I/O.

Asynchronous I/O works very differently from threads. Instead of having an application spawn multiple tasks (that can then be used to perform I/O), the operating system performs the I/O on the application's behalf. This makes it possible for just one thread to handle multiple I/O operations concurrently. While the application continues to run, the operating system takes care of the I/O in the background.

Due to potentially more efficient, kernel-level I/O processing, the reduction in the total number of threads in the system, and dramatically fewer context switches, asynchronous I/O is sometimes the best method to use. Its major disadvantage is an increase in the complexity of the application's logic — an increase that can be very significant.

Two common ways of asking the operating system to perform asynchronous I/O are the select and poll system calls. While Python provides direct access to these facilities (via the select module), there are easier ways to take advantage of asynchronous I/O in your programs.
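For a taste of the raw facility, here is a sketch using select directly on a pair of connected sockets (socket.socketpair is POSIX-only on older Pythons): select reports which sockets can be read right now, so a single thread never blocks waiting on the wrong one.

```python
import select
import socket

# A connected pair of sockets, standing in for real network connections.
a, b = socket.socketpair()

b.send(b'hello')             # Make the 'a' end readable.

# Ask the OS which of our sockets can be read without blocking.
# The three lists are: readable, writable, and "exceptional" sockets;
# the final argument is a timeout in seconds.
readable, writable, errored = select.select([a, b], [], [], 1.0)

for s in readable:
    print(s.recv(100))       # Only the 'a' end has data waiting.

a.close()
b.close()
```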

In particular, the Twisted framework, as mentioned in the previous article, makes working with asynchronous I/O quite painless in many cases. The next subsection will present a Twisted-based variation on our weather reader example.

The asyncore library provides another alternative to using poll or select directly. This is a lightweight facility that remains sufficiently low-level to give you a good look at the nature of asynchronous I/O. See The Asyncore Library subsection for details.

The Twisted Framework

Twisted is a large, comprehensive framework. It includes many diverse components such as a web server, a news server, and a web spider client. Achieving I/O concurrency with Twisted is not difficult, as the following example illustrates.

Example 6. A Twisted Framework client

# Import the Twisted network event monitoring loop.
from twisted.internet import reactor
# Import the Twisted web client function for retrieving
# a page using a URL.
from twisted.web.client import getPage

import re      # Library for finding patterns in text.

# Twisted will call this function to process the retrieved web page.
def process_result(webpage,name,url,nrequests):
    # Pattern which matches text like '66.9 F'.  The last
    # argument ('re.S') is a flag, which effectively causes
    # newlines to be treated as ordinary characters.
    match = re.search(r'(-?\d+(?:\.\d+)?) F',webpage,re.S)

    # Print out the matched text and a descriptive message;
    # if there is no match, print an error message.
    if match == None:
        print 'No temperature reading at URL:',url
    else:
        print 'In '+name+', it is now',match.group(1),'degrees.'

    # Keep a shared count of requests (see article text for details).
    nrequests[0] = nrequests[0] - 1 # Just finished a request.
    if nrequests[0] <= 0:  # If this is the last request ...
        reactor.stop()     # ... stop the Twisted event loop.


# Twisted will call this function to indicate an error.
def process_error(error,name,url,nrequests):
    print 'Error getting information for',name,'( URL:',url,'):'
    print error

    # Keep a shared count of requests (see article text for details).
    nrequests[0] = nrequests[0] - 1 # Just finished a request.
    if nrequests[0] <= 0:  # If this is the last request ...
        reactor.stop()     # ... stop the Twisted event loop.


# Three NOAA web pages, showing current conditions in New York,
# London and Tokyo, respectively.
citydata = (('New York','http://weather.noaa.gov/weather/current/KNYC.html'),
            ('London',  'http://weather.noaa.gov/weather/current/EGLC.html'),
            ('Tokyo',   'http://weather.noaa.gov/weather/current/RJTT.html'))

# Initialize the shared count of the number of requests.  This will be
# passed as an argument to the callback functions above.  It cannot
# be a simple integer (see article text for an explanation).
nrequests = [len(citydata)]

# Tell Twisted to get the above pages; also register our
# processing functions, defined previously.
for name,url in citydata:
    getPage(url).addCallbacks(callback = process_result,
                              callbackArgs = (name,url,nrequests),
                              errback = process_error,
                              errbackArgs = (name,url,nrequests))

# Run the Twisted event loop.
reactor.run()

Here is the output produced by the client:

Example 7. Twisted Framework client output

In London, it is now 46 degrees.
In New York, it is now 37.9 degrees.
In Tokyo, it is now 48 degrees.

Note that -- just like with the thread-based examples -- the output may be in a different order from the inputs. After all, asynchronous I/O is also a concurrency technique. As before, when several requests are performed in parallel, the faster ones will tend to pass the slower ones, finishing earlier.

Even with Twisted hiding the low-level details, the event-driven nature of asynchronous I/O is readily apparent in the example. When certain events take place, Twisted calls the functions we have previously supplied. These functions are known as callbacks, because you call the framework to pass the functions to it, and the framework subsequently calls them back. Callbacks are also common in other event-driven systems, such as GUI libraries.

You may not wait inside your Twisted callbacks; it is important to complete the required processing as fast as possible, and return control back to the framework. Any waiting will suspend the other requests, because only a single thread is doing all of the work.

Twisted defines a special construct, Deferred, for triggering callbacks. In our example, the getPage function actually returns a Deferred object. We then use its addCallbacks method to register our result-processing and error-handling functions.

The last line in our program (reactor.run()) starts the Twisted event loop. This transfer of control is common in event-driven systems; it allows the framework to invoke our callbacks in response to events. Our program will terminate when reactor.run() returns.

Now that we have surrendered control to Twisted, however, how can we make reactor.run() return? The example issues a reactor.stop() from either of our two callbacks. In order to prevent Twisted from exiting prematurely, we keep track of how many requests are left to process and only call reactor.stop() after all the requests have been processed.

We store the count of outstanding requests in a standard Python list that we specify as a parameter to our callbacks. In your own code, you may want to create a counter class for this purpose. Alternately, you can write your callbacks as methods of a class, keeping the count in an attribute. In any case, do not pass a simple integer to the callback. Any changes you make to such a type inside the function will be purely local, and will be discarded when the callback returns. See the Python Reference Manual for a deeper understanding of these issues.
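The difference is easy to see in isolation. Rebinding an integer parameter inside a function changes only the local name, while mutating an element of a list changes the object the caller also sees:

```python
def decrement_int(n):
    n = n - 1            # Rebinds the local name only.

def decrement_list(n):
    n[0] = n[0] - 1      # Mutates the shared list object.

count_int = 3
decrement_int(count_int)
print(count_int)         # Still 3: the caller's value is untouched.

count_list = [3]
decrement_list(count_list)
print(count_list[0])     # 2: the change is visible to the caller.
```

This is why the example stores nrequests in a one-element list: the callbacks must all see, and update, the same count.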

Although all invocations of the callbacks share the count, we need no locking to protect the value. A single thread makes every call, so each invocation must complete before the next one can start. This ensures that the count is always consistent, because no operation on it can preempt another already in progress.

The Asyncore Library

Asyncore is another Python facility for dealing with asynchronous I/O. In contrast to the large, comprehensive Twisted, asyncore is small, lightweight, and included as part of the standard Python distribution. You may also be interested in asynchat (also included with Python), which provides extra functionality on top of asyncore. The well-known Zope project, a powerful, sophisticated web-application server, is built using asyncore.

Asyncore's minimalist approach comes at a price. While this library is higher level than using select or poll directly, it does not provide additional facilities such as HTTP protocol support. Asyncore is a fundamental building block, with a tight focus on just the I/O process itself.

The asyncore documentation includes an easy-to-follow web client example. It is immediately clear from this example that we must do all the work pertaining to the HTTP protocol ourselves. Asyncore provides only the I/O channel. The example also illustrates how to use asyncore in our programs: by writing a class that inherits from a base class supplied by the library.

Now we are ready to reimplement our weather reader with asyncore. Ideally, we would like to reuse the code from the Twisted client. After all, neither the logic of our program nor the underlying I/O method will change. In the following example, we substitute our own (asyncore-derived) CustomDispatcher class for the facilities previously provided by Twisted, leaving the rest of our code virtually intact.

Example 8. An Asyncore client

import asyncore # Lightweight library for asynchronous I/O. 
import re       # Library for finding patterns in text.

# Our asyncore-based dispatcher class.
import CustomDispatcher 

# Function to process the retrieved web page.
def process_result(webpage,name,url):
   
    # Pattern which matches text like '66.9 F'.  The last
    # argument ('re.S') is a flag, which effectively causes
    # newlines to be treated as ordinary characters.
    match = re.search(r'(-?\d+(?:\.\d+)?) F',webpage,re.S)

    # Print out the matched text and a descriptive message;
    # if there is no match, print an error message.
    if match == None:
        print 'No temperature reading at URL:',url
    else:
        print 'In '+name+', it is now',match.group(1),'degrees.'

# Function to indicate an error.
def process_error(error,name,url):
    print 'Error getting information for',name,'( URL:',url,'):'
    print error

# Three NOAA web pages, showing current conditions in New York,
# London and Tokyo, respectively.
citydata = (('New York','http://weather.noaa.gov/weather/current/KNYC.html'),
            ('London',  'http://weather.noaa.gov/weather/current/EGLC.html'),
            ('Tokyo',   'http://weather.noaa.gov/weather/current/RJTT.html'))

# Create one asyncore-based dispatcher for each of the above pages;
# also register our callback functions, defined previously.
for name,url in citydata:

    # No need to save the result of the constructor call, because
    # asyncore keeps a reference to our dispatcher objects.
    CustomDispatcher.CustomDispatcher(url,
                                      process_func = process_result,
                                      process_args = (name,url),
                                      error_func = process_error,
                                      error_args = (name,url))

# Run the asyncore event loop.  The loop will terminate automatically
# once all I/O channels have been closed.
asyncore.loop()

The output is the same as in the Twisted example (of course, the order of the results returned may be different for each run). In addition, the code to stop the Twisted reactor is no longer required; asyncore will automatically exit its loop when all I/O channels have closed.

Most of the work required to create the asyncore example is actually in writing the CustomDispatcher class. Due to the amount of low-level details it must handle, CustomDispatcher is quite a long piece of code compared to the other programs shown in this article. You can download it from the previous link or read it in the appendix.

CustomDispatcher strives to be a fairly complete example that is also compatible with several versions of Python and asyncore. In addition, the goal is to write simple code that makes the nature of asynchronous I/O easier to understand, rather than to produce the most efficient implementation.

As mentioned in the Twisted discussion, programs relying on asynchronous I/O are event-driven by nature. This is certainly different from the threaded examples given earlier. The CustomDispatcher class is sufficiently low-level to clearly bring out these differences.

When using synchronous I/O with threads, the physical layout of the program can correspond closely with its internal logic. For instance, each thread in our multitasking examples performs, in order, the following tasks:

  1. Send a request to the server.
  2. Receive the response.
  3. Process the response and output the results.

All of these operations can be written naturally, from the top down, in the program's source code. While urllib takes care of the first two steps in our examples, it still does so in the context of the threads we create.

As a thread performs the first two steps, it may have to wait an unpredictable amount of time for the network I/O to complete. When a thread is waiting, the operating system will allow other threads to run. Thus, if one of the threads has entered a lengthy sleep (e.g., in step 2), it will not prevent the other threads from performing step 3.

The situation changes completely when we use asynchronous I/O. Now, waiting is not allowed -- there is only one thread doing all the work. Instead, we perform I/O operations only when the operating system tells us that they will succeed immediately.

For each such I/O event, it is entirely likely that we will write less data than is needed to complete our request, or read only part of the incoming reply. The unfinished work will have to be continued when the next I/O event arrives. We must therefore store enough state information to resume the partially completed operation correctly at a later time.
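The bookkeeping for a partially completed write can be sketched as follows. This is a toy illustration, not code from CustomDispatcher: the `ToyChannel` class (which accepts at most four bytes per call, mimicking a socket whose buffer fills up) and the `RequestSender` class are both invented for the example. The essential point is the `offset` member, which is the state that lets the operation resume on the next I/O event.

```python
class RequestSender:
    """Tracks how much of an outgoing request has been written,
    so a partial write can be resumed on the next I/O event."""
    def __init__(self, data):
        self.data = data
        self.offset = 0   # state: bytes written so far

    def handle_write_event(self, channel):
        # The OS says the channel is writable; send what we can.
        sent = channel.send(self.data[self.offset:])
        self.offset += sent
        return self.offset == len(self.data)   # True when finished

class ToyChannel:
    """Accepts at most 4 bytes per call, like a nearly full socket buffer."""
    def __init__(self):
        self.received = b""
    def send(self, data):
        chunk = data[:4]
        self.received += chunk
        return len(chunk)

sender = RequestSender(b"GET / HTTP/1.0\r\n\r\n")
chan = ToyChannel()
while not sender.handle_write_event(chan):
    pass   # in real code, we would return to the event loop here
print(chan.received)
```

Each call to `handle_write_event` corresponds to one writable event from the operating system; the request goes out in several small pieces, yet arrives intact.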

Asyncore translates the results of the low-level system call (select or poll) into calls to handle_read, handle_write, and so on. We provide these methods in our CustomDispatcher class.
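The translation from select results to handler calls can be demonstrated without asyncore itself. The sketch below is a simplified stand-in, not the asyncore implementation: a `MiniDispatcher` class (invented for this example) holds the per-channel state, and a hand-written loop calls select directly, dispatching readiness to `handle_read` and `handle_write`. Two dispatchers on a socket pair each send "ping" to the other.

```python
import select
import socket

class MiniDispatcher:
    """Toy version of the asyncore idea: select() readiness
    results are translated into handle_read / handle_write calls."""
    def __init__(self, sock):
        self.sock = sock
        self.inbox = b""
        self.outbox = b"ping"

    def handle_read(self):
        self.inbox += self.sock.recv(1024)

    def handle_write(self):
        sent = self.sock.send(self.outbox)
        self.outbox = self.outbox[sent:]

a, b = socket.socketpair()
left, right = MiniDispatcher(a), MiniDispatcher(b)
channels = {a: left, b: right}

# The event loop: ask the OS which sockets are ready, then dispatch.
while left.outbox or right.outbox or not (left.inbox and right.inbox):
    readable, writable, _ = select.select(channels, channels, [], 1)
    for s in readable:
        channels[s].handle_read()
    for s in writable:
        channels[s].handle_write()

print(left.inbox, right.inbox)
```

Asyncore's loop does essentially this, plus error handling, connection management, and optional use of poll instead of select.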

Our class is also a great place to keep state information. In particular, note the __is_header member variable. It is used as a flag to indicate that we have not yet finished reading the HTTP header.

Due to the nature of asynchronous I/O, it is likely that handle_read will be called multiple times before the entire web page is read. In addition, one of these read operations will probably wind up reading the last part of the HTTP header and the first part of the body. After all, the low-level asynchronous I/O routines are not familiar with the HTTP protocol. The transition from header to body is meaningless to them. Our handle_read method must carefully preserve any body content as it discards the header; otherwise part of the information we are interested in would be lost. Keep these sorts of issues in mind when working with asynchronous I/O.
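The header/body bookkeeping can be sketched independently of asyncore. This is an illustrative `ResponseReader` class (invented for the example, with an `is_header` flag playing the role of CustomDispatcher's `__is_header` member), fed three simulated read events where one chunk contains both the end of the header and the start of the body.

```python
class ResponseReader:
    """Accumulates data from repeated read events, discarding the
    HTTP header while preserving any body bytes that arrive in the
    same chunk as the end of the header."""
    SEPARATOR = b"\r\n\r\n"   # blank line marking the end of the header

    def __init__(self):
        self.is_header = True   # we have not yet finished the header
        self.buffer = b""
        self.body = b""

    def handle_read(self, chunk):
        if self.is_header:
            self.buffer += chunk
            pos = self.buffer.find(self.SEPARATOR)
            if pos != -1:
                # End of header found: keep whatever body bytes
                # arrived in this same chunk.
                self.is_header = False
                self.body += self.buffer[pos + len(self.SEPARATOR):]
                self.buffer = b""
        else:
            self.body += chunk

reader = ResponseReader()
# Simulated read events; the second chunk straddles header and body.
for chunk in [b"HTTP/1.0 200 OK\r\nContent-Le", b"ngth: 5\r\n\r\nhel", b"lo"]:
    reader.handle_read(chunk)
print(reader.body)
```

If `handle_read` simply discarded everything until the separator was seen, the `hel` bytes in the second chunk would be lost; keeping them is exactly the care the text describes.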

When writing your own asyncore-based dispatcher classes, you may also want to override the handle_expt, handle_error, and log methods. These methods deal with Out-Of-Band data (OOB), unhandled errors, and logging, respectively. See the asyncore documentation and the library source code itself (file asyncore.py, installed on your hard drive in the same place as the rest of the standard Python library) for more information. The asyncore source code is actually quite easy to read. Also note that OOB is a rarely used feature of the TCP/IP protocol family.

CustomDispatcher uses Python's built-in apply function to call the supplied callback functions. This allows the list of arguments to the callbacks to be generated dynamically. Note, however, that apply has been deprecated in Python version 2.3. Unless you want to support old versions of the language (notably version 1.5), you should use the extended call syntax to achieve the same result. See the documentation of the deprecated apply function for a description of the extended call syntax.
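The difference between apply and the extended call syntax is small but worth seeing side by side. The callback below is hypothetical (the real signatures are in CustomDispatcher); the point is only the call mechanics with a dynamically built argument list.

```python
def callback(status, length, url):
    # A hypothetical callback; real signatures are application-defined.
    return "%s: %d bytes (%s)" % (url, length, status)

args = (200, 1024)
kwargs = {"url": "http://example.com/"}

# Old style, deprecated in Python 2.3 and removed in Python 3:
#     result = apply(callback, args, kwargs)

# Extended call syntax -- the modern equivalent:
result = callback(*args, **kwargs)
print(result)
```

The `*args` and `**kwargs` unpacking covers every use of apply, which is why the built-in was deprecated.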

Which Method to Use?

After looking at the various techniques for network I/O, it is time to formulate some guidelines about which method best fits a particular situation.

For simple clients, plain synchronous I/O is probably the best choice. For example, if you need to extract information from a single web page, the added complexity of threads or asynchronous I/O is not likely to provide many advantages. At most, you might use one additional thread so that the GUI remains responsive even if the network I/O takes a long time.
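A plain synchronous exchange looks like this. The sketch is self-contained, so a throwaway one-shot server runs in a helper thread purely to give the client something to talk to; a real simple client would just call urllib against an actual URL, with no threads at all. Note how every client call simply blocks until it completes.

```python
import socket
import threading

# A tiny one-shot server, present only to make the sketch runnable.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

def serve_once():
    conn, _ = server.accept()
    conn.recv(1024)
    conn.sendall(b"HTTP/1.0 200 OK\r\n\r\nhello")
    conn.close()

threading.Thread(target=serve_once).start()

# The synchronous client: each call blocks until it finishes.
client = socket.create_connection(("127.0.0.1", port))
client.sendall(b"GET / HTTP/1.0\r\n\r\n")   # blocks until sent
response = b""
while True:
    chunk = client.recv(1024)               # blocks until data arrives
    if not chunk:
        break
    response += chunk
client.close()
server.close()

body = response.split(b"\r\n\r\n", 1)[1]
print(body)
```

The client half is the whole program: request, read until the connection closes, process. No callbacks, no state machines, no locks.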

Synchronous I/O has simplicity on its side. It is the least subtle, most straightforward method. This significantly reduces the chances of programmer error — a powerful advantage that should never be overlooked.

Of course, a server that needs to handle multiple concurrent requests, or a complex client (such as a web spider) that queries many separate servers to generate a result, needs something more than plain synchronous I/O.

One excellent source of ideas on network I/O is the ADAPTIVE Communication Environment (ACE) project. ACE is a highly mature Open Source system used in a variety of demanding applications (including medical imaging, military avionics, and industrial process control).

In many striking ways, ACE did for C++ what Twisted has, more recently, done for Python. For example, the Reactor design pattern — an important part of Twisted — appeared earlier in ACE. While Twisted is strongly focused on asynchronous I/O, it, like ACE, also provides support for threads (including the thread pool model).

ACE is a giant, comprehensive, and portable framework with many significant capabilities, including real-time performance. In the process of creating this system, the ACE team has published many papers detailing their insights into network I/O. These papers — regardless of whether you use ACE — are very helpful in planning your project.

The ACE paper comparing the performance of various network I/O models (PDF) is of particular interest. It is quite an accessible work, and it presents several highly informative results.

The paper contains many graphs of experimental data, which makes it easy to compare the characteristics of different I/O techniques. In addition, the paper presents several interesting variations (such as thread pools that use asynchronous I/O) on the standard I/O methods.

When using Python, however, the Twisted team has raised some concerns about the efficiency of threads. This has to do with the internals of the Python interpreter, particularly the Global Interpreter Lock (GIL). Python itself is not fully thread safe; the GIL is therefore used to protect critical regions. As mentioned earlier, locking has an adverse effect on concurrency. It is therefore reasonable that the Twisted team chose asynchronous I/O as the basis for their framework.

Guidelines for Choosing the I/O Model

Based on the observations presented thus far, here are some useful guidelines for choosing a network I/O model:

If you decide to use C or C++ in order to create fast thread-based implementations, keep in mind that you may not have to give up Python altogether. There are many ways of combining Python with C and C++. See Further Reading for details.

Sometimes, an approach combining multiple languages is advantageous because it becomes possible to emphasize the strengths of each in a hybrid system. In particular, Python/C++ hybrids are well suited for creating elegant, very high-performance systems that are also easy to modify and extend.

Comprehensive Frameworks or Lightweight Libraries?

There is no question that comprehensive frameworks, such as ACE and Twisted, can make it significantly easier to create networked applications. When should you go with such a framework, and when are simpler tools (such as asyncore) more appropriate?

While the ultimate choice depends on the analysis of your specific needs, here are some suggestions:

Don't Forget About UDP

No discussion of Internet-based network I/O would be complete without mentioning the UDP protocol. This often overlooked facility can provide enormous performance advantages (10 times or more over TCP virtual circuits in the author's experience).

If you are designing your own network protocol, do not automatically assume that you need TCP. There are situations for which the tradeoffs made by TCP are not the right ones. This is why UDP exists.

For example, a system that samples data at a high rate could benefit greatly from a UDP-based approach. In this case, there is no need to resend lost messages; they would arrive too late anyway. Instead, the system can resynchronize itself during a subsequent successful sample. A simple alarm mechanism whenever too many samples are missing may actually be all that is required for error handling.
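The datagram style of the sampling scenario can be sketched with two UDP sockets on the loopback interface. Each sample is an independent datagram: no connection setup, no acknowledgments, no retransmission. On a real network some datagrams could be lost or reordered, which is precisely the tradeoff the text describes; on loopback they arrive reliably, so the sketch runs cleanly.

```python
import socket

# Receiver: bind to an ephemeral local port and read datagrams.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
port = receiver.getsockname()[1]

# Sender: each sample is one datagram; a lost sample would simply
# never arrive, and the next one would resynchronize the receiver.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for sample in range(3):
    sender.sendto(("sample %d" % sample).encode(), ("127.0.0.1", port))

received = []
for _ in range(3):
    data, addr = receiver.recvfrom(1024)
    received.append(data.decode())

sender.close()
receiver.close()
print(received)
```

A production version would add the alarm mechanism mentioned above: count the gaps in a sequence number carried in each datagram, and raise an alert when too many samples go missing.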

Further Reading

This appendix lists various sources where you can find more information on the topics covered in the article. If you are interested in Python, how to install it, how to install Twisted and similar material, the previous article provides an overview and many links. It also contains more introductory discussion on network I/O.

The Wikipedia page on Computer Multitasking provides a good overview of this topic. Another good source of information is the documentation for Python's standard thread modules: the low-level thread and the higher-level threading. The book Programming Python, 2nd Edition (March 2001) also offers a very helpful discussion of threads in section 3.3. The book is available through Safari Online (there is a 14-day free trial if you have never had an account) as well as in print form. If you just want to read the section on threads, then Safari Online is probably your best option.

Queues are a very useful construct, especially for implementing networked applications. Queues are often quite easy to work with, but are still a very active research area with many unsolved problems. The study of how queues behave is formally known as queueing theory. MIT OpenCourseWare offers a lot of freely accessible information if you are interested in the theoretical foundations of the subject. You may also download SimPy, an easy-to-use simulation package that includes models of queues as examples.

The documentation for asyncore provides a good overview of asynchronous I/O. Twisted also includes a useful asynchronous I/O document. You may also be interested in one of the ACE papers on the Proactor design pattern. The Proactor is somewhat similar to the Twisted Deferred. You should likewise take a look at a Deferred overview if you plan to use Twisted.

The ACE project is a major source of experimentally verified data on network I/O. There are also several other projects related to ACE, including TAO, a real-time capable CORBA ORB. The ACE homepage provides links to TAO and other ACE-related work.

There are many ACE-related books, manuals, and papers that you may find helpful regardless of which particular library you choose for your project. Of course, if you are programming a networked application in C++, you should seriously consider using ACE itself.

Several projects deal with creating systems using Python and C/C++. The frontrunner for Python/C++ hybrids appears to be Boost.Python. There is also PyCXX (although its author is now urging users to consider Boost.Python instead), Pyrex (which supports C only, not C++), and SCXX (a lightweight approach originally inspired by PyCXX).

Python itself includes support for writing C/C++ extensions, as well as embedding the Python interpreter in a C/C++ program (see Extending and Embedding the Python Interpreter). The SWIG project is also well-known; it combines C/C++ with a variety of other languages (including Python, Perl, and Tcl).

Finally, see section 5 of The Design Philosophy of the DARPA Internet Protocols for brief but insightful examples in which UDP is preferable to TCP. This paper was also mentioned in the first article, because it is overall very worthwhile reading for anyone dealing with networked systems.

George Belotsky is a software architect who has done extensive work on high-performance internet servers, as well as hard real-time and embedded systems.


Return to the Python DevCenter.

Copyright © 2009 O'Reilly Media, Inc.