Python DevCenter


Getting Started with WSGI

Flying in the Face of Tradition

To extend these simple examples into something a little more realistic, I'll implement an extremely basic blogging application along RESTful lines: using HTTP GET to retrieve a single entry or a list of entries, PUT to add or update an entry, and DELETE to remove one.

The first step is to extend the BaseWSGI class slightly to handle GET requests in one of two ways: GET / should return a list of all entries, while GET /[name] should return a named entry. To provide this, I've added code to the __iter__ method so that when the path requested is /, the text ALL gets appended to the handler method's name (meaning a subclass now needs to implement both do_GET and do_GETALL):

if request_method == 'GET' and self.environ['PATH_INFO'] == '/':
    method = method + 'ALL'
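To see where that check sits, here is a minimal sketch of the dispatch, written for Python 3 (where response bodies are bytes). The BaseWSGI class below is a hypothetical stand-in, since the real one is defined earlier in the article and not reproduced here; Demo, call, and the 405 fallback are likewise assumptions made for illustration.

```python
class BaseWSGI:
    """Hypothetical stand-in for the article's BaseWSGI class."""

    def __init__(self, environ, start_response):
        self.environ = environ
        self.start = start_response

    def __iter__(self):
        request_method = self.environ['REQUEST_METHOD']
        method = 'do_' + request_method
        # GET on the root path is routed to do_GETALL instead of do_GET.
        if request_method == 'GET' and self.environ['PATH_INFO'] == '/':
            method = method + 'ALL'
        handler = getattr(self, method, None)
        if handler is None:
            # Assumed fallback; the original class may behave differently.
            self.start('405 Method Not Allowed', [('Content-type', 'text/plain')])
            yield b'Method not allowed'
        else:
            yield handler()


class Demo(BaseWSGI):
    def do_GET(self):
        self.start('200 OK', [('Content-type', 'text/plain')])
        return b'one entry'

    def do_GETALL(self):
        self.start('200 OK', [('Content-type', 'text/plain')])
        return b'all entries'


def call(app, method, path):
    """Invoke a WSGI app in-process and return (status, body)."""
    seen = {}

    def start_response(status, headers):
        seen['status'] = status

    environ = {'REQUEST_METHOD': method, 'PATH_INFO': path}
    body = b''.join(app(environ, start_response))
    return seen['status'], body
```

Calling call(Demo, 'GET', '/') reaches do_GETALL, while any other GET path reaches do_GET.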

At this point, I've decided to store the weblog entries as plain-text files, with nothing in the way of metadata for ordering or filtering. Obviously, in a real application you'd want to be able to search for entries based on particular criteria--perhaps by exposing more meaningful or useful resource URLs (for example, something like /2006/08/my-entry-name)--but for the purposes of this basic application, file-system storage will suffice. Thus, data access for a blog entry is as simple as:

import os

class Entry:
    def __init__(self, path, filename, load=True):
        self.filename = os.path.join(path, filename.replace('+', '-')) + '.txt'
        self.title = filename.replace('-', ' ')
        # text stays None when the entry does not exist yet
        self.text = None
        if load and os.path.exists(self.filename):
            self.text = open(self.filename).read()

    def save(self):
        # write the entry text back to its file
        f = open(self.filename, 'w')
        f.write(self.text)
        f.close()
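A short round trip shows how this kind of file-backed entry behaves. The class below is a Python 3 variant written for this sketch, not the article's original listing, and the temporary directory and entry name are made up for the demonstration.

```python
import os
import tempfile


class Entry:
    """Python 3 variant of the Entry class (a sketch, not the original)."""

    def __init__(self, path, filename, load=True):
        self.filename = os.path.join(path, filename.replace('+', '-')) + '.txt'
        self.title = filename.replace('-', ' ')
        self.text = None  # stays None when the file is missing
        if load and os.path.exists(self.filename):
            with open(self.filename) as f:
                self.text = f.read()

    def save(self):
        with open(self.filename, 'w') as f:
            f.write(self.text)


blogdir = tempfile.mkdtemp()          # throwaway directory for the demo
new = Entry(blogdir, 'my-first-post')
was_missing = new.text is None        # nothing on disk yet
new.text = 'Hello, WSGI.'
new.save()

loaded = Entry(blogdir, 'my-first-post')  # reads the file written above
```

The title is derived from the file name alone, which is why the entry carries no other metadata.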

Presenting entries needs some kind of templating. Python has an abundance of choices, such as Cheetah, Kid, and Myghty, not to mention numerous others bundled with the various frameworks. To keep things simple, I'm using a homegrown templating engine that simply injects dynamic content based on the IDs in an XML document. (Given the constraint that all IDs must be unique, this is probably the simplest approach to templating XML, at least from a usage perspective.) Thus, the do_GET method of my application becomes:

def do_GET(self):
    pathinfo = self.environ['PATH_INFO'][1:]
    entry = Entry(blogdir, pathinfo)
    if entry.text:
        (ext, content_type) = self.get_type()
        response_headers = [('Content-type', content_type)]
        # do_PUT sets status_override to report 201 Created
        if self.status_override:
            status = self.status_override
        else:
            status = '200 OK'
        self.start(status, response_headers)
        tmp = self.engine.load('blog-single.' + ext)
        tmp['entry:title'] = entry.title
        tmp['entry:text'] = entry.text
        tmp['entry:link'] = template3.Element(None, 
                href='http://localhost:8080/%s?type=%s' % (entry.title.replace(' ', '-'), ext))
        return str(tmp)
    else:
        self.start('404 Not Found', [('Content-type', 'text/html')])
        return '%s not found' % pathinfo

Using the PATH_INFO environ variable provided by WSGI, I load an entry, then check to see if the text exists; if not, the blog file was not present, so I return a standard 404 Not Found. If the entry loaded successfully, the get_type() method returns the extension to use for the template (and the content type) based on a type parameter passed in the URL. I create the response headers (just content type, for the moment), and start the response process by calling self.start. At this point I've also checked for the presence of status_override, which is a field used when another method calls do_GET (see the do_PUT method later). Finally, I set the content in the template using the IDs: entry:title, entry:text and entry:link. (I'll return to the do_GETALL method shortly.)

The most important method from the WSGI perspective is start. It takes a response code and message, as well as the response headers as a list of tuples. I assigned it from the start_response positional parameter in BaseWSGI.
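Stripped of the blog logic, the contract looks like the sketch below: the application calls start_response with a status string and a list of header tuples, then returns an iterable body. The names hello_app, fake_start_response, and captured are illustrative, and the code targets Python 3, where the body must be bytes.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    # status is "code reason"; headers are a list of (name, value) tuples.
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [b'Hello from WSGI\n']


captured = {}


def fake_start_response(status, response_headers, exc_info=None):
    # Record what the application passed instead of writing to a socket.
    captured['status'] = status
    captured['headers'] = response_headers


environ = {}
setup_testing_defaults(environ)   # fills in a plausible minimal environ
body = b''.join(hello_app(environ, fake_start_response))
```

Driving an application with a fake start_response like this is also a convenient way to unit test handlers without starting a server.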

Adding an Entry

Creating a blog entry calls the do_PUT method, which performs several steps:

  1. Check that the pathinfo is not empty and that the content length is greater than 0.
  2. Create an Entry object, using the pathinfo.
  3. If the Entry does not contain text, then this is a new blog post, so set the status override variable to "201 Created."
  4. Load the content from the request using the wsgi.input environ variable.
  5. Finally, save the entry, then call the do_GET method to return something meaningful to the caller.

def do_PUT(self):
    pathinfo = self.environ['PATH_INFO'][1:]
    if pathinfo == '':
        self.start('400 Bad Request', [('Content-type', 'text/html')])
        return 'Missing path name'
    elif self.environ.get('CONTENT_LENGTH', '') in ('', '0'):
        self.start('411 Length Required', [('Content-type', 'text/html')])
        return 'Missing content'
    entry = Entry(blogdir, pathinfo)
    if not entry.text:
        # no existing text means this is a new post
        self.status_override = '201 Created'
    entry.text = self.environ['wsgi.input'].read(int(self.environ['CONTENT_LENGTH']))
    entry.save()
    return self.do_GET()
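The body-reading step can be isolated as a small helper. This is a sketch under the assumption of Python 3, where wsgi.input yields bytes; read_body is a hypothetical name and is not part of the article's code.

```python
import io


def read_body(environ):
    """Return the request body as bytes, or None without a usable Content-Length."""
    try:
        length = int(environ.get('CONTENT_LENGTH') or 0)
    except ValueError:
        return None
    if length <= 0:
        return None
    # Read exactly the advertised number of bytes and no more.
    return environ['wsgi.input'].read(length)


payload = b'My first entry'
environ = {
    'CONTENT_LENGTH': str(len(payload)),
    'wsgi.input': io.BytesIO(payload),   # stands in for the server's input stream
}
```

Passing the length to read() matters because reading past the declared length is not guaranteed to terminate on every server.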

For a DELETE, I just do the basics: check to see if the entry exists, delete it, and return a 204 No Content:

def do_DELETE(self):
    pathinfo = self.environ['PATH_INFO'][1:]
    blogfile = os.path.join(blogdir, pathinfo.replace('+', '-')) + '.txt'
    if os.path.exists(blogfile):
        os.remove(blogfile)
        # a 204 response carries no body
        self.start('204 No Content', [ ])
        return ''
    else:
        self.start('404 Not Found', [('Content-type', 'text/html')])
        return '%s not found' % pathinfo

The do_GETALL method, which is the only one of the subclass methods that doesn't actually correspond to an HTTP verb, is also the only one that differs from the validation+response cycle established by the other methods. do_GETALL will always return 200 OK, and will read in all .txt files in the specified blog directory, reusing the blog-single template (used in the do_GET method). The main differences between this method and do_GET revolve around templating (and are not particularly relevant to WSGI).
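The file-gathering half of do_GETALL might look something like the following sketch. The templating is left out, and list_entries, demo_dir, and the sort-by-file-name ordering are assumptions, since the article does not show how entries are ordered.

```python
import glob
import os
import tempfile


def list_entries(blogdir):
    """Return (title, text) pairs for every .txt file, sorted by file name."""
    entries = []
    for path in sorted(glob.glob(os.path.join(blogdir, '*.txt'))):
        name = os.path.splitext(os.path.basename(path))[0]
        with open(path) as f:
            # Titles are recovered from file names, as in the Entry class.
            entries.append((name.replace('-', ' '), f.read()))
    return entries


# Build a throwaway blog directory with two entries for the demo.
demo_dir = tempfile.mkdtemp()
for name, text in [('second-post', 'two'), ('first-post', 'one')]:
    with open(os.path.join(demo_dir, name + '.txt'), 'w') as f:
        f.write(text)
```

An empty directory simply yields an empty list, which fits the method's always-200 behavior.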


If I were creating a typical GET/POST web application, testing would be straightforward: use a browser. Because I've used REST semantics, I need to use another tool--in this case, curl--to test all my application's features. The first step is to start up the blog using python, and then:

  • curl -v -X PUT http://localhost:8080/test1 -d @- will add an entry with the title "test1" (-d @- takes input from STDIN; hit Ctrl+D to stop).
  • The same command again, curl -v -X PUT http://localhost:8080/test1 -d @-, will update that entry. (Notice that the 201 return code should change to a 200.)
  • curl -v http://localhost:8080/ will return a list of all entries.
  • curl -v -X DELETE http://localhost:8080/test1 will delete the entry previously created.

I've included three template types: .xhtml for HTML viewing, .xml for simple XML output, and .atom to produce an Atom feed. Test these different templates by calling:

  • curl -v http://localhost:8080/?type=xml
  • curl -v http://localhost:8080/?type=xhtml
  • curl -v http://localhost:8080/?type=atom
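The get_type() method is not shown in the article, but a lookup along these lines would produce the (extension, content type) pair that do_GET expects. The TYPES table, its content-type values, and the fallback to xhtml are all assumptions made for this sketch.

```python
from urllib.parse import parse_qs

# Hypothetical mapping; the article does not show the real table.
TYPES = {
    'xhtml': 'application/xhtml+xml',
    'xml': 'application/xml',
    'atom': 'application/atom+xml',
}


def get_type(environ, default='xhtml'):
    """Return (extension, content_type) chosen by the ?type= query parameter."""
    params = parse_qs(environ.get('QUERY_STRING', ''))
    ext = params.get('type', [default])[0]
    if ext not in TYPES:
        ext = default   # unknown types fall back rather than fail
    return ext, TYPES[ext]
```

Restricting the extension to a known set also keeps arbitrary query values out of the template file name.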

Middleware and Utilities

So far I've only demonstrated how to set up a basic, stateless application by extending the foundations provided by WSGI. If you're thinking about larger-scale web application development, the recommended approach is undoubtedly to choose a suitable framework. This is not to say that developing such a webapp is impossible using basic WSGI, but you'll need to add (by hand) a lot of the technology that you get for free with a framework--either by writing your own, or plugging in third-party middleware.

The WSGI perspective on middleware is an important part of the specification. Adding middleware involves wrapping layers of utility code around a base app to provide additional functionality; the PEP calls this a middleware stack. For example, to provide authentication facilities, you might wrap your application with BasicAuthenticationMiddleware; to compress responses, you might wrap it with another middleware component called CompressionMiddleware; and so on.
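The wrapping idea can be sketched with a deliberately trivial component. HeaderMiddleware below is a hypothetical example, not something from the PEP or from Paste: it forwards the request to the wrapped application and appends one header on the way out.

```python
class HeaderMiddleware:
    """Hypothetical middleware that adds one response header."""

    def __init__(self, app, name, value):
        self.app = app
        self.header = (name, value)

    def __call__(self, environ, start_response):
        def wrapped_start(status, headers, exc_info=None):
            # Pass the call through with the extra header appended.
            return start_response(status, headers + [self.header], exc_info)
        return self.app(environ, wrapped_start)


def base_app(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [b'hello']


# Two layers stacked around the same base application.
stacked = HeaderMiddleware(HeaderMiddleware(base_app, 'X-Inner', '1'), 'X-Outer', '2')

seen = {}


def record(status, headers, exc_info=None):
    seen['headers'] = headers


body = b''.join(stacked({}, record))
```

Because each layer is itself a WSGI application, the server cannot tell a stack of middleware from a single app.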

The Python Paste project provides WSGI middleware and various other useful utilities. As an example of how powerful the concept of middleware is, consider the use of Paste's SessionMiddleware (see the Paste documentation for more details):

from paste.session import SessionMiddleware

class myapp2:

    def __init__(self, environ, start_response):
        self.environ = environ
        self.start = start_response

    def __iter__(self):
        session = self.environ['paste.session.factory']()
        if 'count' in session:
            count = session['count']
        else:
            # first visit: start counting at 1
            count = 1
        session['count'] = count + 1

        self.start('200 OK', [('Content-type','text/plain')])
        yield 'You have been here %d times!\n' % count

app2 = SessionMiddleware(myapp2)

In this example, SessionMiddleware wraps myapp2. When a request comes in, SessionMiddleware adds the session factory to the environ with the key paste.session.factory, and when invoked in the first line of the __iter__ method, the session is returned as a simple dict. A stack of middleware components added to a basic WSGI application means you can have the benefits provided by many of the frameworks, without necessarily having to constrain yourself to a framework.
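The environ-injection pattern itself needs very little code. The toy below is emphatically not Paste's implementation: it keeps a single shared dict instead of per-client, cookie-backed sessions, and the toy.session.factory key is invented for this sketch. It only shows how a middleware layer can hand a factory to the application through environ.

```python
class ToySessionMiddleware:
    """Toy stand-in, NOT Paste's implementation: one shared dict, no cookies."""

    def __init__(self, app):
        self.app = app
        self.store = {}   # a real session layer would key this per client

    def __call__(self, environ, start_response):
        # Expose a factory under an environ key for the wrapped app to call.
        environ['toy.session.factory'] = lambda: self.store
        return self.app(environ, start_response)


def counter_app(environ, start_response):
    session = environ['toy.session.factory']()
    count = session.get('count', 1)
    session['count'] = count + 1
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [('You have been here %d times!\n' % count).encode('ascii')]


app = ToySessionMiddleware(counter_app)
first = b''.join(app({}, lambda status, headers: None))
second = b''.join(app({}, lambda status, headers: None))
```

The application never imports the middleware; its only coupling is the agreed environ key.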
