Flying in the Face of Tradition
To extend these simple examples into something a little more realistic, I'll implement an extremely basic blogging application along RESTful lines: using HTTP GET to retrieve a single entry or a list of entries, PUT to add or update an entry, and DELETE to remove one.
The first step is to extend the BaseWSGI class slightly to handle GET requests in one of two ways: GET / should return a list of all entries, while GET [name] should return a named entry. To provide this, I've added code to the __iter__ method so that when the path requested is /, the text ALL gets appended to the method name (meaning a subclass now needs to implement both do_GET and do_GETALL):

    if request_method == 'GET' and self.environ['PATH_INFO'] == '/':
        method = method + 'ALL'

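To show where that check could sit, here is a minimal Python 3 sketch of a dispatching base class. It is a hypothetical stand-in, not the article's BaseWSGI: the `do_` prefix lookup, the 501 fallback, the `Demo` subclass, and the `call` helper are all assumptions made for illustration.

```python
class BaseWSGI:
    """Hypothetical stand-in for the article's BaseWSGI class."""

    def __init__(self, environ, start_response):
        self.environ = environ
        self.start = start_response

    def __iter__(self):
        request_method = self.environ['REQUEST_METHOD']
        method = 'do_' + request_method
        # A GET on the root path is routed to do_GETALL instead of do_GET
        if request_method == 'GET' and self.environ['PATH_INFO'] == '/':
            method = method + 'ALL'
        handler = getattr(self, method, None)
        if handler is None:
            self.start('501 Not Implemented', [('Content-type', 'text/plain')])
            yield b'Not implemented'
        else:
            # WSGI response bodies are bytes in Python 3
            yield handler().encode('utf-8')


class Demo(BaseWSGI):
    def do_GET(self):
        self.start('200 OK', [('Content-type', 'text/plain')])
        return 'one entry: ' + self.environ['PATH_INFO'][1:]

    def do_GETALL(self):
        self.start('200 OK', [('Content-type', 'text/plain')])
        return 'all entries'


def call(app, method, path):
    """Invoke a WSGI app directly and return (status, body)."""
    seen = {}

    def start_response(status, headers):
        seen['status'] = status

    environ = {'REQUEST_METHOD': method, 'PATH_INFO': path}
    body = b''.join(app(environ, start_response))
    return seen['status'], body
```

Calling `call(Demo, 'GET', '/')` reaches `do_GETALL`, while `call(Demo, 'GET', '/test1')` reaches `do_GET`.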
At this point, I've decided to store the weblog entries as plain-text files, with nothing in the way of metadata for ordering or filtering. Obviously, in a real application you'd want to be able to search for entries based on particular criteria--perhaps by exposing more meaningful or useful resource URLs (for example, something like /2006/08/my-entry-name)--but for the purposes of this basic application, file-system storage will suffice. Thus, data access for a blog entry is as simple as:

    import os

    class Entry:
        def __init__(self, path, filename, load=True):
            self.filename = os.path.join(path, filename.replace('+', '-')) + '.txt'
            self.title = filename.replace('-', ' ')
            # Default to empty text so callers can safely test entry.text
            self.text = ''
            if load and os.path.exists(self.filename):
                with open(self.filename) as f:
                    self.text = f.read()

        def save(self):
            with open(self.filename, 'w') as f:
                f.write(self.text)

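A quick round trip shows how the file name and title relate. This is a Python 3 sketch under a few assumptions: it restates the class with `text` defaulting to an empty string (so a missing entry can be tested with `if entry.text`), it uses a throwaway temporary directory as the blog directory, and `round_trip` is a hypothetical helper.

```python
import os
import tempfile


class Entry:
    """Python 3 restatement of the Entry class, with text defaulting to ''."""

    def __init__(self, path, filename, load=True):
        self.filename = os.path.join(path, filename.replace('+', '-')) + '.txt'
        self.title = filename.replace('-', ' ')
        self.text = ''
        if load and os.path.exists(self.filename):
            with open(self.filename, encoding='utf-8') as f:
                self.text = f.read()

    def save(self):
        with open(self.filename, 'w', encoding='utf-8') as f:
            f.write(self.text)


def round_trip(blogdir):
    """Save an entry, then load it again through a fresh Entry object."""
    first = Entry(blogdir, 'my-first-post')
    first.text = 'Hello, WSGI!'
    first.save()
    return Entry(blogdir, 'my-first-post')


with tempfile.TemporaryDirectory() as blogdir:   # throwaway blog directory
    loaded = round_trip(blogdir)
    missing = Entry(blogdir, 'no-such-post')     # no file on disk for this one
```

Here `loaded.title` is the name with hyphens turned into spaces, and `missing.text` stays empty because no file exists for it.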
Presenting entries needs some kind of templating. Python has an abundance of choices, such as Cheetah, Kid, and Myghty, not to mention numerous others bundled with the various frameworks. To keep things simple, I'm using a homegrown templating engine that simply injects dynamic content based on the IDs in an XML document. (Given the constraint that all IDs must be unique, this is probably the simplest approach to templating XML, at least from a usage perspective.) Thus, the do_GET method of my application becomes:

    def do_GET(self):
        pathinfo = self.environ['PATH_INFO'][1:]
        entry = Entry(blogdir, pathinfo)
        if entry.text:
            (ext, content_type) = self.get_type()
            response_headers = [('Content-type', content_type)]
            if self.status_override:
                status = self.status_override
            else:
                status = '200 OK'
            self.start(status, response_headers)
            tmp = self.engine.load('blog-single.' + ext)
            tmp['entry:title'] = entry.title
            tmp['entry:text'] = entry.text
            tmp['entry:link'] = template3.Element(
                None,
                href='http://localhost:8080/%s?type=%s' % (entry.title.replace(' ', '-'), ext))
            return str(tmp)
        else:
            self.start('404 Not Found', [('Content-type', 'text/html')])
            return '%s not found' % pathinfo

Using the PATH_INFO variable provided by WSGI, I load an entry, then check to see if the text exists; if not, the blog file was not present, so I return a standard 404 Not Found. If the entry loaded successfully, the get_type() method returns the extension to use for the template (and the content type) based on a type parameter passed in the URL. I create the response headers (just content type, for the moment), and start the response process by calling self.start. At this point I've also checked for the presence of status_override, which is a field used when another method calls do_GET (see the do_PUT method later). Finally, I set the content in the template using the IDs: entry:title, entry:text, and entry:link. (I'll return to the do_GETALL method shortly.)
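The get_type() method itself is not listed here. A rough Python 3 sketch of what such a helper might do follows, written as a standalone function. The content types in the mapping and the choice of xhtml as the fallback are assumptions; only the three type names (xhtml, xml, atom) come from the application's templates.

```python
from urllib.parse import parse_qs

# Hypothetical mapping from the "type" URL parameter to (extension, content type)
TYPES = {
    'xhtml': ('xhtml', 'text/html'),
    'xml': ('xml', 'text/xml'),
    'atom': ('atom', 'application/atom+xml'),
}


def get_type(environ, default='xhtml'):
    """Return (extension, content_type) for the request's ?type= parameter."""
    query = parse_qs(environ.get('QUERY_STRING', ''))
    requested = query.get('type', [default])[0]
    # Unknown or missing types fall back to the default
    return TYPES.get(requested, TYPES[default])
```

With this sketch, a request for /test1?type=atom would select the atom template and an Atom content type, while a request without a type parameter would get the default.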
The most important method from the WSGI perspective is start. It takes a status string (the response code and message), as well as the response headers as a list of tuples. I assigned it from the start_response positional parameter in BaseWSGI.
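To see exactly what the start callable receives, a WSGI application can be called directly with a stub in place of the server's start_response. The sketch below (Python 3) uses the standard library's wsgiref.util.setup_testing_defaults to fill in the environ; `simple_app` and `run` are hypothetical names for illustration.

```python
from wsgiref.util import setup_testing_defaults


def simple_app(environ, start_response):
    """Minimal WSGI app: a status string plus a list of (name, value) headers."""
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [b'Hello from ' + environ['PATH_INFO'].encode('utf-8')]


def run(app, path='/'):
    """Call a WSGI app without a server and capture what start_response got."""
    environ = {'PATH_INFO': path}
    setup_testing_defaults(environ)      # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body
```

For example, `run(simple_app, '/test1')` returns the status string, the header list, and the response body without opening a socket.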
Adding an Entry
Creating a blog entry calls the do_PUT method, which performs several steps:
- Check for a non-empty pathinfo and for a content-length greater than 0.
- Create an Entry object, using the pathinfo.
- If the Entry does not contain text, then this is a new blog post, so set the status override variable with "201 Created."
- Load the content from the request using the wsgi.input stream.
- Finally, save the entry, then call the do_GET method to return something meaningful to the caller.

    def do_PUT(self):
        pathinfo = self.environ['PATH_INFO'][1:]
        if pathinfo == '':
            self.start('400 Bad Request', [('Content-type', 'text/html')])
            return 'Missing path name'
        elif self.environ.get('CONTENT_LENGTH', '') in ('', '0'):
            # A missing, empty, or zero CONTENT_LENGTH means there is no body to store
            self.start('411 Length Required', [('Content-type', 'text/html')])
            return 'Missing content'
        entry = Entry(blogdir, pathinfo)
        if not entry.text:
            # No existing text: this PUT creates a new entry
            self.status_override = '201 Created'
        entry.text = self.environ['wsgi.input'].read(int(self.environ['CONTENT_LENGTH']))
        entry.save()
        return self.do_GET()

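The body-reading step can be tried in isolation by faking the environ. This Python 3 sketch assumes an in-memory io.BytesIO standing in for wsgi.input; `read_body` and `put_status` are hypothetical helpers that mirror the checks in do_PUT, not part of the application.

```python
import io


def read_body(environ):
    """Return the request body as bytes, or None if no length was supplied."""
    length = environ.get('CONTENT_LENGTH', '')
    if length in ('', '0'):
        return None                      # caller would answer 411 Length Required
    # Read exactly CONTENT_LENGTH bytes; wsgi.input may not signal end-of-file
    return environ['wsgi.input'].read(int(length))


def put_status(entry_exists):
    """201 for a newly created entry, 200 for an update of an existing one."""
    return '200 OK' if entry_exists else '201 Created'


payload = b'My first post'
environ = {'CONTENT_LENGTH': str(len(payload)), 'wsgi.input': io.BytesIO(payload)}
body = read_body(environ)
```

Reading a bounded number of bytes matters because a WSGI server is not required to return end-of-file on wsgi.input once the request body is exhausted.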
For a DELETE, I just do the basics: check to see if the entry exists, delete it, and return a 204 response:

    def do_DELETE(self):
        pathinfo = self.environ['PATH_INFO'][1:]
        blogfile = os.path.join(blogdir, pathinfo.replace('+', '-')) + '.txt'
        if os.path.exists(blogfile):
            os.remove(blogfile)
            # Note: the standard reason phrase for 204 is "No Content",
            # and a 204 response is defined as carrying no body
            self.start('204 Deleted', [ ])
            return 'Deleted %s' % pathinfo
        else:
            self.start('404 Not Found', [('Content-type', 'text/html')])
            return '%s not found' % pathinfo

The do_GETALL method, which is the only one of the subclass methods that doesn't actually correspond to an HTTP verb, is also the only one that differs from the validation+response cycle established by the other methods. do_GETALL will always return 200 OK, and will read in all .txt files in the specified blog directory, reusing the blog-single template (used in the do_GET method). The main differences between this method and do_GET revolve around templating (and are not particularly relevant to WSGI).
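The file-reading half of do_GETALL might look roughly like the Python 3 sketch below. The templating step is left out, and `list_entries`, the sort order, and the sample files are assumptions made for illustration.

```python
import os
import tempfile


def list_entries(blogdir):
    """Return (title, text) pairs for every .txt file in blogdir, sorted by name."""
    entries = []
    for filename in sorted(os.listdir(blogdir)):
        if not filename.endswith('.txt'):
            continue                     # ignore anything that is not an entry
        name = os.path.splitext(filename)[0]
        with open(os.path.join(blogdir, filename), encoding='utf-8') as f:
            entries.append((name.replace('-', ' '), f.read()))
    return entries


# Throwaway directory with two entries and one file that should be ignored
with tempfile.TemporaryDirectory() as blogdir:
    samples = [('b-post.txt', 'second'), ('a-post.txt', 'first'), ('notes.md', 'x')]
    for filename, text in samples:
        with open(os.path.join(blogdir, filename), 'w', encoding='utf-8') as f:
            f.write(text)
    entries = list_entries(blogdir)
```

Each pair could then be fed to the template in the same way do_GET fills in a single entry.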
If I were creating a typical GET/POST web application, testing would be straightforward: use a browser. Because I've used REST semantics, I need to use another tool--in this case, Curl--to test all my application's features. The first step is to start up the blog using python blog.py, and then:
- curl -v -X PUT http://localhost:8080/test1 -d @- will add an entry with the title "test1" (-d @- takes input from STDIN; hit Ctrl+D to stop).
- The same thing again: curl -v -X PUT http://localhost:8080/test1 -d @- will update that entry. (Notice that the 201 return code should change to a 200.)
- curl -v http://localhost:8080/ will return a list of all entries.
- curl -v -X DELETE http://localhost:8080/test1 will delete the entry previously created.
I've included three template types: .xhtml for HTML viewing, .xml for simple XML output, and .atom to produce an Atom feed. Test these different templates by calling:
curl -v http://localhost:8080/?type=xml
curl -v http://localhost:8080/?type=xhtml
curl -v http://localhost:8080/?type=atom
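The same requests can also be issued from Python. The sketch below (Python 3, standard library only) builds requests equivalent to the curl calls but does not send them, so it runs without the blog being up; `build_request` is a hypothetical helper, and the commented-out lines show how a request could be sent.

```python
import urllib.request

BASE = 'http://localhost:8080'   # address used in the curl examples


def build_request(method, name='', body=None, type_=None):
    """Build (but do not send) a request equivalent to one of the curl calls."""
    url = '%s/%s' % (BASE, name)
    if type_ is not None:
        url += '?type=' + type_
    data = body.encode('utf-8') if body is not None else None
    return urllib.request.Request(url, data=data, method=method)


put = build_request('PUT', 'test1', body='My first post')
listing = build_request('GET', type_='atom')
delete = build_request('DELETE', 'test1')

# With the blog running, each request could be sent like this:
# with urllib.request.urlopen(put) as response:
#     print(response.status, response.read())
```

Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so a DELETE of a missing entry would need a try/except around the call.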
Middleware and Utilities
So far I've only demonstrated how to set up a basic, stateless application by extending the foundations provided by WSGI. If you're thinking about larger-scale web application development, the recommended approach is undoubtedly to choose a suitable framework. This is not to say that developing such a webapp is impossible using basic WSGI, but you'll need to add (by hand) a lot of the technology that you get for free with a framework--either by writing your own, or plugging in third-party middleware.
The WSGI perspective on middleware is an important part of the specification. Adding middleware involves wrapping layers of utility code around a base app to provide additional functionality; the PEP calls this a middleware stack. For example, to provide authentication facilities, you might wrap your application with BasicAuthenticationMiddleware; to compress responses, you might wrap it with another middleware component called CompressionMiddleware; and so on.
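As a minimal illustration of the wrapping idea (Python 3), the sketch below defines a middleware that appends one response header and stacks two copies of it around a tiny app. `HeaderMiddleware`, `hello_app`, and the header names are hypothetical and exist only for this example.

```python
class HeaderMiddleware:
    """Illustrative middleware: adds one response header to the app it wraps."""

    def __init__(self, app, name, value):
        self.app = app
        self.header = (name, value)

    def __call__(self, environ, start_response):
        def patched_start(status, headers, exc_info=None):
            # Pass the call through, with the extra header appended
            return start_response(status, headers + [self.header], exc_info)
        return self.app(environ, patched_start)


def hello_app(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [b'hello']


# Layers nest: the outermost wrapper sees the request first
stacked = HeaderMiddleware(HeaderMiddleware(hello_app, 'X-Inner', '1'), 'X-Outer', '2')

captured = {}


def record(status, headers, exc_info=None):
    captured['status'], captured['headers'] = status, headers


body = b''.join(stacked({}, record))
```

Because each layer is itself a WSGI callable, the stacked object can be handed to any WSGI server exactly as the bare application would be.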
The Python Paste project provides WSGI middleware and various other useful utilities. As an example of how powerful the concept of middleware is, consider the use of Paste's SessionMiddleware (see test3.py for more details):

    from paste.session import SessionMiddleware

    class myapp2:
        def __init__(self, environ, start_response):
            self.environ = environ
            self.start = start_response

        def __iter__(self):
            session = self.environ['paste.session.factory']()
            if 'count' in session:
                count = session['count']
            else:
                count = 1
            session['count'] = count + 1
            self.start('200 OK', [('Content-type','text/plain')])
            yield 'You have been here %d times!\n' % count

    app2 = SessionMiddleware(myapp2)

In this example, SessionMiddleware wraps myapp2. When a request comes in, SessionMiddleware adds the session factory to the environ with the key paste.session.factory, and when the factory is invoked in the first line of the __iter__ method, it returns the session as a simple dict. A stack of middleware components added to a basic WSGI application means you can have the benefits provided by many of the frameworks, without necessarily having to constrain yourself to a framework.
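To make the environ-injection pattern concrete, here is a deliberately simplified Python 3 sketch. It is not how Paste implements sessions: it keeps a single in-memory dict with no cookies, so it cannot tell clients apart, and the class name, the environ key, and the `visit` helper are all hypothetical.

```python
class CounterSessionMiddleware:
    """Toy stand-in for a session middleware (not Paste's implementation).

    It keeps one in-memory dict and exposes it through a factory placed in
    environ, so every client shares the same "session".
    """

    KEY = 'demo.session.factory'   # hypothetical environ key

    def __init__(self, app):
        self.app = app
        self.session = {}

    def __call__(self, environ, start_response):
        # Inject a factory; the wrapped app calls it to obtain the session dict
        environ[self.KEY] = lambda: self.session
        return self.app(environ, start_response)


def counting_app(environ, start_response):
    session = environ[CounterSessionMiddleware.KEY]()
    count = session.get('count', 1)
    session['count'] = count + 1
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [('You have been here %d times!\n' % count).encode('utf-8')]


app = CounterSessionMiddleware(counting_app)


def visit():
    """Simulate one request and return the response body."""
    return b''.join(app({}, lambda status, headers, exc_info=None: None))
```

Two successive calls to `visit()` report visit counts of 1 and 2, because the state lives in the middleware rather than in the wrapped application.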