Headers are accessed by treating the
Message object like a dictionary. The
Message object preserves the case of header names, but headers are retrieved case-insensitively. Some usage examples:
    print 'Number of headers:', len(msg)

    # Retrieve the message ID
    msg_id = msg['Message-ID']
    print msg_id

    # Equivalent: retrieval is case-insensitive.
    msg_id = msg['message-id']
    print msg_id

    # Retrieve subject header, with a default value if
    # the header isn't present.
    subject = msg.get('Subject', 'No subject provided')

    # Retrieve Cc header, returning None if it's not present.
    cc = msg.get('Cc')

    # Check if a header is present
    if 'X-Virus-Scan' not in msg:
        print 'Doing virus scan...'

    # Add header value
    msg['X-Virus-Scan'] = 'OK'
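The case-preservation point can be checked with a small self-contained sketch. The message text and addresses here are made up for illustration, and the code avoids print so that it should run on both older and current Python versions:

```python
import email

# Hypothetical message text; note the unusual capitalisation of the
# first header name.
raw = ('MESSAGE-id: <1@example.com>\n'
       'Subject: Hello\n'
       '\n'
       'Body text\n')
msg = email.message_from_string(raw)

# keys() reports header names exactly as they were written...
header_names = list(msg.keys())

# ...while lookups and membership tests ignore case.
same_value = (msg['message-id'] == msg['Message-ID'])
has_subject = ('subject' in msg)

# get() returns None (or a supplied default) for a missing header.
missing_cc = msg.get('Cc')
```

Here header_names keeps the original spelling 'MESSAGE-id', even though the header can be fetched under any capitalisation.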
There can be multiple header lines using the same field name; the "Received" header is the most common example. When there are multiple header lines, the
get() method will return a single arbitrarily chosen line. The
get_all() method returns a list of all header values.
Assigning a header with msg[name] = value never overwrites or deletes existing lines; it will always add a new header line.
Here are some examples using the Received header:
    # Get list of received headers; the second argument is the
    # value returned when no such header exists.
    recv_trail = msg.get_all('Received', [])
    for line in recv_trail:
        print line

    # Add a new received line; this line will come
    # last when the message headers are converted to
    # a string.
    msg['Received'] = 'from host1 by host2'

    # Delete all received headers
    del msg['Received']

    # Replace the Subject header
    msg.replace_header('Subject', '***SPAM*** ' + subject)
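The same operations can be tried on a message built in memory. This is only a sketch: the host names and message text are invented, and the message is parsed with email.message_from_string() rather than read from a mailbox.

```python
import email

# Hypothetical message with two Received lines.
raw = ('Received: from host2 by host3\n'
       'Received: from host1 by host2\n'
       'Subject: Hello\n'
       '\n'
       'Body\n')
msg = email.message_from_string(raw)

# get_all() returns every value; the second argument is the fallback
# used when the header is absent.
trail = msg.get_all('Received', [])

# Item assignment appends a new line instead of replacing the old ones.
msg['Received'] = 'from host0 by host1'
count_after_add = len(msg.get_all('Received', []))

# del removes every line with that name.
del msg['Received']
remaining = msg.get_all('Received', [])

# replace_header() changes the existing header in place; it raises
# KeyError if the header does not exist.
msg.replace_header('Subject', '***SPAM*** ' + msg['Subject'])
new_subject = msg['Subject']
```

After the assignment there are three Received lines rather than two, which is why replace_header() is the method to reach for when a single-valued header such as Subject needs to change.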
See the email package's documentation for a full list of methods and attributes.
Example: A Mailbox to RSS Converter
To put everything together, the following script uses the
mailbox module and Andrew Dalke's PyRSS2Gen to generate an RSS feed from a mailbox.
    #!/usr/bin/env python2.5

    import sys, mailbox, datetime
    from email import utils
    import PyRSS2Gen

    if len(sys.argv) == 1:
        print 'Usage: %s <maildir-1> <maildir-2> ...' % sys.argv[0]
        sys.exit(1)

    # Create RSS feed
    feed = PyRSS2Gen.RSS2(title='Mailbox feed',
                          link='http://maildir-feed.example.com',
                          description=('Contains mailboxes: ' +
                                       ' '.join(sys.argv[1:])))

    # Loop over specified mailboxes
    for filename in sys.argv[1:]:
        mbox = mailbox.Maildir(filename)
        for msg in mbox:
            subject = msg.get('Subject', "")
            guid_hdr = msg['Message-ID']

            # Parse the date, turning it into a datetime object.
            date_hdr = msg.get('Date')
            if date_hdr is None:
                date = datetime.datetime.now()
            else:
                (y, month, d, h, min, sec,
                 _, _, _, tzoffset) = utils.parsedate_tz(date_hdr)
                date = datetime.datetime(y, month, d, h, min, sec)

            # Create RSS item and add it to the feed
            item = PyRSS2Gen.RSSItem(pubDate=date,
                                     title=subject,
                                     guid=PyRSS2Gen.Guid(guid_hdr,
                                                         isPermaLink=False))
            feed.items.append(item)

    # Write generated RSS to stdout
    feed.write_xml(sys.stdout, encoding='utf-8')
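The script unpacks tzoffset but never uses it, and utils.parsedate_tz() returns None for a date it cannot parse, which would make the tuple unpacking fail. One possible refinement, sketched here under the assumption that the feed should carry UTC times, is a small helper that applies the offset with utils.mktime_tz(). The name header_date_to_utc is hypothetical and not part of any library.

```python
import datetime
from email import utils

def header_date_to_utc(date_hdr):
    """Return a naive datetime in UTC for a Date header value,
    or None if the header is missing or cannot be parsed.

    (Hypothetical helper, shown only as a sketch.)
    """
    if not date_hdr:
        return None
    parsed = utils.parsedate_tz(date_hdr)
    if parsed is None:
        return None
    # mktime_tz() applies the time zone offset and returns seconds
    # since the epoch in UTC.
    timestamp = utils.mktime_tz(parsed)
    return datetime.datetime(1970, 1, 1) + datetime.timedelta(seconds=timestamp)

# A message sent at 19:12:08 in a UTC-5 zone is 00:12:08 UTC the next day.
example = header_date_to_utc('Mon, 20 Nov 1995 19:12:08 -0500')
```

In the script, a None result could fall back to datetime.datetime.now(), just as a missing Date header does.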
The examples so far have only examined mailboxes without changing their contents. Let's look at how to add, change, and remove messages from a mailbox.
Before making any alteration to a mailbox, always call the mailbox's
lock() method to acquire a lock on the mailbox. When the changes are complete, call the
flush() method to write changes to disk and the
unlock() method to release the lock on the mailbox.
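The lock, modify, flush, unlock sequence might be wrapped up like this. It is a minimal sketch: append_message is a hypothetical helper name, and an mbox file is assumed only to make the example concrete. The try/finally block ensures the lock is released even if adding the message fails.

```python
import mailbox

def append_message(path, message_text):
    """Add one message to the mbox file at `path` and return its key.

    (Hypothetical helper, shown only as a sketch.)
    """
    box = mailbox.mbox(path)
    box.lock()                  # acquire the lock before any change
    try:
        key = box.add(message_text)
        box.flush()             # write pending changes to disk
    finally:
        box.unlock()            # always release the lock
        box.close()
    return key
```

A call such as append_message('/path/to/inbox', 'Subject: hi\n\nbody\n') would then append a single message; the path here is a placeholder.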
Different mailbox classes will make changes to the underlying disk files at different times. For the single-file mailbox formats, new messages are added immediately but deleted messages aren't removed until you call
flush(). On the other hand, directory-based formats, such as Maildir and MH, make all their changes immediately and the
flush() method doesn't actually do anything. Thanks to Maildir's lock-free design, its
lock() and unlock() methods also don't have to do anything.
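The deferred deletion in single-file formats can be observed directly. The following sketch uses a throwaway mbox file in a temporary directory; demo_deferred_delete and read_file are hypothetical names, and the message text is invented. It removes a message and reads the file's contents before and after calling flush().

```python
import mailbox
import os
import tempfile

def read_file(path):
    """Return the full text of a file (small helper for the demo)."""
    f = open(path)
    try:
        return f.read()
    finally:
        f.close()

def demo_deferred_delete():
    """Return (present_before_flush, present_after_flush) for a
    message that has been removed from an mbox mailbox."""
    path = os.path.join(tempfile.mkdtemp(), 'demo.mbox')
    box = mailbox.mbox(path)
    box.lock()
    try:
        key = box.add('Subject: first\n\nbody one\n')
        box.add('Subject: second\n\nbody two\n')
        box.flush()

        box.remove(key)
        before = read_file(path)    # removal not yet written to disk
        box.flush()
        after = read_file(path)     # file has now been rewritten
    finally:
        box.unlock()
        box.close()
    return ('Subject: first' in before, 'Subject: first' in after)
```

With the standard library's mbox implementation, the removed message should still be in the file before flush() and gone afterwards.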
It's good practice to always call these methods, even if some or all of them are no-ops. Someone might come along and modify your code, or pass in an
mbox object where you're expecting a
Maildir object. People are very protective of their e-mail, so you should always be careful to avoid duplicating or, worse, deleting messages.