oreilly.comSafari Books Online.Conferences.


Building a Mailing List

by Glenn Fleishman

After programming for about seven hours over a weekend in January, I turned to my wife and said, "I just saved us $10,000 this year." My wife is used to such pronouncements. My income as a freelance writer and entrepreneur is lumpy and comes in strange forms. Without batting an eyelash, she said, "What is it this time?"

Those seven hours of programming, supplemented over the last four months with five other hours, allowed me to develop a rich mailing list system that replaced an outsourced service charging $850 per month. Needless to say, we celebrated: $10,000 in savings doesn't come every day.

I started my first mailing list in 1991, when the Internet was largely an academic province, and I had to rely on the information technology department of the university where I worked to handle the LISTSERV software. The list didn't last long, but it grew to over 400 subscribers worldwide talking about how to solve early desktop publishing problems.

In the intervening 13 years, I've run many lists — almost all of them moderated — and the same issues of list administration that I faced in my first list have recurred each time.

  • The Angry Subscriber: A person who signed up for the list wants off and can't read or follow instructions. He becomes angry and typically abusive and can wind up these days putting you on a blacklist of spammers. (There's also the Mildly Irritated Subscriber or Somewhat Sympathetic Subscriber who wants off or on a list, but doesn't flame you about it, or understands that the list failed or that he can't figure out the instructions.)
  • Bouncy Bouncy: When accounts expire or mail servers stop liking your messages, bounce messages appear. Servers have little standardization in their reporting of delivery failures, often requiring manual intervention to figure out the appropriate user account or accounts to disable.
  • Jokers: People who sign up other users without their permission in single opt-in lists. You must defend against these pranksters or stalkers.

These three categories of user email consumed from two to five hours of my time in the mid-1990s for a list that had 7,000 subscribers on it and much less time for more recent lists managed via Yahoo Groups and other services that automated most bounce processing.

Add to these old-time problems the newer and increasing issues of spam, both incoming and outgoing. Incoming spam can end up on your list if you don't have a mechanism to prevent it; outgoing messages classified as spam can halt your ability to send any mail at all from your mail server.

Last year, I decided to start a subscription, announcement-only list for the posts I and a colleague make to my commercial web log, Wi-Fi Networking News, which runs on Movable Type. I checked the free and paid list services and finally settled on a company that resold Lyris. They would allow me to send out ad-free individual messages and digests and have most of the control and reliability that I needed.

The drawback was that previously mentioned $850 per month. With the ramp-up in subscribers I predicted and an advertising deal already in place that will eventually extend to the mailing list, I suspected this cost would be minor compared to the revenue per subscriber.

Making minor mistakes with a mailing list's behavior can result in your mail server becoming blacklisted, and I thought that an arm's length approach would reduce the chance of my servers running afoul of others. I also didn't want to deal with the Angry Subscriber, Bouncy Bouncy, or the Joker, and in examining the offering I signed up for I felt confident that they would absorb most of the time I'd wasted in running previous lists.

Ultimately, even with the excellent support and services they offered, my commercial host didn't have the flexibility I wanted. For instance, I wanted to have a different template for the digest version of the list and the individual email version, which was impossible. I also wanted to have the default subscription be in digest mode, which my experience shows the majority of users want. Lyris' current version is incapable of offering both double opt-in and a digest default option, making that intractable and irritating users who couldn't follow the precise email-oriented instructions to switch from individual message to digest.

I looked at the code I'd written to extract Movable Type posts and send them to the Lyris-based server and realized that with a little work, I could combine my existing knowledge of sendmail, Apache, and Perl with the MySQL experience I'd gained since last summer when I first started fully understanding how to use it.

My main concern was, of course, Angry Bouncy Jokers, and whether they would drag me down. It turns out, I didn't need to worry about this at all.

Conceptualizing the Subscription System

I've run moderated and announcement lists for so long that I know what friction a good subscription process can remove:

  • Double opt-in. It's absolutely critical in the time of explosive spam that you confirm subscriptions by sending email to the subscribing address and requiring that the user who receives the email take an affirmative action. It provides legal and moral security for you, too.
  • Web and email modality. All operations should be as easy to perform via a web page as through email. Operations can start in either mode and end in either. A subscriber should be able to enter an email address in a web form, receive a confirmation email, and reply via email; or send a subscription email, receive confirmation email, and click a link to confirm; or any combination of those.
  • Default to digest. Most subscribers want digests; most complaints on every list I've run have come from users signed up for individual messages who feel overwhelmed and think they're being "spammed" — as they inevitably describe it. This typically happens in the first day or so after subscribing to the individual email version of a list.
  • Make it easy to unsubscribe. Avoid flaming user meltdowns through frequent, obvious unsubscribe instructions in every message.

The subscription interface I built conforms to these guidelines and has resulted in a single, manual unsubscribe request across 1,800 subscriptions and 200 unsubscribes — and that was due to a bug that failed to set an unsubscribe flag properly. No flames, no manual intervention, no complaints.

Planning Distribution

Related Reading

MySQL Cookbook
By Paul DuBois

The goal of this process is to make it possible for people to choose to receive individual posts made to the web log as email messages. They can choose to receive each post as it's made as a separate email message or, more commonly, as a digest containing all posts sent each night.

Because I use Movable Type to manage my system, I could have used their API to develop a plugin that would activate on new posts to trigger distribution. I'm dubious about having to maintain that plugin, though, so I chose a very low-tech high-tech solution: a crontab entry.

Every five minutes, a cron job queries the MySQL database to find posts made in the previous few minutes. If it finds a new post, it queries the list database for individual message subscribers and sends them each a unique email with a tracking code. (On my MySQL box, that query requires well under 0.001 seconds to run.) Each night before midnight, a cron job queries for all posts sent out in the previous day, generates a header containing the list of posts titles, concentrates, and distributes.

The database schema has three tables. One table contains the email subscriber information, with flags and fields for initial subscription, confirmation, suspension, and unsubscribe status, among others. A second table contains an entry for each Movable Type post sent out and the timestamp for when it was distributed. A third table records bounces, indexing them by the unique table ID for the subscriber table. Here are my schemas:

CREATE TABLE `bounces` (
  `bounce_id`    int(11)  NOT NULL auto_increment,
  `bouncer`      int(11)  NOT NULL default '0',
  `bounce_count` int(11)  NOT NULL default '0',
  `bouncetime`   datetime NOT NULL default '0000-00-00 00:00:00',

  UNIQUE KEY `bounce_id` (`bounce_id`),
  UNIQUE KEY `bouncer`   (`bouncer`),

  KEY `bouncer_2`    (`bouncer`),
  KEY `bouncetime`   (`bouncetime`),
  KEY `bounce_count` (`bounce_count`)

CREATE TABLE `posts` (
  `post_id`   int(11)       NOT NULL auto_increment,
  `mt_id`     int(11)                default NULL,
  `post_time` datetime               default NULL,
  `sent`      enum('y','n') NOT NULL default 'n',

  UNIQUE KEY `post_id` (`post_id`),

  KEY `mt_id`     (`mt_id`),
  KEY `post_time` (`post_time`)

CREATE TABLE `wifisub` (
  `wifi_id`        int(11)               NOT NULL auto_increment,
  `sub_name`       varchar(254)          NOT NULL default '',
  `sub_email`      varchar(254)          NOT NULL default '',
  `subdate`        datetime              NOT NULL default '0000-00-00 00:00:00',
  `confirm_status` enum('y','n','w','u') NOT NULL default 'n',
  `confirm_date`   datetime              NOT NULL default '0000-00-00 00:00:00',
  `confirm_code`   varchar(15)           NOT NULL default '',
  `sub_type`       enum('d','e')         NOT NULL default 'd',
  `suspend`        enum('y','n')         NOT NULL default 'n',
  `unsub`          enum('y','n')         NOT NULL default 'n',
  `unsubdate`      datetime              NOT NULL default '0000-00-00 00:00:00',

  UNIQUE KEY `wifi_id` (`wifi_id`),

  KEY `wifi_name`    (`sub_name`),
  KEY `wifi_email`   (`sub_email`),
  KEY `confirm_code` (`confirm_code`),
  KEY `confirm_date` (`confirm_date`),
  KEY `unsubdate`    (`unsubdate`),
  KEY `subdate`      (`subdate`)

When the code creates a new subscriber, it assigns her a unique, random code. This code isn't so much as a password as a tracking element. Each email sent to the user contains the code as part of the return address and error address. Because the code is used identically each time, the user can add this address to a white list or mark it as a good sender in a challenge/response system.

The reason for embedding the code is to take advantage of sendmail's ability to ignore everything in an address starting at a plus sign. An address such as looks like to sendmail. Using the aliases and virtuserstable files, I redirect incoming email to the subscribe, bounce, and unsubscribe addresses to a program that parses the code and chooses the appropriate behavior to perform.

An important side note: Most sendmail installations are coupled with smrsh, which limits which programs sendmail can execute through pipe from aliases or .forward. The program you want to use needs a symbolic link in /usr/adm/sm.bin/ or a similar path. Check the documentation with your distribution of sendmail. Often, sendmail is looking for a non-existent directory hard-coded into the program. You can change this via or by recompiling with flags set to any other directory you wish.

Incoming mail to one of several addresses ends up in a program that parses the contents and then hands it off via curl to a Perl CGI script. Why use curl? The handoff scripts handle the optional Web part of the subscribe and unsubscribe operation and accessing them via the web interface eliminates a step and doesn't require using SOAP or XML-RPC libraries.

The sendmail-passthrough script has to handle a bit of parsing for bounced messages, which was and continues to be the hardest part of list software. I'll offer more insight into this a little later in a section devoted to bounces.

The four CGI scripts use a Perl DBI interface to talk to the MySQL database and update bounce counts, set suspend flags, and to turn subscription status from pending to confirmed or subscribed to unsubscribed. I store all of the email sent to these addresses in full — except for occasional spam purges — in case of a later complaint or lawsuit so that I can produce precise email with headers to show the opt-in or confirmation.

Pages: 1, 2

Next Pagearrow

Sponsored by: