Developing High Performance Asynchronous IO Applications
by Stas Bekman
Creating Financial Friction for Spammers
Why do spammers send billions of email messages advertising ridiculous products that most of us would never in our lives consider buying? How can anyone possibly make money from this endeavor when the vast majority of spam either gets filtered out or, at best, is read and discarded by a disgruntled end user?
What makes spamming profitable is huge volume. Spamming is profitable when a bait message, be it a commercial spam or a phishing email, reaches a substantial number (usually millions) of recipients. According to the New York Times, 0.02% of people click on and buy products advertised in pharmaceutical spam emails. Other articles suggest that it costs about $300 to send 1 million emails, though it's possibly much cheaper to use a DIY botnet. Assuming that a spammer makes just $25 from each sale (and it can be much more than that), it's easy to see that it takes only slightly more than 2 million emails to make an immediate $10K profit. The Times article suggests that pr0n spam gets a 5.6% click rate, though the profits per click are much lower there. The spam problem exists and gets bigger by the day because it takes just a few hundred dollars to send a very large number of emails, and the payoffs are huge.
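To make that arithmetic concrete, here is the back-of-the-envelope calculation using the figures quoted above (the numbers are illustrative, not measured):

```perl
use strict;
use warnings;

# Figures quoted above: 0.02% buy rate, ~$25 profit per sale,
# ~$300 per million messages sent.  Illustrative, not measured.
my $buy_rate         = 0.0002;
my $profit_per_sale  = 25;
my $cost_per_million = 300;

my $emails  = 2_100_000;    # "slightly more than 2 million"
my $revenue = $emails * $buy_rate * $profit_per_sale;
my $cost    = $emails / 1_000_000 * $cost_per_million;
my $net     = $revenue - $cost;
printf "revenue \$%.0f, cost \$%.0f, net \$%.0f\n", $revenue, $cost, $net;
# -> revenue $10500, cost $630, net $9870
```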
Ken Simpson and Will Whittaker, formerly developers at ActiveState, founded MailChannels to solve the spam problem. Rather than trying to invent yet another blacklist or content filter, they came up with a revolutionary, yet very simple, idea: rather than fighting spammers by detecting spam messages and discarding them, Ken and Will decided to discourage spammers by attacking their economic raison d'être.
By observing spammer behavior, the MailChannels team realized that spammers are impatient. If they can't deliver a message within several seconds, they tend to abort the connection and move on to spam other servers. After all, spamming is only profitable if spammers can push a lot of email across the wire. The solution used by the Traffic Control product creates exactly the financial friction everybody was looking for.
Nowadays, the majority of spam is sent from botnets--vast, distributed networks of compromised Windows PCs. Spammers usually rent botnets by the hour from "bot herders" (usually just a bored kid living in his parents' basement). Bot herders even make the spamming software available as a part of the botnet rental package, which makes it easy for the spammer to get to work mailing out to a large list of prospective buyers.
While botnets are vast in size and availability, the number of machines and the sending capacity of any particular botnet is limited. Furthermore, the viability of a particular bot machine decays over time, as receivers such as Hotmail and Yahoo! identify the members of the botnet and black-list them. For these reasons, it is critical for a spammer renting a botnet to get the spam out as quickly as possible to as many recipients as possible--before the bot he rented becomes blacklisted.
By slowing down email from suspicious sources (often botnets), the MailChannels team figured they could probably make the spammers give up and move on. That's exactly what happened.
Notice that I'm not talking about the commonly discussed "grey listing" technique when I use the term "slow down". Traffic Control slows down certain SMTP connections to a trickle (perhaps 5 bytes per second, both upstream and downstream). When slowed down, most spammers voluntarily abort their connection within the first 30 seconds. Legitimate users who experience accidental slowing are unaffected, because their mail servers don't mind waiting a few minutes to get an email message delivered.
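To illustrate the pacing idea, a trickled write boils down to spacing out one-byte writes. This is a simplified sketch, not MailChannels' implementation; a production proxy schedules these writes with timer events so a single process can throttle thousands of connections at once, whereas sleeping, as done here, would stall everything else:

```perl
use strict;
use warnings;
use IO::Handle;
use Time::HiRes qw(sleep);   # fractional-second sleep

# Sketch only: emit $data to $fh one byte at a time, at $bps
# bytes per second.
sub trickle_print {
    my ($fh, $data, $bps) = @_;
    for my $byte (split //, $data) {
        print {$fh} $byte;
        $fh->flush;
        sleep 1 / $bps;      # $bps == 5 means one byte every 200ms
    }
}
```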
While the idea is simple, the implementation is far from it. Slowed connections tend to pile up like so many cars in a traffic jam. From a traditional receiving mail server's perspective, these connections are a huge burden on memory and process resources. Add in heavy spam content filtering, email archiving, regulatory compliance monitoring, and automated message handling, and the burden of each connection increases even further. In a traditional email environment, slowing down connections is a ridiculous proposition that requires enormous server resources.
In fact, we observed that you don't even need to slow down traffic intentionally to cause loading issues. Many sites these days are barraged with torrents of spam from huge botnets, causing lengthy service outages. It is the high connection concurrency that does the damage, rather than a high throughput of messages.
Our challenge was to implement a transparent SMTP proxy that users can install in front of any existing email infrastructure, one that slows certain connections to a trickle yet remains incredibly scalable with respect to connection concurrency.
The First Generation
We implemented the first generation of Traffic Control using Apache and mod_perl 2 protocol handlers, building the SMTP protocol (RFC 2821) on the solid infrastructure provided by mod_perl 2. Just as with HTTP, Apache spawns a new server every time a new SMTP connection comes in (unlike HTTP, SMTP is an interactive protocol), and a custom mod_perl SMTP protocol handler takes over, communicating with the client and proxying the connection back to the MTA server (see Figure 1).
Figure 1. The first generation of Traffic Control.
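In configuration terms, hooking a protocol handler into Apache with mod_perl 2 looks roughly like this (the module name MyApp::SMTPProxy is hypothetical; the directives are standard mod_perl 2):

```apache
# httpd.conf sketch: run a Perl protocol handler on the SMTP port
Listen 25
<VirtualHost _default_:25>
    PerlModule                   MyApp::SMTPProxy
    PerlProcessConnectionHandler MyApp::SMTPProxy
</VirtualHost>
```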
This approach worked really well at low traffic volumes. But as soon as we put it on a production server that normally received hundreds of concurrent connections, our application couldn't deal with the load. There were two major problems. First, because we held certain SMTP connections open for several minutes due to throttling, hundreds of concurrent connections turned into thousands, and the machine wasn't capable of running thousands of Apache instances. The second problem was the MTA itself: we also needed to run thousands of MTA instances, each tied to a client via the transparent proxy.
This is a good example of how not to design things: just because a technology scales well in one domain doesn't automatically mean it will scale as well everywhere else.
The Second Generation
We went back to the drawing board and tried to come up with a solution that would scale well under heavy SMTP traffic. We considered the light-weight front-end, heavy-weight back-end architecture familiar to mod_perl users, but it didn't work for us, because we wanted to have Perl in the front end rather than waste a lot of time implementing things in C. Besides, it didn't solve the second problem: the tied-up MTAs, which stayed idle and consumed memory most of the time.
After several brainstorming sessions, we realized that we could solve the problem by having a very light front-end process that could talk SMTP and maintain thousands of throttled and normal SMTP connections. We also realized that we needed to implement SMTP multiplexing between the transparent proxy and the MTA, thus allowing a handful of MTAs to handle thousands of concurrent SMTP connections, each lasting several minutes. The secret multiplexing sauce came much later; first we needed to tackle the light-weight front-end problem.
Luckily, I've had a lot of (bad) experience with Perl threads, so deciding against even attempting a threaded prototype was quick. We then hit CPAN in search of good concurrency solutions. I had a quick fling with the coroutine Perl modules (the Coro:: namespace), but not having any previous experience with those, and not being able to find someone who did, removed that option as well. That left event-based asynchronous solutions.
There were several implementations of event-loop based libraries available on CPAN, and after reviewing our options we decided to use Event::Lib, which provides Perl bindings around libevent. We made that choice because we needed a highly portable solution. Not only does libevent run on multiple platforms, it also supports multiple OS-level event mechanisms on each platform (such as select(), poll(), epoll(), and kqueue()). Event::Lib itself was well documented and had very good test coverage. Most importantly, Tassilo von Parseval, the author of the module, was extremely helpful and prompt at fixing bugs. When we started there were quite a few bugs, but Tassilo quickly resolved most of them.
To jump forward: the second generation of Traffic Control was a major success and provides amazing scalability, thanks to Event::Lib and the multiplexing approach. The rest of this article discusses the methodology of writing applications using Event::Lib.
One Can't Afford to Block in a Single Threaded Event-based Application
Chances are that you are already familiar with interactive GUI applications, which are event-driven. Event-driven network service applications are quite different beasts. The main difference is that a user interacting with a GUI application generates very few events in a short period of time (for example, clicking on a button), whereas network services normally generate hundreds of events per second (for example, handling HTTP or SMTP traffic on a busy server).
Any blocking operation in a network service will cause a quick accumulation of pending events, slowing the application down and eventually making it unresponsive.
To avoid this situation, replace blocking operations with their non-blocking (usually asynchronous) equivalents where possible. For example, disk and network IO operations can be made non-blocking with operating system support. Any remaining blocking, or just slow, operations need to be delegated elsewhere; for example, iterating over a long list of objects could significantly slow the whole application down.
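As a small illustration of the "non-blocking equivalents" point, here is the basic mechanism using only core Perl (Fcntl and POSIX) rather than Event::Lib: with O_NONBLOCK set, a read that would otherwise stall returns immediately with EAGAIN, so the event loop can go service other connections.

```perl
use strict;
use warnings;
use Fcntl;                         # F_GETFL, F_SETFL, O_NONBLOCK
use POSIX qw(EAGAIN EWOULDBLOCK);

pipe(my $reader, my $writer) or die "pipe: $!";

# Switch the read end to non-blocking mode.
my $flags = fcntl($reader, F_GETFL, 0) or die "fcntl: $!";
fcntl($reader, F_SETFL, $flags | O_NONBLOCK) or die "fcntl: $!";

my $n = sysread($reader, my $buf, 1024);
if (!defined $n && ($! == EAGAIN || $! == EWOULDBLOCK)) {
    # Nothing to read yet: return control to the event loop
    # instead of hanging here.
}

syswrite($writer, "MAIL FROM:<a\@example.com>\r\n");
$n = sysread($reader, $buf, 1024);   # data is waiting now, so this succeeds
```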
If scalable thread support were available, all blocking and slow operations could run in separate threads. Unfortunately, as of this writing, Perl 5's thread support is non-scalable (and often not even usable), so we needed an alternative. We chose to again stick with mod_perl 2 for that purpose, writing a very simple protocol that delegates blocking and slow operations to a pool of back-end processes.
Figure 2. The second generation of Traffic Control.
We ended up with the architecture shown in Figure 2: a single-threaded front-end process that performs lots of non-blocking operations (mainly network IO), and back-end Apache/mod_perl 2 processes that deal with the slow and blocking operations. The front end communicates with the back-end processes using a simple protocol.
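The article doesn't publish the real protocol, but a minimal line-based delegation protocol might look like this sketch (the command names and OK/ERR framing are hypothetical): the front end sends one request line to a back-end worker and later reads a one-line reply, so the slow work never happens inside the event loop.

```perl
use strict;
use warnings;

# Hypothetical framing: "COMMAND arg1 arg2\n" out, "OK ..."/"ERR ..." back.
sub encode_request {
    my ($cmd, @args) = @_;
    return join(' ', uc $cmd, @args) . "\n";
}

sub decode_response {
    my ($line) = @_;
    chomp $line;
    my ($status, $rest) = split / /, $line, 2;
    return ($status eq 'OK', $rest);   # (success flag, payload)
}
```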
The front end performs multiplexing and connection pooling to optimize the use of back-end and MTA resources. Connection pooling avoids repeated connection-establishment overhead by reusing pre-opened connections. Multiplexing allows many client connections to share a few back-end and MTA connections.
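As a sketch of the connection-pooling idea (a toy example, not the production code; a real pool also needs timeouts, health checks, and an upper bound on pool size):

```perl
use strict;
use warnings;

package TinyPool;

sub new {
    my ($class, %args) = @_;
    # 'connect' is a code ref that opens a fresh connection
    return bless { idle => [], connect => $args{connect} }, $class;
}

sub check_out {
    my ($self) = @_;
    my $conn = pop @{ $self->{idle} };          # reuse an idle one...
    return defined $conn ? $conn : $self->{connect}->();   # ...or open anew
}

sub check_in {
    my ($self, $conn) = @_;
    push @{ $self->{idle} }, $conn;             # keep it for the next client
}

package main;

my $opened = 0;
my $pool = TinyPool->new(connect => sub { ++$opened });
my $c1 = $pool->check_out;    # no idle connection yet: opens a new one
$pool->check_in($c1);         # done with it; return it to the pool
my $c2 = $pool->check_out;    # reused: $opened is still 1
```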