Network Impact of the MS SQL Worm
by Iljitsch van Beijnum, author of BGP
Here in Europe, the MS SQL worm hit early Saturday morning. Of the four networks that I manage or help manage, three had a small number of hosts that were infected. On each network, the results were different.
On the first network, the Cisco 7200 routers became extremely slow because of all the traffic generated by the worm. This was so bad the router sometimes didn't get around to sending BGP keep-alive packets in time, so the BGP session broke. This meant the router was unable to advertise the network's IP address ranges to the rest of the world, with the result that these addresses became unreachable.
Under normal circumstances, a Cisco 7200 router can forward several hundred megabits of traffic per second. However, the packets the worm generates are relatively small (404 bytes from what I observed), and because one of the routers in this network was starved for memory, it didn't run the Cisco Express Forwarding (CEF) algorithm. This made the problem worse because without CEF, the router must take time to create a "route cache" entry for each destination before it can forward packets to that destination at high speed. Since the worm generates random destination addresses, the router ended up spending most of its time creating these route cache entries, and it ran out of memory to boot. Most important lesson: always use packet forwarding algorithms with good worst-case properties; cache or flow-based algorithms are dangerous.
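To illustrate why a destination-keyed cache copes badly with random destinations, here is a minimal Python sketch. It is a toy model and not how IOS implements its route cache: the FIFO eviction, the cache size of 10,000 entries, and the pool of 500 "popular" destinations are all assumptions chosen for illustration.

```python
import random

def cache_hit_rate(destinations, cache_size):
    """Return the fraction of packets whose destination was already cached.

    Toy model: a dict keyed by destination address that evicts the
    oldest entry (FIFO) when full. Hypothetical, not the IOS algorithm.
    """
    cache = {}
    hits = 0
    for dst in destinations:
        if dst in cache:
            hits += 1
        else:
            if len(cache) >= cache_size:
                # dicts preserve insertion order, so this is the oldest key
                cache.pop(next(iter(cache)))
            cache[dst] = True
    return hits / len(destinations)

rng = random.Random(1434)  # fixed seed so runs are repeatable
packets = 100_000

# "Normal" traffic: many packets to a small set of popular destinations.
normal = [rng.randrange(500) for _ in range(packets)]
# Worm-like traffic: every packet goes to a random 32-bit address.
worm = [rng.getrandbits(32) for _ in range(packets)]

print(f"normal traffic hit rate: {cache_hit_rate(normal, 10_000):.3f}")
print(f"worm traffic hit rate:   {cache_hit_rate(worm, 10_000):.3f}")
```

In this model nearly every worm packet misses the cache and forces a new entry to be built, which is the worst case the lesson above warns about.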
But I'm getting ahead of things: I didn't know I was dealing with a worm at that time. We suspected a denial-of-service (DoS) attack, but when I looked at the traffic counters I noticed there was more traffic coming out of the network than going in! I quickly installed a filter for logging purposes:
!
access-list 140 permit ip any any log-input
!
interface ethernet0/0
 ip access-group 140 in
!
This filter doesn't filter anything, but it does log a sample of the traffic coming in on the interface to the router's log buffer:
7w1d: %SEC-6-IPACCESSLOGP: list 140 permitted udp 192.168.3.172(1189) (Ethernet0 0001.0229.23b6) -> 188.8.131.52(1434), 1 packet
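With more than a handful of log lines, tallying the senders by hand gets tedious. The following Python sketch shows one way to do it; the regular expression is an assumption derived from the single sample line above, and other IOS versions may format the message differently. The function name `top_senders` is hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the one IPACCESSLOGP sample line shown above.
LOG_RE = re.compile(
    r"list (?P<acl>\d+) (?P<action>permitted|denied) (?P<proto>\w+) "
    r"(?P<src>[\d.]+)\((?P<sport>\d+)\) .*?-> "
    r"(?P<dst>[\d.]+)\((?P<dport>\d+)\), (?P<count>\d+) packets?"
)

def top_senders(lines, dport=None):
    """Count logged packets per source address, busiest source first.

    If dport is given, only packets to that destination port are counted.
    """
    counts = Counter()
    for line in lines:
        m = LOG_RE.search(line)
        if m is None:
            continue  # not an access-list log line we recognize
        if dport is not None and int(m["dport"]) != dport:
            continue
        counts[m["src"]] += int(m["count"])
    return counts.most_common()

sample = [
    "7w1d: %SEC-6-IPACCESSLOGP: list 140 permitted udp 192.168.3.172(1189) "
    "(Ethernet0 0001.0229.23b6) -> 188.8.131.52(1434), 1 packet",
]
print(top_senders(sample))  # [('192.168.3.172', 1)]
```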
I could now see which hosts were sending out the offending traffic (something like 70 megabits of it) so I could filter all packets with those hosts as the source:
!
access-list 140 deny ip host 192.168.3.172 any
access-list 140 permit ip any any
!
Obviously, inspecting packets to see if they should be forwarded or dropped (filtered out) takes additional CPU time, but dropping the packet immediately as it comes in on an interface takes much less CPU time than forwarding it or trying to forward it. So with this filter in place the network returned to normal, more or less. But I suspected there was more to this, so I had a look at the most recent messages posted to the North American Network Operators Group (NANOG) mailing list. There, people mentioned a new worm operating on UDP port 1434. So I changed the filter to look for UDP packets destined for port 1434:
!
access-list 140 deny udp any any eq 1434
access-list 140 permit ip any any
!
And this caught the worm's packets very well:
show access-list 140
Extended IP access list 140
    deny udp any any eq 1434 (87352 matches)
    permit ip any any (3103 matches)
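The counters give a rough idea of how much of the traffic passing through this list was worm traffic. A small Python sketch of the arithmetic follows, assuming both counters cover the same period; the parsing pattern is an assumption based on the output above.

```python
import re

output = """\
Extended IP access list 140
    deny udp any any eq 1434 (87352 matches)
    permit ip any any (3103 matches)
"""

# Pull the per-entry match counters out of the text, in order.
matches = [int(n) for n in re.findall(r"\((\d+) matches?\)", output)]
denied, permitted = matches

share = denied / (denied + permitted)
print(f"{share:.1%} of packets checked by the list were dropped")
```

By this estimate, roughly 97 percent of the packets arriving on that interface were copies of the worm.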
It turned out that the entire code for the MS SQL worm is only a few hundred bytes and it's contained in a single UDP packet. The packet contains a request to the Microsoft SQL service that is malformed: a value is longer than it should be. The server doesn't check for this properly, so the excess data overwrites stack memory pointing to instructions that should be executed later.
This is how the worm tricks the server into executing the code contained in the packet. The executable code in the worm then simply generates random addresses and sends out copies of itself to these addresses. Unlike TCP, which is used for most applications, such as the web and mail, UDP doesn't require a session establishment phase; a client request or server response can be contained in a single packet, so there is no waiting for a response before the next packet can be sent. Infected hosts simply send out copies of the worm as fast as they can. This can easily be upwards of 10,000 packets per second.
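As a sanity check on these numbers, the packet size and traffic volume quoted earlier in this article can be converted into packet rates. This is back-of-the-envelope Python arithmetic, assuming every packet is 404 bytes and ignoring any link-layer overhead.

```python
PACKET_BYTES = 404                  # packet size observed by the author
BITS_PER_PACKET = PACKET_BYTES * 8  # 3232 bits

def packets_per_second(bits_per_second):
    """Packet rate needed to fill a given bandwidth with 404-byte packets."""
    return bits_per_second / BITS_PER_PACKET

def bits_per_second(pps):
    """Bandwidth consumed by a given rate of 404-byte packets."""
    return pps * BITS_PER_PACKET

print(f"70 Mbit/s of worm traffic  = {packets_per_second(70e6):,.0f} packets/s")
print(f"10,000 packets/s, one host = {bits_per_second(10_000) / 1e6:.1f} Mbit/s")
```

At about 32 megabits per second per host, two or three infected machines are enough to produce the roughly 70 megabits seen on the first network.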
Another network I manage had just upgraded to fast Cisco 12000 routers. These could easily handle the additional 100 megabits worth of worm traffic as well as the typical 50 megabits or so of early Saturday morning traffic. I installed a filter anyway to spare the recipients the unwanted traffic. At one point on Saturday, 25 percent of all packets coming in were copies of the worm. And strangely enough, after the infected hosts were cleaned up, there were still one or two copies of the worm coming in from the internal network. It looks like the worm was also successful in spreading over broadcast and multicast addresses. This should prove an interesting subject for future study.
As is the case for very aggressive biological viruses (such as Ebola), the worm's success was also its downfall: each packet generated by the worm was targeted at the Microsoft SQL service, which lives on a (relatively) fixed UDP port. This makes the worm packets easy to filter out. The very high traffic volumes make it virtually impossible for the worm to remain undetected. A worm with a different agenda would have to spend resources on things other than simply spreading, so it wouldn't spread as fast as this worm did. This one seems to have infected pretty much the entire Internet within a minute or so.