At 2005's USENIX conference, I attended a talk by an intrusion response expert. He described a situation where a company had hired him to "find out what happened" to the network during an intrusion: who broke in, how did they get in, and what did they do? Anyone who has been there can tell you just what a difficult job this is, even with the full cooperation of the company. The network manager happily provided the specialist with all the logs, particularly the firewall logs. In only a few minutes, the expert determined that the hoarded and protected firewall logs were completely useless. They were intact, but the logs only listed blocked traffic. They recorded what didn't happen, not what had happened! (Yes, this is where you go check your own firewall logging.)
One of the best ways to avoid a situation like this is to record what actually happened, and that's where Netflow comes in. My previous couple of articles show how to monitor network traffic with Netflow and how to use Netflow, flow-tools, and FlowScan to generate pretty and detailed graphics of your network traffic. Netflow can also provide almost any level of visibility into your network's traffic. Netflow records tell us what did happen. The installation in the first article includes all sorts of tools for that.
People need many different things from Netflow data; they've written countless tools to separate that data in the manner they'd like, and then posted them on the Web to help other people. The end result is dozens of slightly different Netflow data processors, all publicly available. Definitely spend some quality time with Google before writing your own. The rest of this article discusses some tools that I find useful in my day-to-day work, which you have already installed during the Netflow setup discussed in the previous articles.
All these commands nondestructively process flow files. The label flowfiles indicates where to put these filenames throughout the rest of this article. You can list multiple files or use wildcards. Note that you must either be in the directory containing the flow files or give the full path to them.
To peer deeply into individual flows, try flowdumper(1). By default, flowdumper writes all the flows in a file to the screen. I strongly recommend using a pager, as you might have thousands of flows in a single file. Flowdumper requires at least one argument, the flow file to parse.
# flowdumper flowfiles | less
If you're interested in details about traffic to a particular host, you can search within the pager for that IP. Here's a sample of a single flow:
FLOW
  index:          0xc7ffff
  router:         172.16.20.1
  src IP:         192.168.1.54
  dst IP:         10.0.8.3
  input ifIndex:  0
  output ifIndex: 0
  src port:       61521
  dst port:       443
  pkts:           10
  bytes:          3015
  IP nexthop:     0.0.0.0
  start time:     Wed Jul 6 10:47:29 2005
  end time:       Wed Jul 6 10:48:32 2005
  protocol:       6
  tos:            0x0
  src AS:         0
  dst AS:         0
  src masklen:    0
  dst masklen:    0
  TCP flags:      0x1e (PUSH|SYN|ACK|RST)
  engine type:    0
  engine id:      0
Your sensor won't flag flows with BGP data if it's not running BGP. This means that many fields will be blank unless you're collecting flow data from a BGP-using border router. The "router" field is the IP address of the Netflow sensor, which is not necessarily a router. Flow records include the source and destination port and IP address, as well as the number of packets and bytes in this single flow. One interesting set of fields is the start and end times: you can determine whether a large flow used up a lot of bandwidth for a brief time or a trickle of bandwidth for a longer time. The protocol field correlates with the entries in /etc/protocols. If you're using a BGP-speaking router as your sensor, the flow record will include such things as autonomous system (AS) numbers and address mask lengths.
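To make the start-time, end-time, and flags fields concrete, here is a minimal Python sketch that works only from the values printed in the sample record above. It is not part of flow-tools; the function names are my own, and the flag table is the standard TCP flag bit layout.

```python
from datetime import datetime

# Values copied from the sample flow record above.
START = "Wed Jul 6 10:47:29 2005"
END = "Wed Jul 6 10:48:32 2005"
BYTES = 3015
TCP_FLAGS = 0x1E

# Timestamp layout as flowdumper printed it in the sample record.
TIME_FORMAT = "%a %b %d %H:%M:%S %Y"

# Standard TCP flag bits, lowest bit first.
FLAG_BITS = [
    (0x01, "FIN"),
    (0x02, "SYN"),
    (0x04, "RST"),
    (0x08, "PUSH"),
    (0x10, "ACK"),
    (0x20, "URG"),
]


def flow_duration_seconds(start, end):
    """Return the flow's duration in whole seconds."""
    delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    return int(delta.total_seconds())


def average_bits_per_second(byte_count, seconds):
    """Average rate over the flow's lifetime; a zero-length flow counts as one second."""
    return byte_count * 8 / max(seconds, 1)


def decode_tcp_flags(value):
    """Return the names of the flag bits set in the cumulative flags byte."""
    return [name for bit, name in FLAG_BITS if value & bit]


if __name__ == "__main__":
    seconds = flow_duration_seconds(START, END)
    print(seconds, "seconds")
    print(round(average_bits_per_second(BYTES, seconds)), "bits per second on average")
    print("|".join(decode_tcp_flags(TCP_FLAGS)))
```

A flow with the same byte count but a much shorter duration would show a far higher average rate, which is the "brief burst versus long trickle" distinction described above.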
One of flowdumper's most interesting talents is its ability to speak Perl with the -e flag. Flowdumper uses a whole variety of variables, all defined in perldoc Cflow. Here are the ones I find most useful, and they're mostly self-explanatory. (The possible exception is $exporterip, which is the IP address of the Netflow sensor that transmitted this flow.)
$srcip $dstip $srcport $dstport $protocol $tos $exporterip
I find flowdumper's abilities most useful for answering weird, off-the-cuff questions. For example, back when I worked for an ISP, my boss occasionally asked questions like "Who is using a VPN on our network?" Flowdumper answers that quickly. (Too bad I didn't know about Netflow at the time!) Many VPNs use IP protocols other than TCP, UDP, and ICMP (IPsec ESP and GRE, for example), so we grab everything else.
# flowdumper -e '6 ne $protocol && 17 ne $protocol && 1 ne $protocol' \
    flowfiles
Similarly, packets on my network shouldn't have any unusual Type of Service defined. An unusual ToS might indicate some sort of nefarious activity.
# flowdumper -e '0x0 ne $tos' flowfiles
Or I could just grab the flows from a particular sensor and see what traffic passed that part of the network.
# flowdumper -e '"192.168.88.134" eq $exporterip' ft-v05.2005-07-06.10*
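If the Perl one-liners look cryptic, it may help to think of each -e expression as a yes-or-no test applied to every flow in turn. The Python sketch below models the three expressions above under that assumption (that a flow is printed when the expression is true). The flow records are hypothetical; only the key names echo the Cflow variables.

```python
# Hypothetical flow records; the keys mirror the Cflow variable names.
FLOWS = [
    {"srcip": "192.0.2.10", "protocol": 6, "tos": 0x0, "exporterip": "192.168.88.134"},
    {"srcip": "192.0.2.11", "protocol": 47, "tos": 0x0, "exporterip": "192.168.88.134"},
    {"srcip": "192.0.2.12", "protocol": 17, "tos": 0x10, "exporterip": "192.168.88.1"},
    {"srcip": "192.0.2.13", "protocol": 50, "tos": 0x0, "exporterip": "192.168.88.1"},
]


def select(flows, predicate):
    """Keep only the flows for which the predicate is true."""
    return [flow for flow in flows if predicate(flow)]


def not_tcp_udp_icmp(flow):
    """Models: 6 ne $protocol && 17 ne $protocol && 1 ne $protocol"""
    return flow["protocol"] not in (6, 17, 1)


def unusual_tos(flow):
    """Models: 0x0 ne $tos"""
    return flow["tos"] != 0x0


def from_sensor(address):
    """Models: "address" eq $exporterip"""
    return lambda flow: flow["exporterip"] == address


if __name__ == "__main__":
    for flow in select(FLOWS, not_tcp_udp_icmp):
        print(flow["srcip"], "uses protocol", flow["protocol"])
```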
Another obvious question is "Who is our biggest traffic consumer?" The flow-stat(1) command allows you to run broad-scale reports on any combination of flow files. The flow-stat man page lists many supported reports. Several of them are not yet implemented, however, and inform you so only when you try to run them. Others strike me as not entirely useful unless you're running BGP. Some of the report formats I find most useful are:
5--TCP/UDP destination port. This counts traffic in both directions, so you must sort it carefully.
10--Source and destination IP. This lets you sort traffic usage between particular machines, so you can identify connections that hog bandwidth.
11--Source or destination IP. This lets you identify particular machines that use large amounts of traffic.
Indicate the report format with the -f flag. The -s option sorts the results in ascending order, while -S sorts in descending order. Both take a single argument, the number of the column to sort on. Flow-stat numbers columns starting at 0.
For example, suppose that you need to identify the most heavily used TCP/IP ports on your network. This is flow-stat report format 5. When running a new report, I run it once without any sorting option, just to see what the columns are, and then run it a second time sorted on the desired column. Here I want to sort on column 1, which is the number of flows.
# flow-cat -p flowfiles | flow-stat -f 5 -S 1 | less
#  --- ---- ---- Report Information --- --- ---
#
# Fields:    Total
# Symbols:   Disabled
# Sorting:   Descending Field 1
# Name:      UDP/TCP destination port
#
# Args:      flow-stat -f 5 -S 1
#
#
# port      flows       octets        packets
#
443         191969      959951160     2554217
80          42740       345856613     2384044
25          1022        8152412       14890
53          346         57947         730
445         307         20352         424
135         249         11952         249
110         212         111870        2342
44473       203         247298        757
...
This report actually goes on for several pages, with a "trailing edge" of ports that have had only a few contacts.
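The zero-based column numbering trips people up, so here is a small Python sketch of what sorting on a column means. It does not call flow-stat; it simply sorts a few rows copied from the report above, and the function name and its descending parameter are my own.

```python
# Rows copied from the report above, deliberately out of order:
# column 0 = port, 1 = flows, 2 = octets, 3 = packets.
REPORT = [
    (25, 1022, 8152412, 14890),
    (443, 191969, 959951160, 2554217),
    (110, 212, 111870, 2342),
    (80, 42740, 345856613, 2384044),
    (53, 346, 57947, 730),
]


def sort_report(rows, column, descending=False):
    """Sort rows on a zero-based column index, like -s (ascending) or -S (descending)."""
    return sorted(rows, key=lambda row: row[column], reverse=descending)


if __name__ == "__main__":
    # Equivalent in spirit to "-S 1": descending by flows.
    for port, flows, octets, packets in sort_report(REPORT, 1, descending=True):
        print(port, flows, octets, packets)
```

Sorting on column 2 or 3 instead ranks the same ports by octets or packets.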
You can learn a lot about my network here. The most popular port is 443, for SSL web traffic. Ports 80 (http) and 25 (smtp) are also popular, as well as 53 (dns). I also get a lot of requests for Microsoft protocol ports, 445 and 135. The big surprise for me is port 110; my network doesn't provide POP3 services! I really need to identify where this traffic is coming from and where it's going. I could use flowdumper for that, or write a more complicated report with flow-nfilter.
As Netflow tracks a single TCP/IP connection as two flows (one from the client to the server, and one from the server to the client), you will see lots of smaller entries for oddly numbered ports such as 44473. These are usually a client's source port showing up as the destination port of the server's replies. One flow's source is its mirror's destination, after all.
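If the two-flows-per-connection behavior gets in the way, one option is to fold each flow and its mirror back into a single conversation. The Python sketch below shows one way to do that with hypothetical flow tuples. It is my own illustration, not a flow-tools feature.

```python
from collections import defaultdict

# Hypothetical flows: (src IP, src port, dst IP, dst port, octets).
FLOWS = [
    ("192.0.2.10", 44473, "192.168.88.230", 443, 1200),
    ("192.168.88.230", 443, "192.0.2.10", 44473, 5400),
    ("192.0.2.11", 51000, "192.168.88.243", 25, 800),
    ("192.168.88.243", 25, "192.0.2.11", 51000, 300),
]


def conversation_key(src_ip, src_port, dst_ip, dst_port):
    """Return the same key for a flow and its mirror by ordering the two endpoints."""
    return tuple(sorted([(src_ip, src_port), (dst_ip, dst_port)]))


def octets_per_conversation(flows):
    """Sum octets in both directions for each pair of endpoints."""
    totals = defaultdict(int)
    for src_ip, src_port, dst_ip, dst_port, octets in flows:
        totals[conversation_key(src_ip, src_port, dst_ip, dst_port)] += octets
    return dict(totals)


if __name__ == "__main__":
    for endpoints, octets in octets_per_conversation(FLOWS).items():
        print(endpoints, octets)
```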
Perhaps my FlowScan graph displays heavy traffic usage, and I want to see what's going on. One possibility is that a particular client is making especially heavy demands upon one of our servers. I want to see which combinations of clients and servers are using the most flows, which is flow-stat report format 10. I'm sorting on column 2 (flows), in descending order. (Again, I don't know that it's column 2 I want until I run the report unsorted; there is no magic way to extract the knowledge of which column I want from the ether.)
# flow-cat -p flowfiles | flow-stat -f 10 -S 2 | less
#  --- ---- ---- Report Information --- --- ---
#
# Fields:    Total
# Symbols:   Disabled
# Sorting:   Descending Field 2
# Name:      Source/Destination IP
#
# Args:      flow-stat -f 10 -S 2
#
#
# src IPaddr     dst IPaddr       flows     octets      packets
#
188.8.131.52     192.168.88.230   5167      15766678    52283
192.168.88.230   184.108.40.206   5123      5200858     34541
220.127.116.11   192.168.88.243   4121      27993332    63995
192.168.88.243   18.104.22.168    4112      30655019    53695
22.214.171.124   192.168.88.243   3071      8296022     23541
192.168.88.243   126.96.36.199    3069      13705493    18890
188.8.131.52     192.168.88.230   1533      4718630     16236
192.168.88.230   184.108.40.206   1521      1326074     11375
...
The 192.168.88 addresses are local servers, and everything else is remote. Obviously, the 220.127.116.11 host is the single biggest client--the first four lines involve that host! One thing to realize is that it's common for these reports to give pairs of records together, especially if you're sorting by flows. My biggest flow user is 18.104.22.168 sending traffic to 192.168.88.230, and the second flow is 192.168.88.230 sending traffic back to 22.214.171.124. That makes sense--the server is answering the client about as often as the client talks to the server. In this case, though, one client IP appears in the first four records; it's obviously using the resources heavily.
Another common question is "Which server receives the most connections?" A report on single hosts is format 11. Sort this by flows again, which is column 1.
# flow-cat -p flowfiles | flow-stat -f 11 -S 1 | less
#  --- ---- ---- Report Information --- --- ---
#
# Fields:    Total
# Symbols:   Disabled
# Sorting:   Descending Field 1
# Name:      Source or Destination IP
#
# Args:      flow-stat -f 11 -S 1
#
#
# IPaddr         flows      octets        packets
#
192.168.88.230   224442     459688424     1878817
192.168.88.243   153038     1453679456    2635365
192.168.88.247   34503      124729216     291507
An interesting point here is that the host that receives the most connections is not the host that receives the most octets of traffic or the greatest number of packets. You might find octets (column 2) or packets (column 3) a more sensible measure for your situation. Do whatever works for you.
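To see how much the choice of measure matters, this short Python sketch ranks the three hosts from the report above by each column in turn. The numbers are copied from the report; the helper names are my own.

```python
# Rows copied from the report above: address, flows, octets, packets.
HOSTS = [
    ("192.168.88.230", 224442, 459688424, 1878817),
    ("192.168.88.243", 153038, 1453679456, 2635365),
    ("192.168.88.247", 34503, 124729216, 291507),
]
COLUMNS = {"flows": 1, "octets": 2, "packets": 3}


def busiest(hosts, metric):
    """Return the address with the largest value for the named metric."""
    column = COLUMNS[metric]
    return max(hosts, key=lambda row: row[column])[0]


def octets_per_flow(row):
    """Average size of a flow for one host, in octets."""
    return row[2] / row[1]


if __name__ == "__main__":
    for metric in COLUMNS:
        print(metric, "->", busiest(HOSTS, metric))
    for row in HOSTS:
        print(row[0], round(octets_per_flow(row)), "octets per flow")
```

The host with the most flows is not the one with the most octets because its flows are, on average, much smaller.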
We can do a lot on the fly with flowdumper and can aggregate data nicely with flow-stat, but the detailed data provided by Netflow should let you do just about any sort of report. You should easily be able to list all SMTP connection attempts to systems that aren't supposed to be SMTP servers. Perhaps you could write the flowdumper Expression from the Infernal Regions to accomplish this, but there are better ways--especially if you intend to use this report repeatedly. The flow-nfilter(1) program lets you write detailed reports on Netflow data.
Flow-nfilter's configuration relies on "primitives": small definitions that describe one characteristic of traffic--a network port, an IP address, and so on. It assembles these primitives into larger rules that constitute reports. By default, flow-nfilter stores these primitives and definitions in /usr/local/etc/flow-tools/filter.cfg. I will start with primitives and proceed to building actual reports.
Every primitive definition starts with the filter-primitive label and a name. It then contains the type of primitive. The flow-nfilter man page lists many different types of primitives that cover every situation, but that list is too much to read for common situations; the handful below will get you started. The most commonly used primitives include ip-protocol, ip-port, ip-address, and ip-address-prefix. The following example is a primitive for the TCP protocol:
filter-primitive TCP
  type ip-protocol
  permit tcp
  default deny
While its name is TCP, its type is ip-protocol. The permit statement tells the primitive that it matches anything of protocol type tcp. You could also use 6, the protocol number for TCP as defined in /etc/protocols. The default deny at the end means the primitive doesn't match anything that isn't TCP. (Primitives have an implicit default deny, but I find explicit statements more comfortable.)
The ip-port primitive type matches TCP or UDP ports. This primitive, smtpports, matches connections to port 25 and only port 25:
filter-primitive smtpports
  type ip-port
  permit 25
  default deny
The adept among you can already see where this is heading; TCP connections to port 25 are either SMTP conversations or masquerading as SMTP conversations.
You can define entire networks with the ip-address-prefix primitive type. The hostingnet primitive matches the entire 192.168.88.0 network:
filter-primitive hostingnet
  type ip-address-prefix
  permit 192.168.88.0/24
  default deny
Finally, you can match individual IP addresses with the ip-address primitive type:
filter-primitive local_mailservers
  type ip-address
  permit 192.168.88.33
  default deny
Primitives can get quite complicated. For example, suppose I want to make a filter primitive that includes everything that isn't a mail server. The trick to this is to start small and gradually grow.
filter-primitive local_not_mailservers
  type ip-address
  deny 192.168.88.33
  permit 192.168.88.0/24
  default deny
The primitive itself is an IP address--actually, a set of them. It isn't a network, because there's a hole in it. I started by denying my actual mail server's IP. I then accepted the whole block of addresses in the network. At the end, if an IP isn't explicitly permitted, it's denied. This primitive will match any IP address from 192.168.88.0 through 192.168.88.32 and from 192.168.88.34 through 192.168.88.255.
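The order of the statements is what makes the hole work. The Python sketch below models local_not_mailservers on the assumption that a primitive is evaluated top to bottom and the first matching statement wins, which is how the description above reads. It is an illustration of that logic, not flow-nfilter's actual implementation, and the function name is my own.

```python
import ipaddress

# (action, network) pairs in the order they appear in local_not_mailservers.
LOCAL_NOT_MAILSERVERS = [
    ("deny", ipaddress.ip_network("192.168.88.33/32")),
    ("permit", ipaddress.ip_network("192.168.88.0/24")),
]


def primitive_matches(rules, address, default="deny"):
    """Return True if the first rule covering the address is a permit.

    Assumes first-match-wins evaluation with a final default action.
    """
    addr = ipaddress.ip_address(address)
    for action, network in rules:
        if addr in network:
            return action == "permit"
    return default == "permit"


if __name__ == "__main__":
    for candidate in ("192.168.88.32", "192.168.88.33", "192.168.88.34", "10.0.8.3"):
        print(candidate, primitive_matches(LOCAL_NOT_MAILSERVERS, candidate))
```

Swapping the two statements would, under the same assumption, let the /24 permit swallow the mail server before the deny is ever consulted.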
You can list several related items in a single primitive. For example, my company's security policy forbids use of telnet (port 23) and MS SQL (port 1433) connections over the public internet. If I see this traffic on my network, something interesting is going on. This primitive matches both of these.
filter-primitive redflags
  type ip-port
  permit 23
  permit 1433
  default deny
Similarly, you could have a list of IP addresses in permit statements. You can't combine unrelated items in a primitive, however; to combine port 25 and protocol tcp, you must create a rule.
Now that you have a set of primitives, you can define filters using them. Filters start with the filter-definition keyword and a name, then have a series of match statements. As with primitives, flow-nfilter supports all kinds of statements to match any characteristic of traffic. The match statements I find most useful include ip-protocol, src-ip-addr, dst-ip-addr, and dst-ip-port. You can use a match statement with a primitive only for the traffic type you want to match. For example, if you want to match an IP address, you must list an IP address primitive in the filter definition. You can't match an IP address with a protocol primitive, sensibly enough. You can get a full list of matching types in flow-nfilter(1), or check the sample filter.cfg file distributed with flow-tools.
To write a filter to catch a certain type of traffic, you must understand what makes that traffic unique. For example, consider a rule to match all email traffic to an SMTP server. What characteristics does an inbound SMTP exchange have? Well, it's on port 25, runs over TCP, and has a destination IP of your mail server. That's enough to write a filter definition.
filter-definition inboundmail
  match dst-ip-addr local_mailservers
  match dst-ip-port smtpports
  match ip-protocol tcp
A logical AND joins these terms: a flow must match all of the primitives for this filter to pick it out.
Similarly, you can write a filter definition that will catch all email originating from your mail server and heading out to other servers. The only difference here is that the destination IP is now the source IP:
filter-definition outboundmail
  match src-ip-addr local_mailservers
  match dst-ip-port smtpports
  match ip-protocol tcp
Using these two filters, you can easily run two reports that pick out all traffic to and from your mail server. Using the or keyword, you can combine those two reports into a single report:
filter-definition allmail
  match dst-ip-addr local_mailservers
  match dst-ip-port smtpports
  match ip-protocol tcp
  or
  match src-ip-addr local_mailservers
  match dst-ip-port smtpports
  match ip-protocol tcp
This is just a "superreport" that contains all traffic that meets either of the one-way mail definitions, joined by the or keyword.
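The way match statements and the or keyword combine can be modeled in a few lines of Python: match statements within a group are ANDed, and groups separated by or are ORed. The sketch below encodes the allmail definition that way. The primitives are reduced to simple stand-in tests, and the data layout is my own invention rather than anything flow-nfilter uses internally.

```python
# Stand-ins for the primitives defined earlier; each is a simple test on one value.
PRIMITIVES = {
    "local_mailservers": lambda value: value == "192.168.88.33",
    "smtpports": lambda value: value == 25,
    "tcp": lambda value: value == 6,
}

# allmail as a list of OR'd groups; each group is a list of AND'd (field, primitive) pairs.
ALLMAIL = [
    [("dst-ip-addr", "local_mailservers"), ("dst-ip-port", "smtpports"), ("ip-protocol", "tcp")],
    [("src-ip-addr", "local_mailservers"), ("dst-ip-port", "smtpports"), ("ip-protocol", "tcp")],
]


def filter_matches(definition, flow):
    """True if every match in at least one group accepts the flow."""
    return any(
        all(PRIMITIVES[primitive](flow[field]) for field, primitive in group)
        for group in definition
    )


# Hypothetical flows keyed by the match statement names.
INBOUND = {"src-ip-addr": "192.0.2.10", "dst-ip-addr": "192.168.88.33", "dst-ip-port": 25, "ip-protocol": 6}
OUTBOUND = {"src-ip-addr": "192.168.88.33", "dst-ip-addr": "192.0.2.10", "dst-ip-port": 25, "ip-protocol": 6}
WEB = {"src-ip-addr": "192.0.2.10", "dst-ip-addr": "192.168.88.230", "dst-ip-port": 443, "ip-protocol": 6}

if __name__ == "__main__":
    for name, flow in (("inbound", INBOUND), ("outbound", OUTBOUND), ("web", WEB)):
        print(name, filter_matches(ALLMAIL, flow))
```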
One sort of traffic that I'm particularly interested in is stuff that shouldn't be on my network at all. Nobody should be trying to use telnet(1) to reach my servers, and nobody should be using MS SQL network connections inbound, either. Anyone who tries to telnet into my servers is by definition a troublemaker. I want to know about them. I've previously defined a redflags primitive listing TCP/IP ports that I consider bad:
filter-definition redflags
  match dst-ip-port redflags
  match dst-ip-addr hostingnet
  match ip-protocol tcp
Similarly, anyone who tries to send email to an IP that isn't actually a mail server is a troublemaker--at best, he's infected with a spam-relay virus; at worst, he's an intruder. I want to know about him. Specifically, I want my firewall team to know about him so they can block his IP address!
filter-definition inboundmail_false
  match dst-ip-addr local_not_mailservers
  match dst-ip-port smtpports
  match ip-protocol tcp
I can combine the last two reports to create a "bad guys" report, and then run it via cron(1) on a regular basis and mail the results to the appropriate network minion.
To run flow-nfilter, use flow-cat to dump the binary format from the flow file into a (non-human-readable) format that the other tools can read more easily. Pipe that into flow-nfilter, which will strip out the flows that don't match your desired filter. Flow-nfilter requires two arguments: -F with the name of the filter definition, and -f with the filename of the filter configuration you're using. Finally, flow-print translates the binary flow format into human-readable text:
# flow-cat -p flowfiles | flow-nfilter -F reportname \
    -f /usr/local/etc/flow-tools/filter.cfg | flow-print
For example, here's how to check the flow file ft-v05.2005-07-06.125500-0400 for all inbound email connections. The filename shows that this checks the flows during the five-minute period between 12:55 p.m. and 1:00 p.m. on July 6, 2005.
# flow-cat -p /var/log/netflows/saved/ft-v05.2005-07-06.125500-0400 \
    | flow-nfilter -F inboundmail -f /usr/local/etc/flow-tools/filter.cfg \
    | flow-print
srcIP            dstIP            prot  srcPort  dstPort  octets      packets
126.96.36.199    10.8.3.199       6     4983     25       497         9
188.8.131.52     10.8.3.199       6     2379     25       509         9
184.108.40.206   10.8.3.199       6     3012     25       232         5
...
Arrange this in any order you like with
If a particular connection strikes your interest, you can go into further detail on it with a customized flowdumper expression, or use one of the reports built into flow-print(1). One day, your email administrator will thank you on bended knee for this data--if it occurs to him that it might possibly be available, that is. Fortunately, you can provide this data for any service provided by your network! Once you implement Netflow, you will wonder how you ever solved any problems without it.
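For a report you run regularly, the text that flow-print produces is easy to summarize with a few lines of scripting. The Python sketch below counts flows per source address. It assumes the column headings shown in the inbound mail example earlier (srcIP, dstIP, and so on); the sample text is hypothetical and uses documentation addresses, and the function names are my own.

```python
from collections import Counter

# Hypothetical flow-print style output; source addresses are from documentation ranges.
SAMPLE_OUTPUT = """\
srcIP            dstIP            prot  srcPort  dstPort  octets      packets
192.0.2.10       10.8.3.199       6     4983     25       497         9
192.0.2.10       10.8.3.199       6     2379     25       509         9
198.51.100.7     10.8.3.199       6     3012     25       232         5
"""


def parse_flow_print(text):
    """Turn whitespace-separated report text into a list of dicts keyed by column heading."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


def flows_per_source(text):
    """Count how many flows each source address contributed."""
    return Counter(record["srcIP"] for record in parse_flow_print(text))


if __name__ == "__main__":
    for address, count in flows_per_source(SAMPLE_OUTPUT).most_common():
        print(address, count)
```

In practice you would feed it the saved output of the flow-print pipeline rather than a string literal.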
Michael W. Lucas
Copyright © 2009 O'Reilly Media, Inc.