Published on ONLamp.com (http://www.onlamp.com/)


Building Detailed Network Reports with Netflow

by Michael W. Lucas
10/27/2005

At 2005's USENIX conference, I attended a talk by an intrusion response expert. He described a situation where a company had hired him to "find out what happened" to the network during an intrusion: who broke in, how did they get in, and what did they do? Anyone who has been there can tell you just what a difficult job this is, even with the full cooperation of the company. The network manager happily provided the specialist with all the logs, particularly the firewall logs. In only a few minutes, the expert determined that the hoarded and protected firewall logs were completely useless. They were intact, but the logs only listed blocked traffic. They recorded what didn't happen, not what had happened! (Yes, this is where you go check your own firewall logging.)

One of the best ways to avoid a situation like this is to record what actually happened, and that's where Netflow comes in. My previous two articles showed how to monitor network traffic with Netflow and how to use flow-tools and FlowScan to generate pretty, detailed graphs of that traffic. Netflow can also provide almost any level of visibility into what crosses your network: Netflow records tell us what did happen, and the installation from the first article includes all sorts of tools for digging into them.

People need many different things from Netflow data; they've written countless tools to separate that data in the manner they'd like, and then posted them on the Web to help other people. The end result is dozens of slightly different Netflow data processors, all publicly available. Definitely spend some quality time with Google before writing your own. The rest of this article discusses some tools that I find useful in my day-to-day work, which you have already installed during the Netflow setup discussed in the previous articles.

All of these commands process flow files nondestructively. Throughout the rest of this article, the placeholder flowfiles shows where your flow file names go; you can list multiple files or use wildcards. Note that you must either be in the directory containing the flow files or give the full path to them.

flowdumper

To peer deeply into individual flows, try flowdumper(1). By default, flowdumper writes all the flows in a file to the screen. I strongly recommend using a pager, as you might have thousands of flows in a single file. Flowdumper requires at least one argument, the flow file to parse.

# flowdumper flowfiles | less
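
For instance, assuming your collector stores its files under /var/log/netflows/saved/ (the directory used later in this article), a full path plus a wildcard pulls in an entire day's flows:

# flowdumper /var/log/netflows/saved/ft-v05.2005-07-06.* | less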

If you're interested in details about traffic to a particular host, you can search within the pager for that IP. Here's a sample of a single flow:

FLOW
  index:          0xc7ffff
  router:         172.16.20.1
  src IP:         192.168.1.54
  dst IP:         10.0.8.3
  input ifIndex:  0
  output ifIndex: 0
  src port:       61521
  dst port:       443
  pkts:           10
  bytes:          3015
  IP nexthop:     0.0.0.0
  start time:     Wed Jul  6 10:47:29 2005
  end time:       Wed Jul  6 10:48:32 2005
  protocol:       6
  tos:            0x0
  src AS:         0
  dst AS:         0
  src masklen:    0
  dst masklen:    0
  TCP flags:      0x1e (PUSH|SYN|ACK|RST)
  engine type:    0
  engine id:      0

Your sensor won't tag flows with BGP data if it isn't running BGP, so several of these fields will be zero unless you're collecting flow data from a BGP-speaking border router. The "router" field is the IP address of the Netflow sensor, which is not necessarily a router. Flow records include the source and destination port and IP address, as well as the number of packets and bytes in this single flow. One interesting pair of fields is the start and end times: from them you can determine whether a large flow used a lot of bandwidth for a brief time or a trickle of bandwidth over a longer period. The protocol field corresponds to the entries in /etc/protocols; protocol 6 in this example is TCP. If you're using a BGP-speaking router as your sensor, the flow record will include such things as autonomous system (AS) numbers and address mask lengths.

One of flowdumper's most interesting talents is its ability to speak Perl with the -e flag. Flowdumper uses a whole variety of variables, all defined in perldoc Cflow. Here are the ones I find most useful, and they're mostly self-explanatory. (The possible exception is $exporterip, which is the IP address of the Netflow sensor that transmitted this flow.)

$srcip
$dstip
$srcport
$dstport
$protocol
$tos
$exporterip
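
As a quick sketch combining two of these variables, here's an expression of my own (not a canned report) that pulls out every HTTPS flow from the client in the sample flow above:

# flowdumper -e '"192.168.1.54" eq $srcip && 443 eq $dstport' flowfiles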

I find flowdumper's abilities most useful for answering weird, off-the-cuff questions. For example, back when I worked for an ISP, my boss occasionally asked questions like "Who is using a VPN on our network?" Flowdumper answers that sort of question quickly. (Too bad I didn't know about Netflow at the time!) Standard internet traffic uses TCP, UDP, and ICMP, while VPN protocols such as ESP and GRE are something else entirely, so we grab everything that isn't one of those three.

# flowdumper -e '6 ne $protocol && 17 ne $protocol && 1 ne $protocol' \
    flowfiles

Similarly, packets on my network shouldn't have any unusual Type of Service set. An unusual ToS probably indicates some sort of nefarious activity, or at least something worth a closer look.

# flowdumper -e '0x0 ne $tos' flowfiles

Or I could just grab the flows from a particular sensor and see what traffic passed that part of the network.

# flowdumper -e '"192.168.88.134" eq $exporterip' ft-v05.2005-07-06.10*

flow-stat

Another obvious question is "Who is our biggest traffic consumer?" The flow-stat(1) command allows you to run broad-scale reports on any combination of flow files. The flow-stat man page lists many supported reports. Several of them are not yet implemented, however, and only tell you so when you try to run them; others strike me as not terribly useful unless you're running BGP. The report formats I find most useful are:

5     UDP/TCP destination port
10    Source/Destination IP
11    Source or Destination IP

Indicate the report format with the -f option.

The -s option sorts the results in ascending order, while -S sorts in descending order. Both take a single argument, the number of the column to sort on. Flow-stat numbers columns starting at 0.

For example, suppose that you need to identify the most heavily used TCP/IP ports on your network. This is flow-stat report format 5. When running a new report, I run it once without any sorting option, just to see what the columns are, and then run it a second time sorted on the desired column. Here I want to sort on column 1, which is the number of flows.

# flow-cat -p flowfiles | flow-stat -f 5 -S 1 | less
#  --- ---- ---- Report Information --- --- ---
#
# Fields:    Total
# Symbols:   Disabled
# Sorting:   Descending Field 1
# Name:      UDP/TCP destination port
#
# Args:      flow-stat -f 5 -S 1 
#
#
# port      flows                 octets                packets
#
443         191969                959951160             2554217             
80          42740                 345856613             2384044             
25          1022                  8152412               14890               
53          346                   57947                 730                 
445         307                   20352                 424                 
135         249                   11952                 249                 
110         212                   111870                2342                
44473       203                   247298                757   
...

This report actually goes on for several pages, with a "trailing edge" of ports that have had only a few contacts.

You can learn a lot about my network here. The most popular port is 443, for SSL web traffic. Ports 80 (http) and 25 (smtp) are also popular, as well as 53 (dns). I also get a lot of requests for Microsoft protocol ports, 445 and 135. The big surprise for me is port 110; my network doesn't provide POP3 services! I really need to identify where this traffic is coming from and where it's going. I could use flowdumper for that, or write a more complicated report with flow-nfilter.
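
A one-liner along these lines (using the $dstport variable described earlier) would show exactly who is talking on port 110; consider it a sketch rather than a polished report:

# flowdumper -e '110 eq $dstport' flowfiles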

As Netflow tracks a single TCP/IP connection as two flows (one from the client to the server, and one from the server to the client), you will also see lots of smaller entries for high-numbered ephemeral client ports. One flow's source port is its mirror's destination port, after all.

Perhaps my FlowScan graph displays heavy traffic usage, and I want to see what's going on. One possibility is that a particular client is making especially heavy demands upon one of our servers. I want to see which combinations of clients and servers are using the most flows, which is flow-stat report format 10. I'm sorting on column 2 (flows), in descending order. (Again, I don't know that it's column 2 I want until I run the report unsorted; there is no magic way to extract the knowledge of which column I want from the ether.)

# flow-cat -p flowfiles | flow-stat -f 10 -S 2 | less
#  --- ---- ---- Report Information --- --- ---
#
# Fields:    Total
# Symbols:   Disabled
# Sorting:   Descending Field 2
# Name:      Source/Destination IP
#
# Args:      flow-stat -f 10 -S 2 
#
#
# src IPaddr     dst IPaddr       flows       octets                packets
#
105.157.204.33   192.168.88.230   5167        15766678              52283 
192.168.88.230   105.157.204.33   5123        5200858               34541
105.157.204.33   192.168.88.243   4121        27993332              63995
192.168.88.243   105.157.204.33   4112        30655019              53695
109.116.147.7    192.168.88.243   3071        8296022               23541
192.168.88.243   109.116.147.7    3069        13705493              18890
24.105.3.130     192.168.88.230   1533        4718630               16236
192.168.88.230   24.105.3.130     1521        1326074               11375
...

The 192.168.88 addresses are local servers; everything else is remote. It's common for these reports to show records in pairs, especially when you sort by flows: the top entry is 105.157.204.33 sending traffic to 192.168.88.230, and the second is 192.168.88.230 sending traffic back to 105.157.204.33. That makes sense--the server answers the client about as often as the client talks to the server. The real story here is that one client IP, 105.157.204.33, appears in all of the first four records; that host is obviously the single biggest consumer of these servers' resources.

Another common question is "Which server receives the most connections?" A report on single hosts is format 11. Sort this by flows again, which is column 1.

# flow-cat -p flowfiles | flow-stat -f 11 -S 1 | less
#  --- ---- ---- Report Information --- --- ---
#
# Fields:    Total
# Symbols:   Disabled
# Sorting:   Descending Field 1
# Name:      Source or Destination IP
#
# Args:      flow-stat -f 11 -S 1 
#
#
# IPaddr         flows                 octets                packets
#
192.168.88.230   224442                459688424             1878817
192.168.88.243   153038                1453679456            2635365
192.168.88.247   34503                 124729216             291507

An interesting point here is that the host that receives the most connections is not the host that receives the most octets of traffic or the greatest number of packets. You might find octets (column 2) or packets (column 3) a more sensible measure for your situation. Do whatever works for you.
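
For example, to rank the same hosts by traffic volume rather than by connection count, sort report 11 on the octets column instead:

# flow-cat -p flowfiles | flow-stat -f 11 -S 2 | less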

flow-nfilter

We can do a lot on the fly with flowdumper and can aggregate data nicely with flow-stat, but the detailed data provided by Netflow should let you do just about any sort of report. You should easily be able to list all SMTP connection attempts to systems that aren't supposed to be SMTP servers. Perhaps you could write the flowdumper Expression from the Infernal Regions to accomplish this, but there are better ways--especially if you intend to use this report repeatedly. The flow-nfilter(1) program lets you write detailed reports on Netflow data.

Flow-nfilter's configuration relies on "primitives": small definitions that each describe one characteristic of traffic--a network port, an IP address, and so on. You assemble these primitives into larger filter definitions that constitute reports. By default, flow-nfilter reads these primitives and definitions from /usr/local/etc/flow-tools/filter.cfg. I will start with primitives and proceed to building actual reports.

Every primitive definition starts with the filter-primitive label and a name, followed by the type of primitive. The flow-nfilter man page lists many different primitive types covering just about every situation, but the handful below will get you started. The most commonly used primitive types are ip-protocol, ip-port, ip-address, and ip-address-prefix. The following example is a primitive for the TCP protocol:

filter-primitive TCP
  type ip-protocol
  permit tcp
  default deny

While its name is TCP, the permit statement tells the primitive that it matches anything of protocol type tcp. You could also use 6, the protocol number for TCP as defined in /etc/protocols. The default deny at the end means the primitive doesn't match anything that isn't TCP. (Primitives have an implicit default deny, but I find explicit statements more comfortable.)
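
If you'd rather use the number, an equivalent primitive looks like this (the name tcp_by_number is mine, purely for illustration):

filter-primitive tcp_by_number
  type ip-protocol
  permit 6
  default deny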

Similarly, the ip-port primitive type matches TCP or UDP ports. This primitive, smtpports, matches connections to port 25 and only port 25:

filter-primitive smtpports
  type ip-port
  permit 25
  default deny

The adept among you can already see where this is heading; TCP connections to port 25 are either SMTP conversations or masquerading as SMTP conversations.

You can define entire networks with the ip-address-prefix primitive type. The hostingnet primitive matches the entire 192.168.88.0 network:

filter-primitive hostingnet
  type ip-address-prefix
  permit 192.168.88.0/24
  default deny

Finally, you can match individual IP addresses with the ip-address type:

filter-primitive local_mailservers
  type ip-address
  permit 192.168.88.33
  default deny

Primitives can get quite complicated. For example, suppose I want to make a filter primitive that includes everything that isn't a mail server. The trick to this is to start small and gradually grow.

filter-primitive local_not_mailservers
  type ip-address
  deny 192.168.88.33
  permit 192.168.88.0/24
  default deny

The primitive itself is an IP address--actually, a set of them. It isn't a network, because there's a hole in it. I started by denying my actual mail server's IP. I then accepted the whole block of addresses in the network. At the end, if an IP isn't explicitly permitted, it's denied. This primitive will match any IP address from 192.168.88.0 through 192.168.88.32 and from 192.168.88.34 through 192.168.88.254.

You can list several related items in a single primitive. For example, my company's security policy forbids use of telnet (port 23) and MS SQL (port 1433) connections over the public internet. If I see this traffic on my network, something interesting is going on. This primitive matches both of these.

filter-primitive redflags
  type ip-port
  permit 23
  permit 1433
  default deny

Similarly, you could have a list of IP addresses in permit statements, as in the sketch below. You can't combine unrelated items in a single primitive, however; to combine port 25 with protocol TCP, you must create a filter definition.
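
For example, a primitive covering the two busy servers from the flow-stat reports earlier might look like this (the name and the particular addresses are just my illustration):

filter-primitive busy_servers
  type ip-address
  permit 192.168.88.230
  permit 192.168.88.243
  default deny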

Now that you have a set of primitives, you can define filters using them. Filters start with the filter-definition keyword and a name, then have a series of match statements. As with primitives, flow-nfilter supports all kinds of statements to match any characteristic of traffic. The match statements I find most useful include src-ip-addr, dst-ip-addr, src-ip-port, dst-ip-port, and ip-protocol.

Each match statement must use a primitive of the corresponding type: to match an IP address, you must list an IP address primitive in the filter definition. You can't match an IP address with a protocol primitive, sensibly enough. You can get a full list of match types in flow-nfilter(1), or check the sample filter.cfg file distributed with flow-tools.

To write a filter that catches a certain type of traffic, you must understand what makes that traffic unique. For example, consider a filter to match all email traffic to an SMTP server. What characteristics does an inbound SMTP exchange have? Well, it's on port 25, runs over TCP, and has a destination IP of your mail server. That's enough to write a filter definition.

filter-definition inboundmail
  match dst-ip-addr local_mailservers
  match dst-ip-port smtpports
  match ip-protocol TCP

An implicit AND joins these terms. Any flow must match all the primitives for this filter to pick it out.

Similarly, you can write a filter definition that will catch all email originating from your mail server and heading out to other servers. The only difference here is that the destination IP is now the source IP:

filter-definition outboundmail
  match src-ip-addr local_mailservers
  match dst-ip-port smtpports
  match ip-protocol TCP

With these two filters, you can run two separate reports that pick out all traffic to and from your mail server. The or keyword lets you combine them into a single report:

filter-definition allmail
  match dst-ip-addr local_mailservers
  match dst-ip-port smtpports
  match ip-protocol TCP
  or
  match src-ip-addr local_mailservers
  match dst-ip-port smtpports
  match ip-protocol TCP

This is just a "superreport" that contains all traffic that meets either of the one-way mail definitions, joined by the or keyword.

One sort of traffic that I'm particularly interested in is stuff that shouldn't be on my network at all. Nobody should be trying to use telnet(1) to reach my servers, and nobody should be using MS SQL network connections inbound, either. Anyone who tries to telnet into my servers is by definition a troublemaker. I want to know about them. I've previously defined a redflags primitive listing TCP/IP ports that I consider bad:

filter-definition redflags
  match dst-ip-port redflags
  match dst-ip-addr hostingnet
  match ip-protocol TCP

Similarly, anyone who tries to send email to an IP that isn't actually a mail server is a troublemaker--at best, he's infected with a spam-relay virus; at worst, he's an intruder. I want to know about him. Specifically, I want my firewall team to know about him so they can block his IP address!

filter-definition inboundmail_false
  match dst-ip-addr local_not_mailservers
  match dst-ip-port smtpports
  match ip-protocol TCP

I can combine the last two reports to create a "bad guys" report, and then run it via cron(1) on a regular basis and mail the results to the appropriate network minion.
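
Following the allmail pattern above, that combined filter definition might look something like this (the name badguys is mine):

filter-definition badguys
  match dst-ip-port redflags
  match dst-ip-addr hostingnet
  match ip-protocol TCP
  or
  match dst-ip-addr local_not_mailservers
  match dst-ip-port smtpports
  match ip-protocol TCP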

To run a report, use flow-cat to read the binary flow files and feed them to the other tools. Pipe that into flow-nfilter, which strips out the flows that don't match your filter. Flow-nfilter requires two arguments: -F with the name of the filter definition to use, and -f with the path to the filter configuration file. Finally, flow-print translates the binary flow format into human-readable text:

# flow-cat -p flowfiles | flow-nfilter -F reportname \
    -f /usr/local/etc/flow-tools/filter.cfg | flow-print

For example, here's how to check the flow file ft-v05.2005-07-06.125500-0400 for all inbound email connections. The filename shows that this covers the five-minute period starting at 12:55 p.m. on July 6, 2005.

# flow-cat -p /var/log/netflows/saved/ft-v05.2005-07-06.125500-0400 \
    | flow-nfilter -F inboundmail -f /usr/local/etc/flow-tools/filter.cfg \
    | flow-print
srcIP            dstIP            prot  srcPort  dstPort  octets      packets
82.158.144.145   10.8.3.199      6     4983     25       497         9         
65.34.168.153    10.8.3.199      6     2379     25       509         9         
200.94.218.251   10.8.3.199      6     3012     25       232         5   
...

Arrange this in any order you like with sort(1).
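
As for the "bad guys" report mentioned earlier, a crontab entry roughly like this one would mail it out every morning. The schedule, file glob, and address are placeholders; in practice you'd narrow the glob to the files covering the period you care about.

# every morning at 6, run the badguys filter over the saved flows and mail the result
0 6 * * * flow-cat -p /var/log/netflows/saved/ft-v05.* | flow-nfilter -F badguys -f /usr/local/etc/flow-tools/filter.cfg | flow-print | mail -s "badguys report" security@example.com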

If a particular connection catches your interest, you can go into further detail on it with a customized flowdumper expression, or use one of the reports built into flow-print(1). One day, your email administrator will thank you on bended knee for this data--if it occurs to him that it might possibly be available, that is. Fortunately, you can provide this data for any service your network offers! Once you implement Netflow, you will wonder how you ever solved any problems without it.


Copyright © 2009 O'Reilly Media, Inc.