Another obvious question is "Who is our biggest traffic consumer?" The
flow-stat(1) command allows you to run broad-scale reports on any combination of flow files. The flow-stat man page lists many supported reports. Several of them are not yet implemented, however, and inform you so only when you try to run them. Others strike me as not entirely useful unless you're running BGP. Some of the report formats I find most useful are:
5--TCP/UDP destination port. This counts traffic in both directions, so you must sort it carefully.
10--Source and destination IP. This lets you sort traffic usage between particular machines, so you can identify connections that hog bandwidth.
11--Source or destination IP. This lets you identify particular machines that use large amounts of traffic.
Indicate the report format with the
-s option sorts the results in ascending order, while
-S sorts in descending order. Both take a single argument, the number of the column to sort on. Flow-stat numbers columns starting at 0.
For example, suppose that you need to identify the most heavily used TCP/IP ports on your network. This is flow-stat report format 5. When running a new report, I run it once without any sorting option, just to see what the columns are, and then run it a second time sorted on the desired column. Here I want to sort on column 1, which is the number of flows.
# flow-cat -p flowfiles | flow-stat -f 5 -S 1 | less # --- ---- ---- Report Information --- --- --- # # Fields: Total # Symbols: Disabled # Sorting: Descending Field 1 # Name: UDP/TCP destination port # # Args: flow-stat -f 5 -S 1 # # # port flows octets packets # 443 191969 959951160 2554217 80 42740 345856613 2384044 25 1022 8152412 14890 53 346 57947 730 445 307 20352 424 135 249 11952 249 110 212 111870 2342 44473 203 247298 757 ...
This report actually goes on for several pages, with a "trailing edge" of ports that have had only a few contacts.
You can learn a lot about my network here. The most popular port is 443, for SSL web traffic. Ports 80 (http) and 25 (smtp) are also popular, as well as 53 (dns). I also get a lot of requests for Microsoft protocol ports, 445 and 135. The big surprise for me is port 110; my network doesn't provide POP3 services! I really need to identify where this traffic is coming from and where it's going. I could use flowdumper for that, or write a more complicated report with flow-nfilter.
As Netflow tracks a single TCP/IP connection as two flows (one from the client to the server, and one from the server to the client), you will see lots of smaller requests to off-numbered ports. One flow's source is its mirror's destination, after all.
Perhaps my FlowScan graph displays heavy traffic usage, and I want to see what's going on. One possibility is that a particular client is making especially heavy demands upon one of our servers. I want to see which combinations of clients and servers are using the most flows, which is flow-stat report format 10. I'm sorting on column 2 (flows), in descending order. (Again, I don't know that it's column 2 I want until I run the report unsorted; there is no magic way to extract the knowledge of which column I want from the ether.)
# flow-cat -p flowfiles | flow-stat -f 10 -S 2 | less # --- ---- ---- Report Information --- --- --- # # Fields: Total # Symbols: Disabled # Sorting: Descending Field 2 # Name: Source/Destination IP # # Args: flow-stat -f 10 -S 2 # # # src IPaddr dst IPaddr flows octets packets # 188.8.131.52 192.168.88.230 5167 15766678 52283 192.168.88.230 184.108.40.206 5123 5200858 34541 220.127.116.11 192.168.88.243 4121 27993332 63995 192.168.88.243 18.104.22.168 4112 30655019 53695 22.214.171.124 192.168.88.243 3071 8296022 23541 192.168.88.243 126.96.36.199 3069 13705493 18890 188.8.131.52 192.168.88.230 1533 4718630 16236 192.168.88.230 184.108.40.206 1521 1326074 11375 ...
The 192.168.88 addresses are local servers, and everything else is remote. Obviously, the 220.127.116.11 host is the single biggest client--the first four lines involve that host! One thing to realize is that it's common for these reports to give pairs of records together, especially if you're sorting by flows. My biggest flow user is 18.104.22.168 sending traffic to 192.168.88.230, and the second flow is 192.168.88.230 sending traffic back to 22.214.171.124. That makes sense--the server is answering the client about as often as the client talks to the server. In this case, though, one client IP appears in the first four records; it's obviously using the resources heavily.
Another common question is "Which server receives the most connections?" A report on single hosts is format 11. Sort this by flows again, which is column 1.
# flow-cat -p flowfiles | flow-stat -f 11 -S 1 | less # --- ---- ---- Report Information --- --- --- # # Fields: Total # Symbols: Disabled # Sorting: Descending Field 1 # Name: Source or Destination IP # # Args: flow-stat -f 11 -S 1 # # # IPaddr flows octets packets # 192.168.88.230 224442 459688424 1878817 192.168.88.243 153038 1453679456 2635365 192.168.88.247 34503 124729216 291507
An interesting point here is that the host that receives the most connections is not the host that receives the most octets of traffic or the greatest number of packets. You might find octets (column 2) or packets (column 3) a more sensible measure for your situation. Do whatever works for you.