

Having worked with networking gear for many years, I thought it was about time to jump in and post something to our blog, and why not start by talking about pcap files? As most of you already know, when testing and supporting networking products, it is common to be handed a big pcap file. Often the file is so big that it is slow to open in Wireshark at best, and impossible to open at worst. Make no mistake, I am a big fan of Wireshark and cannot remember a day here on the job when I didn't use this wonderful tool. But the question is, how do you complete tasks such as "grab some TCP sessions where there is no data from the server" if opening a 200MB pcap file crashes Wireshark every time?
No worries, programming to the rescue!
To solve the problem I used Perl (feel free to use your favorite language) to open a pcap file and do some analysis. Let us look at finding sessions where the client sent data but the server didn't send any data in response. To make it easy I've included all the steps I took and, where appropriate, the code. Since the point is to illustrate how to use a scripting language like Perl to do the job, the code is greatly simplified. For the reader's convenience, the complete code is listed at the end.
Step 1. Open the pcap file and put it in binary mode:
$inputFile = $ARGV[0]; #the first command line parameter is the name of the pcap file
open(FD, "<$inputFile") || die "failed to open $inputFile $!\n";
#a pcap file is a binary file, so open it in binary mode
binmode(FD);
Step 2a. Knowing the structure of the pcap file is helpful here: a pcap file starts with a 24-byte file header followed by a sequence of packets. The most important part of the file header is the first 4 bytes, which reveal the endianness of the file.
read(FD, $fileHdr, 24); #read the 24 byte pcap file header
checkFileHdr($fileHdr); #the routine is defined later, it checks the file header to find endianness
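The file header carries more than the byte order. As a minimal sketch (the helper name parse_file_header and the synthetic header are illustrative assumptions, not part of the script in this post), the remaining fields can be decoded in the same way. The link type is worth checking, because the fixed offsets used later assume Ethernet frames (link type 1):

```perl
use strict;
use warnings;

# Decode the 24-byte pcap file header: magic, version, time zone offset,
# timestamp accuracy, snapshot length and link type.
sub parse_file_header {
    my ($hdr) = @_;
    die "short file header\n" unless length($hdr) == 24;
    my $magic = unpack("N", substr($hdr, 0, 4));
    my ($long, $short);
    if    ($magic == 0xa1b2c3d4) { ($long, $short) = ("N", "n"); }  # big-endian file
    elsif ($magic == 0xd4c3b2a1) { ($long, $short) = ("V", "v"); }  # little-endian file
    else  { die sprintf("unexpected magic 0x%08x\n", $magic); }
    # x4 skips the magic number that was already examined
    my ($major, $minor, $thiszone, $sigfigs, $snaplen, $linktype) =
        unpack("x4 $short $short $long $long $long $long", $hdr);
    return { order => $long, major => $major, minor => $minor,
             snaplen => $snaplen, linktype => $linktype };
}

# Synthetic little-endian header: version 2.4, snaplen 65535, link type 1 (Ethernet)
my $hdr  = pack("V v v V V V V", 0xa1b2c3d4, 2, 4, 0, 0, 65535, 1);
my $info = parse_file_header($hdr);
printf "pcap %d.%d, snaplen %d, link type %d\n",
    @{$info}{qw(major minor snaplen linktype)};
```

The returned order ("N" or "V") can be repeated four times to build the packet header format that checkFileHdr stores in $pktHdrFormat.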
Step 2b. Each packet consists of a 16-byte packet header (timestamp, lengths, etc.; be sure not to confuse it with a protocol header) followed by the packet data. For details, see any description of the pcap file format. We process the packets by first reading the 16 bytes of header and then reading the packet data. Note how endianness plays a role here.
while (!eof(FD))
{
read(FD, $pktHdr, 16);
my ($pktSecond,$pktMicroSecond,$capturedPktLen,$actualPktLen)
= unpack($pktHdrFormat, $pktHdr);
#endianness determines how time and packet length are stored.
if (read(FD, $pktBuf, $capturedPktLen) != $capturedPktLen )
{ print "Failed to read pkt data\n"; last;}
$proto = unpack("C", substr($pktBuf, 23,1));
#this is the byte in ip hdr for protocol type, UDP is 17, TCP is 6
if ($proto == 6) #we are only interested in TCP packet for this task
{
processTCPPkt(); #we will explain this function next
}
}
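The loop above trusts every read. If you want the extra error checking, one possible shape is a small helper that reads a single record and reports truncation. This is only a sketch: the name read_packet and the in-memory test stream are assumptions made for illustration, and the format string is passed in rather than taken from $pktHdrFormat.

```perl
use strict;
use warnings;

# Read one packet record. Returns an empty list at a clean end of file and
# dies if the file ends in the middle of a record.
sub read_packet {
    my ($fh, $fmt) = @_;
    my $got = read($fh, my $pktHdr, 16);
    die "read error: $!\n" unless defined $got;
    return if $got == 0;                          # clean end of file
    die "truncated packet header\n" if $got != 16;
    my ($sec, $usec, $caplen, $origlen) = unpack($fmt, $pktHdr);
    $got = read($fh, my $data, $caplen);
    die "truncated packet data\n" unless defined $got && $got == $caplen;
    return ($sec, $usec, $caplen, $origlen, $data);
}

# Two synthetic little-endian records held in memory instead of a real file
my $stream = pack("VVVV", 1, 500000, 4, 60) . "abcd"
           . pack("VVVV", 2, 0, 2, 2) . "xy";
open(my $fh, "<", \$stream) or die "cannot open in-memory stream: $!\n";
binmode($fh);
while (my ($sec, $usec, $caplen, $origlen, $data) = read_packet($fh, "VVVV")) {
    printf "%d.%06d captured %d of %d bytes\n", $sec, $usec, $caplen, $origlen;
}
close($fh);
```

The captured length can be smaller than the original length when the capture used a snapshot limit, so any code that indexes into the packet data should check the length first.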
Step 3. In the function "processTCPPkt", we need to know the offsets of various fields, such as the IP header protocol field, source IP address, destination IP address, source TCP port, destination TCP port and more. How? Once again I rely on Wireshark (with a small pcap file, of course). It shows, for example, that the source IP address is at offset 26 and the source TCP port is at offset 34 (assuming the packet has no VLAN tags or IP options). This function looks at each TCP packet. If it's a TCP SYN packet, we set up a hash entry for the TCP session using the source IP and port and the destination IP and port. If it's a data packet, we use the hash to find out whether it came from the client or the server, and we record in the TCP session state whether client data or server data has been seen.
sub processTCPPkt
{
my $srcIp = substr($pktBuf, 26,4);
my $dstIp = substr($pktBuf, 30,4);
my $srcPort = substr($pktBuf, 34,2);
my $dstPort = substr($pktBuf, 36,2);
my $tcpFlags = unpack("C", substr($pktBuf, 47,1));
if ($tcpFlags == 2) #2 means TCP SYN
{
$hashKey = "$srcIp$srcPort$dstIp$dstPort";
if (! defined $tcpSessionState{$hashKey})
{
$tcpSessionState{$hashKey} = 0;
#note that TCP session is only hashed from client side to server side,
#i.e. $srcIp is the TCP client's IP
}
} elsif ($tcpFlags == 0x18) #TCP PSH ACK pkt, a.k.a data packet
{
$hashKey = "$srcIp$srcPort$dstIp$dstPort";
if (defined $tcpSessionState{$hashKey})
#this pkt is from client, see the above for tcp session hash set up
{
$tcpSessionState{$hashKey} |= 1;
#so if the session has client data, the least significant bit will be 1
} else
{
$hashKey = "$dstIp$dstPort$srcIp$srcPort";
if (defined $tcpSessionState{$hashKey})
#this pkt is from server
{
$tcpSessionState{$hashKey} |= 2;
#if the session has server data, the second least significant bit will be 1
}
}
}
}
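The fixed offsets above hold only for untagged Ethernet frames carrying IPv4 without options. If your capture does not meet that assumption, the offsets can be derived from the headers instead. The following is a rough sketch, not the method used in this post: the function name tcp_fields and the hand-built test frames are illustrative, and it handles only a single 802.1Q tag.

```perl
use strict;
use warnings;

# Locate the TCP ports and flags in an Ethernet frame, allowing for one
# 802.1Q VLAN tag and for IP options. Returns an empty list for anything else.
sub tcp_fields {
    my ($pkt) = @_;
    return if length($pkt) < 14;
    my $ipOff     = 14;
    my $etherType = unpack("n", substr($pkt, 12, 2));
    if ($etherType == 0x8100) {                    # 802.1Q tag adds 4 bytes
        return if length($pkt) < 18;
        $etherType = unpack("n", substr($pkt, 16, 2));
        $ipOff     = 18;
    }
    return unless $etherType == 0x0800;            # IPv4 only
    return if length($pkt) < $ipOff + 20;
    my $ipLen = (unpack("C", substr($pkt, $ipOff, 1)) & 0x0f) * 4;
    return if $ipLen < 20;
    return unless unpack("C", substr($pkt, $ipOff + 9, 1)) == 6;   # TCP
    my $tcpOff = $ipOff + $ipLen;
    return if length($pkt) < $tcpOff + 14;
    my ($srcPort, $dstPort) = unpack("n n", substr($pkt, $tcpOff, 4));
    my $flags = unpack("C", substr($pkt, $tcpOff + 13, 1));
    return ($srcPort, $dstPort, $flags);
}

# Hand-built SYN from 1.1.17.167:7697 to 1.2.167.89:80, plain and VLAN tagged
my $ip  = pack("C C n n n C C n C4 C4", 0x45, 0, 40, 0, 0, 64, 6, 0,
               1, 1, 17, 167, 1, 2, 167, 89);
my $tcp = pack("n n N N C C n n n", 7697, 80, 0, 0, 0x50, 0x02, 8192, 0, 0);
my $mac = "\0" x 6;
my $plain  = $mac . $mac . pack("n", 0x0800) . $ip . $tcp;
my $tagged = $mac . $mac . pack("n n n", 0x8100, 100, 0x0800) . $ip . $tcp;

for my $frame ($plain, $tagged) {
    my ($srcPort, $dstPort, $flags) = tcp_fields($frame) or next;
    my $isSyn = ($flags & 0x12) == 0x02;           # SYN set, ACK clear
    printf "%d -> %d flags 0x%02x%s\n", $srcPort, $dstPort, $flags,
        $isSyn ? " (SYN)" : "";
}
```

Testing the flags with a mask, as in the last few lines, also matches SYN packets that carry extra bits such as ECN flags, which an exact comparison with 2 would miss.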
Step 4. After we go through all the packets, we go through all the TCP sessions and print the sessions that have no server data.
foreach $tcpSession (keys %tcpSessionState)
{
if (($tcpSessionState{$tcpSession} & 2) == 0)
{
printSession($tcpSession);
}
}
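The test above selects every session without server data, including sessions where the client never sent anything either. If you want only sessions where the client sent data and the server stayed silent, a stricter filter could look like the sketch below. The constant names, the helper format_session and the three sample sessions are assumptions for illustration; the key layout is the same 12 bytes the script uses.

```perl
use strict;
use warnings;
use constant { CLIENT_DATA => 1, SERVER_DATA => 2 };

# Key layout: client IP (4 bytes), client port (2), server IP (4), server port (2)
sub format_session {
    my ($key) = @_;
    return sprintf("%d.%d.%d.%d:%d --> %d.%d.%d.%d:%d", unpack("C4 n C4 n", $key));
}

# Sample states: client data only, data in both directions, handshake only
my %tcpSessionState = (
    pack("C4 n C4 n", 1, 1, 17, 167,  7697, 1, 2, 167, 89,  80) => CLIENT_DATA,
    pack("C4 n C4 n", 1, 1, 172, 218, 7840, 1, 2, 26, 37,   80) => CLIENT_DATA | SERVER_DATA,
    pack("C4 n C4 n", 1, 1, 121, 19,  7698, 1, 2, 122, 238, 80) => 0,
);

# Keep sessions with the client bit set and the server bit clear
my @unanswered = grep {
    my $state = $tcpSessionState{$_};
    ($state & CLIENT_DATA) && !($state & SERVER_DATA);
} keys %tcpSessionState;

printf "%s\n", format_session($_) for sort @unanswered;
```

Of the three sample sessions, only the first one passes the filter.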
We save it to a file, say, “findBadSessions.pl”. Here is part of the sample output for the command:
perl findBadSessions.pl bigPcap.pcap
1.1.17.167:7697 --> 1.2.167.89:80
1.1.172.218:7840 --> 1.2.26.37:80
1.1.121.19:7698 --> 1.2.122.238:80
1.1.127.196:7652 --> 1.2.129.100:80
1.1.172.131:7532 --> 1.2.174.40:80
You can then grab a TCP session using tcpdump or windump:
windump -r "bigPcap.pcap" -w output.pcap host 1.1.17.167 and tcp port 7697
Your output.pcap will then contain the TCP session, and it is small enough to be opened in Wireshark.
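With many sessions in the report, typing these commands by hand gets tedious. As a small sketch, the script's output lines can be turned into command lines automatically. The function name extract_command and the output file naming are assumptions; only the -r and -w options and the filter shown above are used, and the commands are printed rather than executed.

```perl
use strict;
use warnings;

# Turn one "client:port --> server:port" line into a tcpdump/windump command.
sub extract_command {
    my ($tool, $pcap, $line) = @_;
    my ($ip, $port) = $line =~ /^(\d+\.\d+\.\d+\.\d+):(\d+) -->/
        or return;                                 # not a session line
    return sprintf('%s -r "%s" -w "%s_%d.pcap" host %s and tcp port %d',
                   $tool, $pcap, $ip, $port, $ip, $port);
}

my @lines = (
    "1.1.17.167:7697 --> 1.2.167.89:80",
    "1.1.172.218:7840 --> 1.2.26.37:80",
);
for my $line (@lines) {
    my $cmd = extract_command("tcpdump", "bigPcap.pcap", $line);
    print "$cmd\n" if defined $cmd;
}
```

Each session ends up in its own small file named after the client address and port.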
Since the code is arranged to be easy to read, you may want to reformat it in your favorite coding style and add more error checking as you see fit. Whatever you do, let me know in the comments section. You may also be wondering about the speed of Perl when processing a big pcap file. Yes, a program written in C is faster, but Perl is also fast. On my Windows XP machine (3.1GHz, dual core), I ran this Perl program on a 234MB pcap file with 13655 TCP sessions and it took about 2 seconds.
With this method we can also do the following:
One of the better side benefits of completing this task is when you hear compliments from your colleagues in the form of the question, "How did you find the needle in the haystack?".
For the sake of completeness, here is the entire Perl script:
my $pktHdrFormat; #depending on the endianness of the file hdr, the 16 byte pkt hdr should be read accordingly
my %tcpSessionState; #this is used to keep track of all the TCP sessions.
$inputFile = $ARGV[0]; #the first command line parameter is the name of the pcap file
open(FD, "<$inputFile") || die "failed to open $inputFile $!\n";
binmode(FD); #pcap file is a binary file, so open it in binary mode
read(FD, $fileHdr, 24); #read the 24 byte pcap file header
checkFileHdr($fileHdr); #check the file header to find endianness
#now process each packet
my $pktSecond;
my $pktMicroSecond;
my $capturedPktLen;
my $actualPktLen;
while (!eof(FD))
{
read(FD, $pktHdr, 16);
my ($pktSecond,$pktMicroSecond,$capturedPktLen,$actualPktLen) = unpack($pktHdrFormat, $pktHdr);
#print "$pktSecond,$pktMicroSecond,$capturedPktLen,$actualPktLen\n";
if (read(FD, $pktBuf, $capturedPktLen) != $capturedPktLen ) { print "Failed to read pkt data\n"; last;}
$proto = unpack("C", substr($pktBuf, 23,1));
#this is the byte in ip hdr for protocol type, UDP is 17, TCP is 6
if ($proto == 6) #we are only interested in TCP packet for this task
{
processTCPPkt();
}
}
foreach $tcpSession (keys %tcpSessionState)
{
if (($tcpSessionState{$tcpSession} & 2) == 0)
{
printSession($tcpSession);
}
}
#this function will look at tcp packet, if it's a TCP SYN, we set up a hash for the TCP session
#using src IP and port, dst IP and port. If it's data packet, we want to find whether it's from
#client or server by using hash. For data packet from client, we make sure the tcp session state
#keep tracks of whether there is client data or server data
sub processTCPPkt
{
my $srcIp = substr($pktBuf, 26,4);
my $dstIp = substr($pktBuf, 30,4);
my $srcPort = substr($pktBuf, 34,2);
my $dstPort = substr($pktBuf, 36,2);
my $tcpFlags = unpack("C", substr($pktBuf, 47,1));
if ($tcpFlags == 2) #2 means TCP SYN
{
$hashKey = "$srcIp$srcPort$dstIp$dstPort";
if (! defined $tcpSessionState{$hashKey})
{
#print "SYN\n";
$tcpSessionState{$hashKey} = 0;
#note that TCP session is only hashed from client side to server side,
#i.e. $srcIp is the TCP client's IP
}
} elsif ($tcpFlags == 0x18) #TCP PSH ACK pkt, a.k.a data packet
{
$hashKey = "$srcIp$srcPort$dstIp$dstPort";
if (defined $tcpSessionState{$hashKey})
#this pkt is from client, see the above for tcp session hash set up
{
$tcpSessionState{$hashKey} |= 1;
#so if the session has client data, the least significant bit will be 1
} else
{
$hashKey = "$dstIp$dstPort$srcIp$srcPort";
if (defined $tcpSessionState{$hashKey})
#this pkt is from server
{ $tcpSessionState{$hashKey} |= 2;
#if the session has server data, the second least significant bit will be 1
}
}
}
}
sub checkFileHdr
{
my $fHdr = shift;
my $signature = unpack("N", substr($fHdr,0,4));
if ($signature == 0xa1b2c3d4)
{
$pktHdrFormat = "NNNN";
} elsif ($signature == 0xd4c3b2a1)
{
$pktHdrFormat = "VVVV";
} else
{
die "unexpected signature bytes";
}
}
sub printSession
{
my $session = shift;
my @B = unpack("C*", $session);
printf "%d.%d.%d.%d:%d --> %d.%d.%d.%d:%d\n",
$B[0], $B[1],$B[2],$B[3], $B[4]*256+$B[5], $B[6],$B[7],$B[8],$B[9], $B[10]*256+$B[11];
}
This morning I found myself very frustrated as I attempted to fill out my expense report. Keeping track of expenses is a painful exercise. You must remember to get receipts, remember to save those receipts, keep those boarding passes and submit the report in a timely manner. Let’s face it. It is an enormous pain. Here is where my American Express card typically saves the day.
I do love my American Express card. It has never let me down, and while I do a fairly good job of keeping up with my receipts, sometimes things slip my mind or I lose a receipt. In those cases the first thing I do is log in to my American Express account and check out all my statements (since I'm so environmentally friendly I've rid myself of those pesky paper statements). It is a great resource to double-check that I haven't missed any expenses, and it ensures I'm reimbursed for my work-related expenses.
Unfortunately, when I logged on this morning, this is what I saw:

Not only am I unable to access my account, they are unable to write an error message in proper English. Of course, once I saw this message I thought I’d simply hit the trusty back arrow and try again, except this time this is what I got:

“We’re Sorry”? That’s all you have to say about this? And you don't even have the decency to properly format the HTML to display in my browser?
In 2009 American Express did about $24 Billion in revenue with a net income of about $2.1 Billion, yet they haven't figured out whether their network and application infrastructure can handle the load of customers simply trying to check their recent card activity. You could probably argue that it is not a big deal if a handful of their customers can't log in to see their statements and pay their bills. But if their customers are unable to log in and pay their bills, maybe those customers forget to “try again sometime” and pay their bills a lot later. If you've ever studied finance, you probably know about the time value of money. All other things being equal, American Express would much rather have my credit card payment today than next week, especially considering the financial climate over the last 18 months. Because of their network and application infrastructure failure, they aren't going to get my payment today.
While this in and of itself is a sizable issue, it really is more of a symptom of a larger and potentially far more catastrophic problem: our security. American Express’ inability to support the required number of users logged into their network indicates a series of problems with their network infrastructure. Certain devices that make up their network infrastructure are unable to handle the load associated with everyday users logging in. It is very likely that the issue is isolated to a very small segment of their network infrastructure, but here is the problem: if one of the elements in their infrastructure is incapable of handling the requisite load, the entire infrastructure suffers.
What my experience this morning has told me is that American Express hasn't validated the resiliency of their network infrastructure, particularly under load. What I'd really like to know is what would happen if American Express was subjected to a larger scale cyber attack. Would our personal information be vulnerable to evildoers? When validating a network or IT infrastructure there is a critical interrelationship between realistic applications, security, performance, currency and concurrency.
The ability to handle all of those factors at the same time is what defines a resilient cyber infrastructure. Here is where American Express has failed. They certainly have tried to validate that their network is secure. And it might be. But clearly they have not validated their infrastructure under stressful performance conditions; otherwise I would have been able to access my account. Thus, they haven't really validated that their network is secure. They've missed the concurrency aspect of resiliency.
Naysayers might claim that it is too hard to do all of this validation and be prepared for any conditions. And up until recently, they may have actually been right. But things have changed. One example is our announcement of the BreakingPoint Resiliency Score last week. This concept brings a scientific method to validating cyber infrastructure resiliency and removes the guesswork from the entire process.
Maybe if American Express had validated their cyber infrastructure's resiliency, I'd have been able to submit my expense report.
On the show floor at RSA Conference there is a lot happening, and overall the show seems much better attended than last year. This show, as most of you know, is also a harbinger of news releases and product announcements. Crossbeam, a provider of scalable software and hardware platforms, distributed a few pieces of news leading up to the show and at the conference itself. I went over to visit the Crossbeam booth (#545) while at RSA to check out a live demonstration of their X-Series security platform using four BreakingPoint Elite chassis. With this impressive demonstration in the background I talked with Crossbeam's Peter Doggart.
Q. First off Peter, can you provide us with an overview of what Crossbeam provides?
Crossbeam’s X-Series security platform lets customers virtualize third-party, best-in-class security applications and scale them to meet the needs of large, high-performance network environments. Today, more than 900 leading enterprises and service providers, including 10 of the top 11 telecom carriers worldwide, rely on Crossbeam as the underlying architecture for the delivery of security services.
Q. Crossbeam is demonstrating something very interesting here at RSA, can you tell us about what is going on and why?
In working with service providers over the past year, and in particular mobile network operators (MNOs), it has become evident that they are under enormous pressure to meet growing network demands while simultaneously delivering “clean” data pipes.
What we are showing at RSA is proof that our X-Series security platform delivers the world’s fastest firewall performance to meet the needs of mobile operators. Using BreakingPoint Elite, we are stress-testing the X-Series chassis. We are running a best-in-class application on the X-Series, Check Point Security Gateway R70 Firewall, to clean, inspect and secure the traffic.
This demonstration shows how service providers and mobile carriers can easily scale their network security infrastructure to cope with the next generation of mobile technology, 4G/LTE, under real-world conditions.
Q. You mention “real world” a few times in your answer and in the news release that went out. What does that mean to mobile network operators?
There is a growing gap between what vendors state on their data sheets and what we typically see out in the real world in terms of performance. There are two key elements at play in the real world. First, we are seeing more attacks, which place a greater burden on our security systems and, second, we are seeing smaller payload sizes, especially with the growing number of mobile devices. The result is that mobile operators need to buy and manage a lot more equipment than they budgeted for as the real-world demands are far greater than they ever anticipated. This is not only more costly to them, but it is also a lot more complex to manage.
Realistic tests like this one at RSA validate that we deliver the fastest-performing firewall on the market under real-world conditions which means that we can stand behind our performance claims and mobile network operators can be assured that their X-Series security infrastructure delivers the flexibility, superior performance and high availability required to handle the unpredictability of growing data traffic demands.
Q. How can this type of validation, throughout the industry, not just at Crossbeam, help the overall performance of MNOs?
Crossbeam’s policy is to be transparent when it comes to performance claims. We are doing the opposite of what many vendors do by actually creating tests that provide worst case metrics, not the best case. Take the RSA live demonstration. We are using BreakingPoint to generate 96 byte HTTP packets, which in the real mobile world would be the worst case payload size. At Crossbeam, we want to create some real-world industry guidelines that everyone follows so mobile operators, government and enterprise customers understand exactly what they are buying, and can capacity plan correctly.
Q. I noticed four BreakingPoint chassis in the Crossbeam booth generating the traffic for the demonstration. Why does Crossbeam use BreakingPoint for product validation?
First, we use the BreakingPoint Elite chassis because they can accurately simulate the type of traffic we see in the real world and, second, because BreakingPoint is the only vendor that can push the Crossbeam chassis to its current performance limits.
Q. How has using BreakingPoint helped the evolution of Crossbeam products?
Because BreakingPoint equipment pushes our chassis to its absolute limits, Crossbeam is better able to fine-tune its performance to address customer needs with the assurance that the X-Series can handle their network demands. In the latest release of the X-Series operating system, for instance, we boosted the number of concurrent IP connections we can support up to 10 million, and increased the new connection rates per second to 320,000. These numbers are critical to mobile operators who need to support the growing number of smartphones and other devices, which create more traffic than traditional mobile phones and are nearly always connected. Without BreakingPoint, we couldn’t have confidence in our real-world performance metrics.
A few weeks ago I sat down and moderated a conversation with BreakingPoint's Dennis Cox and Brent Cook on IPv6. We spent nearly an hour talking about different aspects of IPv6, including removing some of the hype, security concerns and some of the vast benefits that can be had from IPv6. Not surprisingly, IPv6 was a big topic at RSA Conference two weeks ago, particularly in conversations I had with people on the floor. In many cases they had the same questions we discussed during the webcast.
I thought it would be helpful to provide the video of the IPv6 conversation with Dennis and Brent. I've put together the video with the slides below, enjoy:
This morning on the floor of RSA Conference BreakingPoint unveiled the BreakingPoint Resiliency Score™, a new approach to objectively measure the resiliency of network and security equipment, putting an end to data sheet speculation. We've all been there, of course, reading a product data sheet that provides data on performance and security of a piece of network or data center equipment. But, we have all reached the point where we basically ignore much of this data.
The reason isn't that the information is fictitious; it is simply not based on real-world scenarios. BreakingPoint is all about real-world simulation, as anyone who reads this blog regularly knows. The BreakingPoint Resiliency Score takes this ability to simulate real-world applications, real-time security strikes and maximum load to provide an objective, repeatable and scientifically measured certification of the performance, security and stability of any network or network device.
The press release that went out this morning had a great quote from BreakingPoint CTO Dennis Cox. It summarizes why this is important for all of us:
“Certification for performance and security is nothing new; in fact, we have come to expect it for everything from our phones to our automobiles. Yet network equipment, which supports our businesses and governments, has no standardized certification for performance and security. Instead we rely on statements made in product marketing literature, which are based on best-case scenarios, not real-world truths. Organizations want measurable answers, not assurances, when it comes to cybersecurity. The BreakingPoint Resiliency Score cuts through all the speculation and confusion and uses a scientific methodology that provides a deterministic and repeatable certification of any vendor claim.”
If you are at RSA Conference stop by booth 1356 and we would gladly show you how Resiliency Score works.