BreakingPoint Labs

Testing and Validation of Network Security Devices

While catching up on security news and blogs the other day, I came across a blog post from ICSA Labs entitled "Why a Test Lab Needs to be Wary of Commercial Exploit Packet Captures" and thought that it would be a good conversation starter to inform our readers about how BreakingPoint approaches developing test cases for security device testing, our methodology behind why we develop our test cases the way we do, and the thought processes and conclusions behind those decisions.

First, it's important to note that ICSA's blog post is primarily talking about test tools that replay packet captures as their security tests. While the BreakingPoint devices do provide a packet capture replay component, this component is not what we use for security testing. The BreakingPoint devices provide a dedicated security component that executes packaged attacks targeting individual vulnerabilities, which we call "strikes". Strikes are not packet captures, and we'll discuss how strikes operate and the benefits derived from them a little later in this post.

Toward the beginning of their blog post, ICSA wrote the following:

"If ICSA Labs were to use one or more exploit packet captures created elsewhere, then we would be effectively vouching for the quality and accuracy of these packet captures. But that is the problem; we cannot vouch for their quality and accuracy."

This is also one of the primary reasons that we do not use packet captures of attack traffic that we have come across in our research. However, we take it one step further and don't even use packet captures created in-house. We simply don't use packet captures for security testing at all, which brings me to the first subject I'd like to discuss:

Attack Realism

Let's look at what ICSA has to say on attack realism using third-party packet captures:

"ICSA Labs does not know whether the code for each would-be exploit actually works as expected. Even if it did work, we cannot confirm that the would-be exploit was run against a vulnerable system when the capture was made. And assuming it was a working exploit that was run against a vulnerable system, we do not know whether the attack succeeded when the packet capture was made. Also, information in the commercial tool typically indicates at which vulnerability each exploit packet capture is aimed. But again, a test lab has no reasonable way to confirm that. To use the tool in this way ICSA Labs would have to make many assumptions and essentially trust an entity outside of our control."

The BreakingPoint Labs team builds each strike by hand after performing our own analysis of the vulnerability. We have a high degree of certainty that our attacks are correct because we do this analysis and then, when possible, test the strikes against the actual vulnerable target. Then, we use these strikes (not packet captures of them) in testing performed using the BreakingPoint device. There are currently two ways to test using these strikes: passing attack traffic through an intermediary Device Under Test (DUT), and sending attack traffic directly to an endpoint DUT, which I'll cover next.

Attack Simulation

"But what happens if the vendor's IPS proxies traffic or alters the content of traffic as some IPS products do? Keep in mind that this is a replayed packet capture, not a live exploit. If the commercial tool with its packet capture of an exploit is run against an IPS that does one of these things and the IPS fails to block the attack, did the IPS really fail? Remember, the IPS modified the traffic on the line."

This is a valid concern when testing an intermediary DUT, and even more so when you're using static data from packet captures. In this scenario, our strikes act as both the attacker and target, and send the attack traffic from one port on our device, through the DUT, and back to a second port. In this way, it's really an attack "simulation" using real attack traffic because we're essentially sending traffic back to ourselves rather than a real target. Because we know what valid attack traffic looks like for each individual iteration of the strike, we know both what data we're sending and what that data should look like when we (the target) receive it. If the DUT modifies the attack traffic in transit, we consider the attack blocked, as it is no longer the attack traffic that we sent and is invalid.
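The transit check described above can be sketched in a few lines of Ruby. This is only an illustration of the idea, not BreakingPoint's implementation: `transit_verdict` and its return values are hypothetical names, and a real strike would compare traffic at the protocol level rather than as a single buffer.

```ruby
require 'securerandom'

# Hypothetical helper: classify what happened to one strike iteration.
# sent     - the exact bytes the simulated attacker transmitted
# received - the bytes the simulated target saw, or nil if nothing arrived
def transit_verdict(sent, received)
  return :blocked if received.nil?   # the DUT dropped the traffic
  return :allowed if sent.b == received.b
  :blocked                           # modified in transit, so no longer our attack
end

payload = SecureRandom.random_bytes(64)            # unique per iteration
puts transit_verdict(payload, payload.dup)         # arrived intact
puts transit_verdict(payload, payload[1..-1])      # first byte stripped by the DUT
puts transit_verdict(payload, nil)                 # never arrived
```

Treating any modification as a block is the conservative choice: the sender cannot vouch for traffic it did not generate.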

"One-Arm" Strikes

"If the IPS vendor cannot reproduce the issue reported to them by the test lab, then the test lab should be able to confirm its findings in some way. But minus the real attack and actual vulnerable system, that is either a very tall order or impossible!"

Once again, we're in total agreement here, which is why we use real attacks. To the extent possible, strikes that target servers can be run in "one-arm" mode where, rather than passing attack traffic through a DUT and back to ourselves, the traffic is sent to the DUT as the attack's target server. In this mode, strikes can be used to actually trigger vulnerabilities on actual vulnerable systems. This is what test houses such as NSS that use BreakingPoint devices do to verify that the test cases they are using are indeed valid, even though they are provided by BreakingPoint, their vendor.

Custom Strikes

What if BreakingPoint doesn't have a strike for the vulnerability you want to test? Or what if, like ICSA, you don't trust third party content at all? Even though BreakingPoint provides you with real attacks packaged as strikes, users can easily develop their own strikes. I won't cover this topic in any detail here, as we've already had a three part series (1, 2, 3) on this subject posted to the blog.

Strike Development Goals

  1. Trigger Just the Vulnerability & Use Unidentifiable Payloads

    One of the most frequently raised concerns about our strikes is that they contain no active payloads or executable shellcode. This is by design. Sure, network security devices often have filters for well-known shellcode and common payload encoders, and we have specific strike categories to test those specific cases. However, if you are relying on your IPS detecting these to protect you from actual vulnerabilities, then you have already failed. Most network security devices are reactive in nature: in order to detect a particular shellcode or payload encoder, a device must first be aware of it and/or have a filter for it. We know there are payload encoders and shellcode out there that devices are unaware of, so we simulate this by using completely random data as our payloads. This forces the DUT to identify attacks based on the properties of the vulnerability, not by relying on detecting known shellcode or a decoder stub from an encoded payload. We focus entirely on triggering the vulnerability, not actually exploiting it with an operational payload.

  2. Randomness & Uniqueness on the Wire

    "ICSA Labs is unwilling to risk its reputation and the trust of end users through the use of packaged exploit packet captures in its testing. All of the exploit packet captures we use in network IPS testing were captured here in the lab by our experts. And in ALL cases, we are in a position to verify our coverage protection test results by running the real, live attack against the actual vulnerable system."

    The problem with ICSA's approach here is that you're initially still testing with static packet captures. Consider the scenario where you replay your packet capture of a malicious TIFF file traversing the wire. The IPS under test blocks it, and you mark that as a success. How do you know that if some unrelated parts of the TIFF file are modified, the IPS won't miss it? How do you know that if you add a whole lot of padding or superfluous structure to the file and move the evil from the beginning of the file to beyond the padding, the IPS won't miss it? If you're initially relying on packet captures of static attack traffic and then only breaking out the real exploits and targets when something seems amiss or a customer questions your tests, you're not being thorough in your testing.

    BreakingPoint's approach to providing these various attack permutations is to identify all of the components of the attack that are absolutely essential for the attack to work and trigger the vulnerability. We identify these values and their upper and lower bound thresholds, as well as behavioral protocol and process interactions and which combinations and permutations of these are valid. We then develop our strikes to randomize these properties as much as possible while still conforming to the identified valid parameters. Further, we randomize as much other data as possible that is not directly related to triggering the vulnerability while still remaining valid for whatever protocol, file format, or other data structure is being used in the attack. All of this context information, and the flexibility provided by dynamic test cases such as strikes as opposed to packet captures, is the benefit we get from performing the vulnerability analysis ourselves, understanding the operational bounds of the data involved, and developing strikes that launch attacks that actually utilize that knowledge. You can read more about this subject in one of my previous blog posts, File Format Vulnerabilities and Dynamic Exploit Generators.

  3. Evasions

    To further the previous point, BreakingPoint can optionally also mutate attack traffic by employing various evasion techniques. When you combine evasion techniques such as IP fragmentation with fragment reordering, various text encoding methods, and HTTP chunked encoding transmission, among others, with the randomization of the attack traffic that we are already performing as outlined in the previous section, nearly endless permutations of a single attack are dynamically generated, something static packet captures simply can't compete with. Forgive me for quoting a deodorant commercial, but "anything less would be uncivilized." For much more in-depth information on the subject of evasions, please see our recent webcast entitled Harden Security Devices Against Increasingly Sophisticated Evasions or this previous blog post on the subject.
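To make the combination of randomization and evasion a little more concrete, here is a small Ruby sketch. It is not how strikes are built; `ESSENTIAL`, `permutation`, and the request body layout are made-up stand-ins. It keeps one "essential" portion of a body constant, randomizes the data around it, and then applies HTTP chunked transfer coding with random chunk sizes so that no two permutations look alike on the wire.

```ruby
require 'securerandom'

# Encode a body with HTTP/1.1 chunked transfer coding, using a random size
# (1..max_chunk bytes) for every chunk.
def chunk_encode(body, max_chunk = 8)
  body = body.b
  out = String.new                      # empty binary string
  pos = 0
  while pos < body.bytesize
    chunk = body.byteslice(pos, 1 + SecureRandom.random_number(max_chunk))
    out << chunk.bytesize.to_s(16) << "\r\n" << chunk << "\r\n"
    pos += chunk.bytesize
  end
  out << "0\r\n\r\n"                    # zero-length chunk terminates the body
end

# Reassemble a chunked body, as a receiving parser would.
def chunk_decode(wire)
  wire = wire.b
  body = String.new
  pos = 0
  loop do
    eol = wire.index("\r\n", pos)
    size = wire[pos...eol].to_i(16)
    break if size.zero?
    body << wire.byteslice(eol + 2, size)
    pos = eol + 2 + size + 2            # skip chunk data and its trailing CRLF
  end
  body
end

# Stand-in for the part of an attack that must survive every permutation.
ESSENTIAL = 'field=' + 'A' * 40

# One permutation: random filler of random length, then the essential part.
def permutation
  filler = SecureRandom.hex(2 + SecureRandom.random_number(8))
  chunk_encode("pad=#{filler}&#{ESSENTIAL}")
end

3.times { puts permutation.inspect }
```

Every call to `permutation` yields different bytes on the wire, yet decoding each one always recovers a body that ends with the same essential portion, which is the property a static capture cannot offer.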

Conclusion

I hope you enjoyed this look into the BreakingPoint strike development and security device testing mindset and found the information both useful and enlightening. Please do follow some of the links above as there is much more information available about the topics discussed.

Tags: ddos and botnet simulation // tech talk // custom applications and attacks // ids ips // virus and spam filters // blog post // unified threat management // security updates // firewalls //

Truth in Testing: Syslog AppSim

In today's Truth in Testing article, I'll be discussing one of our Application Simulators (AppSims) which can be used to generate realistic Syslog traffic.  The BSD Syslog Protocol, which is documented in RFC 3164, describes a transport to allow networked systems to send event notification messages across the network to one or more message collectors called syslog servers.  Syslog uses UDP destined for port 514 as its underlying transport.  It's a fairly simple protocol, but it does have some interesting bits and should make for an easily digestible example of realism in AppSim generated network traffic.

A typical local syslog message that shows up in one or more system logs looks something like this:

Mar  9 10:07:21 millstone dtrammell: This is a test message.

Here you have a time-stamp, a hostname (millstone), an entity logging the message (me), and the actual log message.  This message was generated with the "logger" command via my workstation's shell, however normally you would have an application or process creating log entries rather than users.  Syslog messages that are sent across the network to a syslog server look a bit different as they need to convey additional information from the source system.  RFC 3164 describes the overall message format:

  The full format of a syslog message seen on the wire
  has three discernible parts. The first part is called 
  the PRI, the second part is the HEADER, and the third
  part is the MSG. The total length of the packet MUST 
  be 1024 bytes or less. There is no minimum length of 
  the syslog message although sending a syslog packet 
  with no contents is worthless and SHOULD NOT be transmitted.

It is recommended that if you have not previously read Sean's blog post about our back-end BlockLib application protocol construction library, you really should go do that now, as much of the following discussion makes extensive use of the data construction concepts described in that post.

These three message parts identified by the RFC represent the highest-layer data Blocks of the syslog message, other than of course the message's root container Block.  Each of these parts of the syslog message has its own formatting and semantics and is thus further composed of sub-Blocks.  Without going into too much detail, this logical segmentation of the message data into progressively more granular chunks is important because it allows us to place constraints on the data, as granularly as we like.  This is how the AppSim both generates realistic data when used to produce randomized syslog messages and generates useful test cases when used as a fuzzer.

A quick example of where these constraints are useful is in placing a maximum 1024 byte constraint on the root message container block.  As the excerpt from RFC 3164 above indicates, the total length of any syslog message must be 1024 bytes or less; thus, we shouldn't generate a randomized syslog message any longer than that, unless of course we're fuzzing.

The PRI part is the first part of the message and essentially indicates the message's priority.  On the wire, it is three, four, or five bytes long and is formatted as a left angle bracket ('<') character followed by a number, followed by a right angle bracket ('>') character.  These characters must be 7-bit ASCII in an 8-bit field.  The number between the angle brackets represents the Priority value, must be one, two, or three digits, and is derived from both the Facility and Severity values.  The Facility describes the type of message whereas the Severity indicates its importance.  How the Priority value is derived from these two values is described by the RFC as:

  The Priority value is calculated by first multiplying
  the Facility number by 8 and then adding the numerical
  value of the Severity. For example, a kernel message
  (Facility=0) with a Severity of Emergency (Severity=0)
  would have a Priority value of 0. Also, a "local use 4"
  message (Facility=20) with a Severity of Notice (Severity=5)
  would have a Priority value of 165.
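The RFC's arithmetic is easy to check in Ruby. The sketch below is just the formula from the excerpt; the symbol names in `FACILITY` and `SEVERITY` are my own shorthand (only a subset of facilities is listed), and `pri_part` is a hypothetical helper name.

```ruby
# Numerical codes from RFC 3164; symbol names are illustrative shorthand.
FACILITY = { kern: 0, user: 1, mail: 2, daemon: 3, auth: 4, local4: 20 }.freeze
SEVERITY = { emerg: 0, alert: 1, crit: 2, err: 3,
             warning: 4, notice: 5, info: 6, debug: 7 }.freeze

# Priority = Facility * 8 + Severity
def priority(facility, severity)
  FACILITY.fetch(facility) * 8 + SEVERITY.fetch(severity)
end

# The PRI part as it appears on the wire.
def pri_part(facility, severity)
  "<#{priority(facility, severity)}>"
end

puts pri_part(:kern, :emerg)      # prints "<0>"
puts pri_part(:local4, :notice)   # prints "<165>"
```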

All of these details are important to outline here as they demonstrate just how many constraints can be placed on, and just how complicated it can be to properly generate, a simple three to five byte field.  For randomized data used as the PRI part to be realistic, it must at least be compliant with the specification.  The PRI Block that the AppSim uses to generate this data is thus itself limited to a minimum of 3 bytes, a maximum of 5 bytes, and is restricted to a character set which includes only the two angle brackets and numeric digit ASCII characters.  The PRI block is further composed of three sub-Blocks, one for the '<' character, one for the Priority value, and a third for the '>' character.  Each of these three sub-Blocks also has its length and character set appropriately constrained.

The Priority Block is a type of Block referred to as an encoding Block, as it performs an operation on its input or sub-Blocks to generate output that is more than simply a concatenation of its sub-Blocks.  Thus, the Priority Block makes use of two sub-blocks itself, one for Facility and one for Severity, but these are not actually included in the block tree; they are simply source material for the encoder Block.  The Facility and Severity Blocks are the two most granular Blocks used in the PRI part, and are essentially just copies of the unsigned 8-bit integer primitive Blocks with a constraint on what their randomized values are allowed to be.  The RFC provides a list of designated Facility and Severity values which are used here as the legitimate values constraint for these Blocks.
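BlockLib itself is not shown in this post, so the following Ruby sketch does not use its API. It only mimics the idea with plain methods: two constrained random inputs (Facility and Severity) feed an "encoder" that emits the Priority digits, and the finished PRI is checked against the length and character set constraints described above. `random_pri` and the constant names are hypothetical.

```ruby
# Legitimate value constraints taken from RFC 3164.
FACILITIES = (0..23).to_a.freeze
SEVERITIES = (0..7).to_a.freeze

# Constraints on the finished PRI part: '<', one to three digits, '>'.
PRI_FORMAT = /\A<\d{1,3}>\z/

def random_pri(rng = Random.new)
  facility = FACILITIES.sample(random: rng)   # source material only,
  severity = SEVERITIES.sample(random: rng)   # never sent in native form
  pri = "<#{facility * 8 + severity}>"        # the "encoding" step
  unless pri.match?(PRI_FORMAT) && pri.bytesize.between?(3, 5)
    raise "PRI violates constraints: #{pri}"
  end
  pri
end

5.times { puts random_pri }
```

Because both inputs are constrained before encoding, every generated value falls between `<0>` and `<191>` without any after-the-fact filtering.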

You may be considering that this is entirely over-engineered for generating a simple syslog message, especially since thus far I have only detailed part one of three which comprise a complete syslog message.  If this Block-tree were used simply for realistic data generation, you'd probably be right.  Our AppSims that are built with BlockLib however can be dual-purposed as fuzzers, and all of this meta-data is extremely important when the AppSim is used to generate fuzzing test cases, the details of which I will cover in a subsequent blog post specifically on the topic of Syslog fuzzing.

The second part of the syslog message is the Header, which consists of a specially-formatted time-stamp, a space character, a hostname, and another space character.  There's not really any fancy formatting or derived values here other than various length constraints; the RFC describes the time-stamp format in a human-readable form and the hostname is simply the name of the host that generated the syslog message, which is notably not necessarily the same host that is currently sending the message over the network.

The final part of the syslog message is the MSG, which consists of a Tag and the actual log message Content.  The Tag is the name of the program or process that generated the message limited to between 1 and 32 bytes long, inclusive.  The Content is a free-form text message, however it is commonly prefixed with the process ID (PID) related to the Tag value in the form of a left bracket ('['), the PID value, a right bracket (']'), and a colon (':').
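Putting the three parts together, a randomized but specification-shaped message can be assembled roughly as follows. This is a plain-Ruby sketch under my own assumptions, not the AppSim's logic: the hostname and Content length limits, the letters-only character set, and the `random_syslog` name are all arbitrary choices for illustration. The 1 to 32 byte Tag limit and the 1024 byte ceiling come from the text above.

```ruby
MAX_PACKET = 1024
LETTERS = [*'a'..'z', *'A'..'Z'].freeze

def random_text(length, rng)
  Array.new(length) { LETTERS.sample(random: rng) }.join
end

def random_syslog(rng = Random.new, now = Time.now)
  # PRI: Facility * 8 + Severity inside angle brackets
  pri = "<#{rng.rand(0..23) * 8 + rng.rand(0..7)}>"

  # HEADER: timestamp, space, hostname, space ("%e" space-pads the day)
  header = "#{now.strftime('%b %e %H:%M:%S')} #{random_text(rng.rand(1..12), rng)} "

  # MSG: Tag (1..32 bytes) followed by "[PID]: " and free-form Content
  tag = random_text(rng.rand(1..32), rng)
  content = "[#{rng.rand(1..65_535)}]: #{random_text(rng.rand(1..64), rng)}"

  # The root container constraint: never exceed 1024 bytes
  "#{pri}#{header}#{tag}#{content}".byteslice(0, MAX_PACKET)
end

puts random_syslog
```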

The end result here is that the Syslog AppSim can generate randomized but still specification compliant syslog messages that appear realistic on the wire:

<19>Aug 05 10:45:50 xtics uIhTvrqbyVYMDFgBt[26]:qmxQ pWZ 3644 33Jz YdA uR H D33Kic yEgw

Of course while it is specification compliant and therefore realistic to a syslog protocol parser, interpreter, or server, it is obviously nonsensical to a human observer.  When testing these types of technologies, generated data similar to that shown above is usually fit for purpose, however in other cases more control and specificity is required.  The granular way in which the Syslog AppSim generated this message provides a number of customization options for the user who would like to fine-tune a completely randomized message like the one shown above, or create their own entirely new syslog messages.  Below are two screen-shots of the Syslog AppSim's "Flow" settings and the Syslog Message action's settings:

Syslog Flow Settings

Syslog Action Settings

As you can see, nearly all aspects of the Syslog message can be customized via these settings, from the log message's source hostname all the way down to the individual facility and severity, which incidentally don't even show up in their native form on the wire.  If, for example, you wanted to create a batch of syslog traffic that appeared to be terrorist identification alerts from a hypothetical network monitoring system called "carnivore", you would set the Tag setting to "carnivore", create a few syslog message actions conveying benign log entries and an alert log entry, and set all of their facility and severity values appropriately:

<86>Mar 09 15:20:42 lGvbIs carnivore[867]: No one here but us chickens...
<86>Mar 09 15:20:40 204.172.15.189 carnivore[6456]: Situation Normal
<81>Mar 09 15:20:47 KXMknh carnivore[29]: Terrorist Detected!!!
<86>Mar 09 15:20:36 47.127.243.250 carnivore[84]: These aren't the droids you're looking for.

Sample pcap files for further inspection are available for both the randomized syslog messages generated by the Syslog AppSim's message action using default settings, as well as my "carnivore" example using customized settings.
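Since the facility and severity never appear in native form on the wire, it can be handy to decode them back out of a captured message. The following Ruby sketch does that for messages shaped like the "carnivore" examples above; the regular expression and the `parse_syslog` name are my own and are only as strict as these samples require.

```ruby
# PRI, timestamp, hostname, Tag, optional [PID], then Content.
SYSLOG_LINE = /\A<(\d{1,3})>(\w{3} [ \d]\d \d{2}:\d{2}:\d{2}) (\S+) ([^\[:]+)(?:\[(\d+)\])?: ?(.*)\z/m

def parse_syslog(line)
  m = SYSLOG_LINE.match(line)
  return nil unless m
  pri = m[1].to_i
  {
    facility:  pri / 8,     # integer division undoes the multiply by 8
    severity:  pri % 8,     # remainder is the Severity
    timestamp: m[2],
    host:      m[3],
    tag:       m[4],
    pid:       m[5]&.to_i,
    content:   m[6]
  }
end

p parse_syslog('<86>Mar 09 15:20:40 204.172.15.189 carnivore[6456]: Situation Normal')
p parse_syslog('<81>Mar 09 15:20:47 KXMknh carnivore[29]: Terrorist Detected!!!')
```

Both messages decode to Facility 10; the benign entry carries Severity 6 (Informational) and the alert entry Severity 1 (Alert).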

Tags: tech talk // blog post //

White Paper: Simulating Distributed Denial-of-Service with BreakingPoint

Today we have released a new white paper that I've been working on entitled "Simulating Distributed Denial-of-Service with BreakingPoint".  This paper describes how to configure your BreakingPoint product's Network Neighborhood to simulate the traffic profile normally associated with a DDoS attack and then outlines a number of DDoS attack scenarios.  I've also provided a link below to a packaged version that includes product test cases to simulate the scenarios described in the paper.

Of the scenarios presented, there are several recent real-world analogies.  For example, the group of HTTP scenarios in the paper are similar in nature to multiple DDoS attacks that were recently launched simultaneously against our very own HD Moore's Metasploit Project website, alongside other information security and hacking related websites. You can read his ongoing commentary from during and after the attacks on the Metasploit Blog, beginning with this post.

The last scenario discussed in the paper is one of my all-time favorite DDoSes from when I was focusing a lot of my research efforts within the scope of VoIP systems and technologies.  I regularly employed the tactic outlined by this scenario to demonstrate how a DDoS attack can effectively fly under most network security devices' radar by avoiding the usual DDoS traffic model and by shifting the target of the attack from the technology itself to elsewhere.  I won't ruin the surprise here, you'll have to download the paper to find out what I'm talking about...

Finally, some of the test cases were created via scripting within the BreakingPoint TCL interface, so the paper also provides an introduction to that topic as well as the TCL scripts themselves.  Todd Manning has recently been blogging here on this topic, the posts for which you can find by browsing this blog using the "tcl" tag.

We invite you to take a look at the paper, which can be found here (PDF).  The package which includes the paper as well as supporting materials such as test cases and TCL scripts can be found here.

Tags: ddos and botnet simulation // blog post // voip //

Ruby String Processing Overhead

Just prior to joining BreakingPoint I taught myself enough Ruby to, with considerable help from HD and Matt Miller, implement a proof-of-concept of the research that I was to present at ToorCon 9 in the Metasploit Framework.  Shortly after that I joined BreakingPoint Labs, and was thrown head-first into the world of Ruby.  One thing I noticed early-on was that strings were frequently represented using both single as well as double quotes, without much reason as to why one was chosen over the other.  Once I had moved past the simple tutorials and begun getting into more complex code, I found out that in Ruby, strings that are contained in double-quotes are processed for escape sequences and variable interpolation whereas strings contained in single-quotes are literal.  For example, if you have a variable in Ruby, you can have the value interpolated into a string:

irb(main):001:0> name = 'Dustin'
=> "Dustin"
irb(main):002:0> puts "Hi, my name is #{name}.\n" # processed string
Hi, my name is Dustin.
=> nil
irb(main):004:0> puts 'Hi, my name is #{name}.\n' # literal string
Hi, my name is #{name}.\n
=> nil

Being primarily a C programmer, I had a preconceived notion about double versus single quotes, as in C double quotes are used for a string and single quotes are used for a character.  What originally confused me, however, was that many times I would see strings contained in double-quotes with nothing included in them that would cause interpolation or get interpreted when processed, essentially creating a string literal using the interpolation string construct method.  I asked HD about this, and he went on a rant about how, when migrating Metasploit from Perl to Ruby for version 3.0, one of the other Metasploit developers would always complain about his use of double-quotes where not needed and claim it was a performance hit.  HD maintained that there was no significant difference in the overhead between the two construct methods, but the other developer's point makes logical sense; if the string doesn't need to be processed for interpolation, it would be expected that fewer instructions would be called to handle that string.  Since this seemed logical, I made it a point to use single quotes for strings unless I actually needed the string's content to be processed, but lately this question has been nagging at the back of my mind and the scientist in me demanded proof.

To prove whether or not there really was any detectable, and more importantly, significant performance hit, I wrote a little Ruby script to test this.  The script first defines a string constant to use for the tests.  It then builds two test cases using the String object constructor, one for each construct method (literal versus processed), consisting of Ruby code that simply creates a String object using the constant for initialization.  The script uses the better-benchmark wrapper for rsruby (a Ruby interface to R), to measure the amount of time it takes to execute each test case via the eval() function.  This test performs 10 passes of 200,000 instances of each test case and analyzes the timings using the Wilcoxon signed-rank test.  In order to run these tests as accurately as possible I found the most under-utilized and idle system that I could so that there was less external influence on the tests and timings from other processes executing on the same system.  This happened to be a freshly installed Ubuntu 8.04 system with very little running on it.
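The original script is not reproduced here, so what follows is only a rough reconstruction of its shape. It swaps in Ruby's standard `Benchmark` module for better-benchmark and rsruby, and therefore leaves out the Wilcoxon signed-rank analysis entirely. The names `LITERAL_CASE`, `PROCESSED_CASE`, and `time_case` are mine, and I am assuming that each instance is evaluated individually with `eval`.

```ruby
require 'benchmark'

# Ruby source for the two test cases: the same one-character value, built
# once from a literal (single-quoted) and once from a processed
# (double-quoted) string.
LITERAL_CASE   = %q{String.new('a')}
PROCESSED_CASE = %q{String.new("a")}

# Time `iterations` evaluations of a test case, once per pass.
# Returns an array with one wall-clock timing (in seconds) per pass.
def time_case(source, passes: 10, iterations: 200_000)
  Array.new(passes) do
    Benchmark.realtime { iterations.times { eval(source) } }
  end
end

def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Small numbers so the demo finishes quickly; the test described above
# used 10 passes of 200,000 instances.
literal   = time_case(LITERAL_CASE,   passes: 3, iterations: 1_000)
processed = time_case(PROCESSED_CASE, passes: 3, iterations: 1_000)
printf("literal   median: %.6fs\n", median(literal))
printf("processed median: %.6fs\n", median(processed))
```

Comparing medians is a crude substitute for a proper significance test, so differences of a few percent from this sketch should not be read as meaningful on their own.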

This script produced completely unexpected results; the literal string initializations were actually slightly less efficient than their processed counterparts by about 2.6%, and R deemed this difference to be statistically significant.

These initial measurements were taken using a string of a constant length, the smallest length possible (one character) in an attempt to directly measure the overhead aggregate of the object constructor itself and the difference in processing overhead between the two different initialization methods.  Due to the initial results actually being the opposite of what I had expected, I also wanted to test if the results changed depending on the length of the initialization string value, so I then wrote a second script. The second script is similar to the first, however instead of using the same one-character initialization string for each test, the initialization string values instead increased in length, between 1 and 2000 bytes in 100 byte increments.
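A length sweep along these lines might look like the sketch below. As with the first script, this is a reconstruction using the standard `Benchmark` module rather than the original code. I am reading "between 1 and 2000 bytes in 100 byte increments" as the lengths 1, 100, 200, and so on up to 2000, which is an assumption, and `sources_for` and `sweep` are hypothetical names.

```ruby
require 'benchmark'

# Assumed interpretation of the length schedule: 1, 100, 200, ..., 2000.
LENGTHS = [1, *100.step(2000, 100)].freeze

# Build the Ruby source for both test cases at a given string length.
def sources_for(length)
  value = 'a' * length
  {
    literal:   "String.new('#{value}')",
    processed: "String.new(\"#{value}\")"
  }
end

# Time both construct methods at every length.
# Returns [[length, { literal: seconds, processed: seconds }], ...].
def sweep(iterations: 1_000)
  LENGTHS.map do |length|
    timings = sources_for(length).transform_values do |source|
      Benchmark.realtime { iterations.times { eval(source) } }
    end
    [length, timings]
  end
end

sweep.each do |length, t|
  printf("%4d bytes  literal %.5fs  processed %.5fs\n",
         length, t[:literal], t[:processed])
end
```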

The second script produced slightly more expected results.  Processed strings still had a statistically significant performance benefit up to about 600 byte strings.  The difference was statistically insignificant only at around 600 to 700 bytes, and became significant again in the literal method's favor at somewhere between 700 and 800 bytes and beyond.

The important point to note is that String object constructors initialized with literal strings only seem to provide a performance benefit on longer strings, and using processed strings seems to provide the performance benefit when using shorter strings. This means that there is a measurable string length threshold at which using one or the other initialization method becomes statistically significant for your project, and a window of length sizes within which it doesn't really matter all that much which method you use.  Since most programmers are unlikely to have strings of any significant length directly in their code, it would appear that using double-quotes in normal practice would be the preferred method for achieving better performance.

For some further reading on the performance difference between string variable interpolation versus using string append and concatenation methods, I found this blog post to be interesting.

Tags: tech talk // blog post //

Automated Protocol Reverse Engineering

At BreakingPoint Labs, we are not only tasked with creating the exceptional content that you've come to know and love for the BreakingPoint Security component, but we're also tasked with creating the content for the AppSim component.  This content takes the form of individual Application Simulators (AppSims), one per application protocol.  These individual AppSims are essentially each their own little sub-component, generating realistic network traffic for the application protocol in question (as well as fuzzed traffic in the near future, stay tuned).  These components generate both the client and server side of the connection, and when played to the wire from the BreakingPoint appliance at really fast speeds, provide the BreakingPoint's load testing payload.

During the development of AppSims we come across the occasional undocumented or proprietary network protocol, usually during our response to supporting specific customer requests, at which point developing the AppSim becomes a little more interesting than the usual routine of poring through the protocol specification coupled with observing real systems using the protocol to determine implementation or platform nuance.  When trying to implement an AppSim for an unknown protocol, a specification to work from is simply a luxury that you don't have.  It's at this time that protocol reverse engineering comes into play.

Protocol Reverse Engineering

Protocol reverse engineering is traditionally a task done by hand, aided by your favorite network analyzer or packet sniffer.  Text-based protocols like HTTP, SIP, and SMTP that are human-readable on the wire don't require much more than the manual methods.  Packet field boundaries and field groupings are generally easily identified by common delimiter sequences such as carriage returns (0x0d), line feeds (0x0a), CRLFs (0x0d0a), white-space characters, colons, semicolons, slashes, backslashes, pairs of parentheses and brackets, and so forth.  Text protocol packets, generally being human-readable, can usually be reversed to a fairly accurate packet format without much effort.
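As a small illustration of how far delimiters alone can take you with a text protocol, the Ruby sketch below carves an HTTP-style request into fields using nothing but CRLFs, spaces, and colons. The `split_fields` helper and the sample request are made up for this example.

```ruby
# Split a text protocol message into fields using only common delimiters.
def split_fields(message)
  head, body = message.split("\r\n\r\n", 2)   # blank line ends the header block
  lines = head.split("\r\n")                  # CRLF separates lines
  start_line = lines.shift.split(' ')         # spaces separate start-line tokens
  headers = lines.map { |line| line.split(/:\s*/, 2) }   # colon separates name and value
  { start_line: start_line, headers: headers, body: body.to_s }
end

sample = "GET /index.html HTTP/1.1\r\nHost: example.com\r\nAccept: */*\r\n\r\n"
fields = split_fields(sample)
p fields[:start_line]   # prints ["GET", "/index.html", "HTTP/1.1"]
p fields[:headers]      # prints [["Host", "example.com"], ["Accept", "*/*"]]
```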

Once you enter the realm of binary protocols however, this discipline becomes a whole different ball game.  Binary protocols, not being meant to be read by humans but rather exclusively by machines that already know the protocol's packet structure, have none of the grammar and syntax structure of text protocols.  It is due to this lack of bloat that binary protocols are often preferred for systems that require greater throughput and less latency, because binary protocol packets are generally much smaller than comparable packets found in text protocols.  Since the reverse engineer has no such clues to identify packet field groupings and individual field boundaries, much more attention must be paid to the overall collection of packets found in a protocol session and the differences between them.  Iterator fields such as sequence numbers can often be given away by their behavior as the suspected field is tracked from packet to subsequent packet.  As it iterates in value, it is easily recognizable, but questions may still arise.  For example, if the session is short and the iterator is low in value and preceded by a number of zero bytes, what size is the actual iterator field?  Is it only the two bytes that are seen incrementing in value, suggesting a 16-bit field, or does it include the preceding two zeroed bytes, making it a 32-bit field?  Unless you have a way to influence this value directly through the software whose behavior employing this protocol is being analyzed, or can cause the software to communicate with much longer sessions so as to observe the iterator value wrap at its upper bound or continue iterating into its next preceding zeroed byte, you may never really know.  The two preceding zeroed bytes could very well be a two byte reserved field which is meant to always be zero.  It is these types of questions, among many others, that arise when attempting to reverse engineer a binary protocol.
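The iterator ambiguity is easy to reproduce. The Ruby sketch below scans a handful of made-up packets for big-endian fields that increase by exactly one from packet to packet. The packet layout, the `0xBEEF` trailer, and the helper names are invented for the example.

```ruby
# Read a big-endian unsigned integer of `width` bytes at `offset`.
def field_value(packet, offset, width)
  packet.byteslice(offset, width).bytes.inject(0) { |acc, byte| (acc << 8) | byte }
end

# Return every [offset, width] whose value grows by exactly 1 per packet.
def counter_candidates(packets, widths = [2, 4])
  shortest = packets.map(&:bytesize).min
  widths.flat_map do |width|
    offsets = (0..shortest - width).select do |offset|
      values = packets.map { |packet| field_value(packet, offset, width) }
      values.each_cons(2).all? { |a, b| b == a + 1 }
    end
    offsets.map { |offset| [offset, width] }
  end
end

# Four packets: two zero bytes, a 16-bit counter, and a constant trailer.
packets = (1..4).map { |seq| [0, seq, 0xBEEF].pack('nnn') }
p counter_candidates(packets)   # prints [[2, 2], [0, 4]]
```

Both a 16-bit field at offset 2 and a 32-bit field at offset 0 fit the observed data equally well, which is exactly the question a short session cannot answer.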

In my research into this discipline I have come across a number of techniques for automating the task of protocol reverse engineering.  No one solution offers a 'silver bullet' that magically produces a protocol specification of an unknown protocol, but various automated techniques combined with manual processes can come rather close to this lofty goal if employed against a large data set of protocol traffic and with an appropriate amount of pre-processing of that data set.

Protocol Informatics

One tool that I've added to my protocol reversing toolbox is PI, the prototype reference implementation for the Protocol Informatics Project by Marshall Beddoe.  The tool is now a bit dated, but does work well in some situations.  The general idea of Protocol Informatics is to apply bioinformatics algorithms to network traffic.  The algorithms perform sequence alignment on a series of packet samples to better understand the underlying structure, similar to the way relationships between two sequences of genetic information, such as DNA or amino acids, are analyzed.

First, the Smith-Waterman algorithm is used to sort packets from the data set into groups of comparable packets based on a similarity score.  Then, the Needleman-Wunsch algorithm is applied to the packets within each group to globally align them and attempt to identify packet format structure through static values and differences, as well as variable-length fields, which are revealed by where gaps had to be inserted into various packets in order to align them.  You can find the full whitepaper detailing this technique via the Protocol Informatics Project website.
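As a rough illustration of the global alignment step, here is a minimal Needleman-Wunsch sketch that aligns two byte strings and marks inserted gaps with None.  It is not PI's code: the scoring values (match +1, mismatch -1, gap -1) and the sample messages are assumptions chosen to keep the example small.

```python
def needleman_wunsch(a: bytes, b: bytes, match=1, mismatch=-1, gap=-1):
    """Globally align two byte strings; gaps are returned as None."""
    n, m = len(a), len(b)

    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Walk back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append(None); i -= 1
        else:
            out_a.append(None); out_b.append(b[j - 1]); j -= 1
    out_a.reverse()
    out_b.reverse()
    return score[n][m], out_a, out_b


def render(aligned):
    """Show aligned bytes as characters, with '_' where a gap was inserted."""
    return "".join("_" if x is None else chr(x) for x in aligned)


# Hypothetical messages that differ only in the length of one field.
msg_a = b"GET /a HTTP"
msg_b = b"GET /abc HTTP"
total, aligned_a, aligned_b = needleman_wunsch(msg_a, msg_b)
print(render(aligned_a))
print(render(aligned_b))
```

The gaps that appear in the shorter message mark where a variable-length field sits, while the columns that match across both messages are candidates for static values.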

Because this technique essentially relies on identifying similarities and differences between individual packets in a group of similar packets, it works well against small binary protocols with tightly packed data structures and few wasted bits, such as ICMP.  It also works fairly well against text-based protocols, as they contain many instances of static data such as header field names and delimiters.  Where this approach does not work well is against larger binary protocols that have a lot of wasted space in them, such as empty fields reserved for future use or large fields used for small values, for example a 32-bit integer field used for a boolean value which will only ever be 0 or 1.  When faced with these large swaths of zeroed bytes, it is extremely difficult to tell where the real field boundaries are.
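The zeroed-bytes problem shows up even in a very simple per-offset comparison.  The sketch below labels each byte offset across a group of equal-layout packets as static, always zero, or varying.  The packet layout in the comments is a hypothetical one invented for this example, and classify_columns is my own helper name.

```python
def classify_columns(packets):
    """Label each byte offset as 'zero', 'static', or 'varies' across packets."""
    length = min(len(p) for p in packets)
    labels = []
    for off in range(length):
        seen = {p[off] for p in packets}
        if seen == {0}:
            labels.append("zero")      # always 0x00 in every sample
        elif len(seen) == 1:
            labels.append("static")    # same non-zero value in every sample
        else:
            labels.append("varies")    # differs between samples
    return labels


# Hypothetical layout: 1-byte type, 32-bit boolean flag, 2 reserved bytes,
# 16-bit length.  All multi-byte fields are big-endian.
packets = [
    bytes([0x01, 0, 0, 0, 1, 0, 0, 0, 8]),
    bytes([0x01, 0, 0, 0, 0, 0, 0, 0, 12]),
    bytes([0x01, 0, 0, 0, 1, 0, 0, 0, 20]),
]
labels = classify_columns(packets)
for off, label in enumerate(labels):
    print(off, label)
```

Offsets 5 through 7 all come out as zero, yet in the layout I made up they belong to two different fields: the reserved bytes and the high byte of the length.  Nothing in the traffic itself reveals that boundary.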

Protocol DeBugger (PDB)

One tool that I used to find useful when working within another security discipline is the Protocol DeBugger (PDB) by Jeremy Rauch.  PDB operates like the unholy offspring of a network proxy and an application debugger.  By passing network traffic through it like a transparent network proxy you are able to set breakpoints on specific packets or events, break and inspect individual packets, modify them if desired, and then continue to send them on and proxy subsequent packets as desired.

If I recall correctly, PDB also made some cursory attempts at identifying packet structure by tracking changing values, such as those that appear to be iterating.  Unfortunately I haven't used this tool in a number of years and have so far been unsuccessful at getting it built and functioning on a current BSD system (it's integrated with and uses ipfw redirection), so I can't verify that it actually did this.  At any rate, even if it doesn't, it would still be useful for live protocol analysis, which can definitely aid in protocol reverse engineering, so I included it here for completeness.

The primary downside to this tool is that it was originally developed to aid in protocol fuzzing, and as such works on live traffic as it traverses the tool which acts as a transparent proxy.  In this manner it was used to manipulate the packets to perform fuzzing against either side of the connection.  If you're working primarily with packet captures, it takes a bit of extra effort to replay your packet captures through it.

While it seems that this tool is currently unmaintained as I wasn't able to find a current reference URL, I have a copy that I was able to obtain shortly after Jeremy Rauch's presentation on PDB at BlackHat 2006, and you can grab the same version from the web via the ever-so-useful wayback machine.  Note, this is NOT the same as this PDB, which is an entirely different tool.

Discoverer

Finally, one research project that seems promising is Discoverer from Microsoft Research.  The paper linked here claims some improvements over the technique employed by Protocol Informatics; however, the paper's author has indicated that there are no plans to publicly release any implementation code, and I haven't personally had the time to attempt an implementation from the details in the white paper.

Application Analysis

There are also a few techniques that attempt to build a protocol specification not by reverse engineering the protocol as it is seen on the wire but by both dynamic and static analysis of the software that constructs and sends that data.  Obviously this involves applying reverse engineering or debugging techniques to the software itself, and as such these techniques may run afoul of your software end-user license agreement or be outright illegal in your country.

Manual Reverse Engineering

At the end of the day you are likely to still be doing a significant amount of reverse engineering manually.  However, employing one or more of the automated tools and techniques before you start can certainly clear away some of the low-hanging fruit and give you some momentum in the correct direction.  Even beginning with a loosely defined packet structure definition is likely better than beginning from scratch with a collection of raw hex dumps of various packets.
