Putting Top Network Gear to the Test: A Step-by-Step Guide to Device Evaluations
by Chris Moore
We’ve written a lot about device evaluations lately, and with good reason: conducting bakeoffs for individual products is the first step toward building your overall IT resiliency. BreakingPoint engineers have developed best practices for bakeoffs through many client engagements in the field, and those practices are covered in detail in our new Six-Step Plan for Competitive Device Evaluations.
Test Network Devices Using Repeatable, Quantitative Principles
A good bakeoff is scientific. Since you want to make accurate comparisons among devices from competing vendors, you need to create controlled conditions that allow you to mimic the real world, but in a way that you can repeat over and over with precise variations for each device. That means generating a mix of application, attack, and malformed traffic under load, and being very careful to record the quantitative results of each test on each device. Only by embracing such a scientific methodology can you accurately validate the capabilities of today’s deep packet inspection (DPI)-enabled, application-aware devices.
As you stress each device under test (DUT), make sure to incorporate these elements:
- Controlled Variables — The goal should always be to isolate device capabilities and problem areas, which — as in any scientific investigation — requires repeating tests exactly and changing only one input at a time.
- Accurate Baselines — As a further control, each test process should include an initial run through a simple piece of fiber or copper cable to establish what the traffic looks like without intermediation by any device. Doing so creates a valid basis for comparison against the subsequent run of the same traffic through the DUT.
- Uniform Configurations — Consistent collection of data requires that devices be set up uniformly throughout the bakeoff. This means that each vendor should configure its device to match standard settings established by the purchaser, and that settings for the testing equipment should be maintained across all DUTs.
- Escalating Complexity — To achieve a comprehensive understanding of device capabilities, bakeoff testing should proceed by stages, evolving in complexity until it fully reflects the purchaser’s unique mix of traffic.
- Precision Tools — Bakeoffs must use testing tools that create precise real-world network conditions again and again and enable variables to be changed one at a time. These tools must also capture exact measurements of device behavior to enable accurate comparisons among devices.
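The "change only one input at a time" principle can be sketched as a small run-plan generator. This is a minimal illustration rather than part of any BreakingPoint product: the function name and the setting names (`sessions_per_second`, `attacks`, `malformed`) are assumptions invented for the example.

```python
from typing import Any


def one_at_a_time_runs(
    baseline: dict[str, Any],
    variations: dict[str, list[Any]],
) -> list[dict[str, Any]]:
    """Return the baseline run plus runs that each change exactly one setting."""
    runs = [dict(baseline)]  # first run: the unmodified reference settings
    for name, values in variations.items():
        if name not in baseline:
            raise KeyError(f"unknown setting: {name}")
        for value in values:
            if value == baseline[name]:
                continue  # identical to the baseline run, so skip it
            run = dict(baseline)
            run[name] = value  # the single input that differs in this run
            runs.append(run)
    return runs


# Hypothetical settings; substitute whatever your test tool exposes.
baseline = {"sessions_per_second": 10_000, "attacks": False, "malformed": False}
variations = {"sessions_per_second": [20_000, 40_000], "attacks": [True]}

for run in one_at_a_time_runs(baseline, variations):
    print(run)
```

Because every run differs from the reference in a single setting, any change in the measured results can be attributed to that setting alone. Applying the same plan to each DUT keeps the comparison uniform.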
Execute a Testing Progression That Includes Heavy Load, Application Traffic, Security Attacks, and Other Stress Vectors
Now we come to the bakeoff process itself. During the bakeoff, the wisdom of a progressive, scientific approach to testing will become clear. Following the principles laid out above will reveal the specific strengths and weaknesses of each competing product, replacing guesswork and uncertainty with verifiable results.
In the real world, a device must deal with application traffic, heavy user load, security attacks, and malformed traffic all at once. That is why the “final exam” in this progression will bring together all those elements. But to develop a proper understanding of how a DUT handles specific types of stress, its ability to handle load and attacks will be tested separately first. Subsequent processes will combine validation in the face of load, security, and other stress vectors. During the bakeoff, customers should archive all these tests so that they can be repeated exactly on other devices, or on the same devices once they have been installed in the production environment. Let’s look at each separately:
Load
A device’s specialized capabilities — to block malicious traffic, detect trigger keywords, and so on — are not meaningful unless they perform adequately under heavy load. The processes described in this section ensure that the device being evaluated can easily handle the load it will face, in terms of both sessions and application throughput. If the device cannot pass these tests with traffic known to be free of attacks, there is no way it will process enough traffic once its security features are turned on or when it must also handle other stress vectors such as malformed traffic.
Sessions
This set of tests uses TCP traffic to validate the DUT’s ability to (1) create and tear down TCP sessions at a prescribed rate and (2) handle a prescribed maximum number of concurrent sessions. Each of these tests can be run in “stairstep” fashion, ramping up the degree of stress by steady increments until the device fails. This will determine whether the device achieves its advertised limits and how much headroom it has to handle peak traffic.
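The stairstep idea can be sketched in a few lines. The sketch assumes a hypothetical `run_step` callback that drives your test equipment at a given rate and reports whether the device kept up. A simulated device stands in for the real one here, and its 45,000 failure point is an invented number.

```python
from typing import Callable, Optional


def stairstep(
    run_step: Callable[[int], bool],
    start: int,
    step: int,
    limit: int,
) -> Optional[int]:
    """Raise the load by fixed increments until a step fails.

    Returns the highest level that passed, or None if the first step failed.
    """
    if start <= 0 or step <= 0:
        raise ValueError("start and step must be positive")
    highest_pass = None
    level = start
    while level <= limit:
        if not run_step(level):
            break  # the device failed at this level, so stop ramping
        highest_pass = level
        level += step
    return highest_pass


def simulated_dut(sessions_per_second: int) -> bool:
    """Stand-in for a real test run; pretends the device fails above 45,000."""
    return sessions_per_second <= 45_000


sustained = stairstep(simulated_dut, start=10_000, step=10_000, limit=100_000)
required_peak = 30_000  # hypothetical requirement set by the purchaser
if sustained is not None:
    print(f"highest passing step: {sustained}, headroom: {sustained - required_peak}")
```

The same loop serves both session tests: pass in a callback that measures the session setup and teardown rate, or one that measures concurrent sessions. A smaller `step` locates the failure point more precisely at the cost of more test runs.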
Application Traffic
These tests determine a device’s ability to handle real stateful application traffic at high levels of load. BreakingPoint, for example, offers a standard Enterprise application traffic mix that includes more than a dozen of the protocols most commonly found traversing Global 2000 corporate networks. That mix can then be customized by changing the weighting of various protocols or by adding other protocols that better reflect the customer’s unique network environment.
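Customizing a mix by reweighting or adding protocols amounts to merging weights and renormalizing them. The sketch below assumes a mix is expressed as relative weights per protocol. The protocol names and numbers are illustrative only and do not reproduce BreakingPoint's actual Enterprise mix.

```python
def reweight(mix: dict[str, float], overrides: dict[str, float]) -> dict[str, float]:
    """Apply weight overrides (or new protocols) and normalize shares to sum to 1."""
    merged = {**mix, **overrides}
    if any(weight < 0 for weight in merged.values()):
        raise ValueError("weights must be non-negative")
    total = sum(merged.values())
    if total <= 0:
        raise ValueError("at least one protocol needs a positive weight")
    return {protocol: weight / total for protocol, weight in merged.items()}


# Hypothetical starting mix, expressed as relative weights.
standard_mix = {"http": 50, "smtp": 30, "dns": 20}

# Hypothetical customization: reduce HTTP and add SIP for a voice-heavy network.
custom_mix = reweight(standard_mix, {"http": 25, "sip": 25})

for protocol, share in custom_mix.items():
    print(f"{protocol}: {share:.0%}")
```

Storing the customized mix alongside the test results makes it possible to replay exactly the same traffic against every device in the bakeoff.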
To get an idea of how these processes will affect different DUTs, take a look at the comparative results of a load test performed on each of two firewalls by the BreakingPoint CTM. The first firewall was able to handle the traffic until transactions per second exceeded 12,000.
The second firewall was not able to handle the traffic consistently above 9,000 transactions per second, and more than 5% of the transactions failed.
With those examples in mind, let’s return to the bakeoff. The session and application traffic processes should all be run three times. The first pass is a baseline run, using only a piece of cable and no DUT. The second pass is performed with the DUT in place but with no security or inspection policies turned on. This should result in the purest measure of the DUT’s maximum ability to relay traffic. The third pass is performed with the device’s default security or inspection policies turned on. Since the device will be handling traffic that includes no attacks, evasions, or malformed packets, the policies should yield no positive results. But running this process will indicate the basic impact on performance that comes from having the target device’s application-aware features engaged.
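The three passes yield three numbers per metric, and the differences between them are what matter. Here is a minimal sketch of that arithmetic, assuming each pass produces a single throughput figure such as transactions per second. The function names and sample figures are invented for illustration.

```python
def percent_drop(reference: float, measured: float) -> float:
    """Percentage by which `measured` falls short of `reference`."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return (reference - measured) / reference * 100.0


def summarize_passes(
    cable_baseline: float,
    dut_no_policies: float,
    dut_default_policies: float,
) -> dict[str, float]:
    """Break down the performance cost shown by the three passes."""
    return {
        # Cost of simply placing the device in the path (pass 1 vs. pass 2).
        "forwarding_cost_pct": percent_drop(cable_baseline, dut_no_policies),
        # Extra cost of turning on default policies (pass 2 vs. pass 3).
        "inspection_cost_pct": percent_drop(dut_no_policies, dut_default_policies),
        # Combined cost relative to the bare cable (pass 1 vs. pass 3).
        "total_cost_pct": percent_drop(cable_baseline, dut_default_policies),
    }


# Hypothetical transactions-per-second figures for one device.
summary = summarize_passes(10_000, 9_000, 7_200)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

Separating the forwarding cost from the inspection cost shows whether a slow result comes from the device's basic data path or from its application-aware features.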
Security
Having probed the DUT’s ability to handle load without the complications of security attacks, it is time to try the opposite case: security without load. A firewall, IPS, or unified threat management (UTM) device will never be better at blocking attacks than when it has no background traffic to contend with, so this portion of the testing will reveal how a DUT’s security features perform under ideal conditions.
Keeping the device’s default security policies in place, run a standard list of security attacks to see how well the DUT catches known malicious traffic. Then, customize the tests in two ways: (1) by tailoring the strike list to exercise particular security policies within the device and then (2) by tailoring the device’s security policies to handle particular strikes relevant to your network environment. As with all of the other processes in the bakeoff, these variables should be changed one at a time so that each test run can be used to isolate particular device capabilities and problem areas.
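One way to keep these customization runs comparable is to record, for each strike, whether the device blocked it, and then compare runs of the same strike list before and after a single policy change. The sketch below assumes results are available as a simple name-to-boolean mapping. The strike names are placeholders, not entries from any real strike list.

```python
def block_rate(results: dict[str, bool]) -> float:
    """Fraction of strikes blocked; `results` maps strike name to blocked-or-not."""
    if not results:
        raise ValueError("no strike results recorded")
    return sum(results.values()) / len(results)


def changed_strikes(
    before: dict[str, bool],
    after: dict[str, bool],
) -> dict[str, tuple[bool, bool]]:
    """Strikes whose outcome differs between two runs of the same strike list."""
    if before.keys() != after.keys():
        raise ValueError("both runs must use the same strike list")
    return {
        name: (before[name], after[name])
        for name in before
        if before[name] != after[name]
    }


# Hypothetical results: default policies, then one policy change.
default_run = {"strike-a": True, "strike-b": False, "strike-c": False, "strike-d": True}
tuned_run = {"strike-a": True, "strike-b": True, "strike-c": False, "strike-d": True}

print(f"default block rate: {block_rate(default_run):.0%}")
print(f"tuned block rate: {block_rate(tuned_run):.0%}")
print("changed by the policy edit:", changed_strikes(default_run, tuned_run))
```

Refusing to compare runs with different strike lists enforces the rule that only one variable, here the device policy, changes between runs.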
Besides establishing the basic security capabilities of a firewall, IPS, or UTM, the customization in this portion of the bakeoff will also give you an idea of what level of support you can expect from a manufacturer. Vendors will likely never be more responsive than when they are trying to close a sale, so the customer support during this phase should be excellent.
Combining Load and Security
This phase of the bakeoff combines the final, most demanding tests from the preceding Load and Security sections. While this does not complete the range of authentic conditions that will be included in the next testing phase, bringing these two validation processes together may be the point at which some devices fail, because they simply cannot handle the combination of load and security attacks.
All Stress Vectors
The layering process concludes by adding other stress vectors that the DUT will encounter in a production environment.
Malformed Traffic
Malformed traffic can be sent maliciously or can simply result from a malfunctioning device. Either way, it is a fact of life on every network and must be included in the bakeoff plan. This portion of the bakeoff progressively determines how a DUT responds to malformed traffic, including frame impairments at Layer 2, session fuzzing at Layers 3 and 4, and application fuzzing at Layer 7.
Evasions
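At a minimum, this part of the bakeoff should include TCP segmentation and IP fragmentation evasions, which split an attack across multiple segments or fragments so that a device that does not reassemble them correctly fails to recognize it. Depending on your particular network conditions, custom lists of evasions can be included as well.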
Adding these stress vectors to load and attacks completes the picture. Performing a bakeoff in this way ensures that the device being considered can cope with the entire set of challenges it will face when deployed in the real world. By performing such rigorous testing before devices are ever purchased, you will be sure to buy the right equipment for your IT infrastructure, and you will reduce the risk of running into any surprises once that equipment is deployed in production.
Next Steps
There’s more to the story than this, of course. For a full treatment of bakeoffs, including how they fit into IT planning processes and device life cycle management, download our Six-Step Plan for Competitive Device Evaluations. And when you need to tap our expertise in performing bakeoffs, be sure to check out the BreakingPoint Device Evaluation Service.



