The Difficulty With Testing Blacklists
I was recently directed to Greg Hoglund's new blog, Fast Horizon. So far the few posts he's made have been excellent reading, and I generally agree with most of what he's been saying over there. Just recently he put up a post entitled "Whitelists are the new snake-oil" where he convincingly outlines both why blacklisting is a failing approach to stopping malware (if it ever really worked in the first place), and why the whitelisting approach that many vendors are moving toward is equally doomed to failure. Blacklisting is the current industry standard approach to stopping malware and is used in technologies such as anti-virus and anti-spyware software. Blacklisting is also the approach that most network security device vendors take to blocking network attacks and exploits. Firewalls are the obvious exception to this rule, as they are essentially whitelist based, but the vast majority of packets that get blocked, filtered, or identified by other network security devices such as IDS and IPS systems are caught via a match against a blacklist of traffic signatures. These signatures are analogous to fingerprints of the known malicious data that traverses the network. The following quote from Greg's post nicely sums up the difficulties with this approach:
"Blacklisting sounds ideal, but it doesn’t work. New malware emerges daily that has no corresponding blacklist signature. The malware must first be detected, and then processed. There is always a time window where Enterprises have no defense. Recent figures suggest that the AV vendors are falling so far behind this curve that they will never catch up with the deluge of new malware arriving daily. It can take weeks for a signature to become available.
This deluge of new malware is due to several factors. First, there is more money behind malware development than ever before. Second, we weren’t really that good at capturing malware in the past. Today, new malware can be automatically collected, without human intervention. The slow trickle of malware turned into a flood as honeypot technology emerged. Sensor grids can obtain new malware samples with efficiency - they automatically ‘drive by’ (aka spidering) malicious websites to get infections and leave open ports on the ‘Net so automated scanners will exploit them. In parallel to the automated collection efforts, cybercrime has risen to epic levels. Finally, the barrier to entry has dropped for the cyber criminal. Cyber weapon toolkits have become commonly available. Anti-detection technology is standard fare. New variants of a malware program can be auto-generated. A safe bet is to expect thousands of new malware to hit the Internet per day."
What he describes happening in the malware space is also happening within the attack and exploitation landscape. Network attacks and exploits are becoming increasingly dynamic and harder to fingerprint for a number of reasons. Individual exploits keep increasing the number of targets they can attack, where a target is some combination of the vulnerable software, the operating system running that software, the patch levels of both, the hardware architecture that both are running on, and any number of other environmental factors. As that number grows, identifying a single exploit or vulnerability based on its related network traffic becomes much harder. When you then factor in evasion techniques such as data massaging, reordering, randomizing, encoding, encrypting, and tunneling, many of which have become standard in publicly available tools such as the Metasploit Exploitation Framework, the problem gets exponentially harder, and the number of potential variants of a single attack or exploit begins to approach what the malware folks are seeing in their space.
When faced with an ever-changing hostile network environment of malicious traffic, blacklists of signatures and filters that are developed to detect this data must be equally robust, and must be tested and verified with test cases that approach the dynamic nature and variation that you would see in the wild.
For further reading on dynamic security tests, please see my previous post entitled "File Format Vulnerabilities and Dynamic Exploit Generators" from a few weeks ago. It helps make Greg's point in that it illustrates how many variants of a malicious file must be properly detected when even a handful of fields within that file can change while the file remains a working file format exploit.
File Format Vulnerabilities and Dynamic Exploit Generators
File format vulnerabilities make for very interesting network security device test cases. Due to their nature and how they're transported within network traffic, there are generally multiple layers of data structure, formatting, encoding, and potential obfuscation in play. Not only must you consider the internal structure of the file format in question, but quite frequently there is further embedded content within that structure. This entire file object, being its own self-contained entity, can then be sent over any number of network transports. Because the vulnerabilities usually lie in some anomalous use of the file format's internal structure, a bad value contained within that structure, or embedded content within the file, a network security device that intends to detect this maliciousness must not only recognize it wherever it appears in the network packet flow, but also reconstruct potentially fragmented data, decode potentially encoded data, and then identify the malicious bit itself.
To better illustrate this dilemma, let's take a look at the recent MS08-033 AVI/MJPG vulnerability. AVI is essentially a RIFF file with some particular AVI-specific FourCC-tagged LISTs and chunks. The vulnerability here lies in the value of the "biHeight" field within the BITMAPINFOHEADER structure, which is essentially the contents of the AVI "stream format" ("strf") chunk. This chunk is part of a "strl" LIST that describes an individual video stream contained within the file; the "strl" LISTs generally follow the main AVI header ("avih") chunk. By populating the "biHeight" field with a negative value (I had success with values -1 through -31), you can trigger the vulnerability described in the advisory and cause Windows Media Player to crash. Windows Media Player also doesn't seem to care too much about which codec you indicate that the video stream should be processed with ("fccHandler") in the AVISTREAMHEADER. Either Windows Media Player defaults unrecognized handlers to the MJPG codec, or it decides where to send stream data with an unrecognized handler value by recognizing the encoding method applied to the stream data itself. Further explanation of the vulnerability can be found in a post by Mark Dowd over at ISS's Frequency X blog.
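To make the layout above concrete from the detection side, here is a minimal Python sketch that walks a RIFF buffer, descends into RIFF and LIST containers, and reports the "fccHandler" and signed "biHeight" of each video stream. It assumes the publicly documented AVISTREAMHEADER and BITMAPINFOHEADER field order; every function name here, including the header-only synthetic_header fixture, is hypothetical and is not taken from any product. Keep in mind that a negative "biHeight" is also how uncompressed bitmaps legitimately signal top-down row order, so a real filter would need more context than this single check.

```python
import struct

CONTAINER_IDS = (b"RIFF", b"LIST")


def iter_chunks(buf, start=0, end=None):
    """Yield (fourcc, payload) for each leaf chunk, descending into RIFF/LIST."""
    end = len(buf) if end is None else min(end, len(buf))
    pos = start
    while pos + 8 <= end:
        fourcc, size = struct.unpack_from("<4sI", buf, pos)
        body = pos + 8
        body_end = min(body + size, end)  # clip sizes that overrun the buffer
        if fourcc in CONTAINER_IDS:
            # Containers hold a 4-byte form/list type before their children.
            yield from iter_chunks(buf, body + 4, body_end)
        else:
            yield fourcc, buf[body:body_end]
        pos = body + size + (size & 1)  # chunks are padded to an even length


def video_stream_formats(buf):
    """Return [(fccHandler, biHeight)] for every video stream in the buffer."""
    results = []
    stream_type = handler = None
    for fourcc, payload in iter_chunks(buf):
        if fourcc == b"strh" and len(payload) >= 8:
            # AVISTREAMHEADER starts with fccType, then fccHandler.
            stream_type, handler = struct.unpack_from("<4s4s", payload, 0)
        elif fourcc == b"strf" and stream_type == b"vids" and len(payload) >= 12:
            # BITMAPINFOHEADER starts with biSize, biWidth, biHeight.
            _bi_size, _bi_width, bi_height = struct.unpack_from("<Iii", payload, 0)
            results.append((handler, bi_height))
    return results


def chunk(fourcc, payload):
    """Serialize one chunk: FourCC, little-endian size, payload, pad byte."""
    return fourcc + struct.pack("<I", len(payload)) + payload + b"\x00" * (len(payload) & 1)


def container(kind, form, *children):
    """Serialize a RIFF or LIST container holding the given child chunks."""
    return chunk(kind, form + b"".join(children))


def synthetic_header(bi_height, handler=b"MJPG"):
    """Hypothetical header-only fixture (no frame data) to exercise the walker."""
    strh = chunk(b"strh", b"vids" + handler + b"\x00" * 48)
    strf = chunk(b"strf", struct.pack("<IiiHHIIiiII",
                                      40, 320, bi_height, 1, 24, 0, 0, 0, 0, 0, 0))
    strl = container(b"LIST", b"strl", strh, strf)
    hdrl = container(b"LIST", b"hdrl", chunk(b"avih", b"\x00" * 56), strl)
    return container(b"RIFF", b"AVI ", hdrl)
```

Calling video_stream_formats(synthetic_header(-1, b"XXXX")) yields a single entry pairing the unrecognized handler with a height of -1, which is the combination of fields discussed above.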
The strikes that are included in the Security component of the BPS product generally draw upon a back-end code-base that is capable of generating the various bits of data that are used during the strike's attack. For file format vulnerabilities, this back-end is usually a dynamic file generator that can generate randomized, but still valid, files of the type in question. This back-end is then coupled with individual strike descriptions which describe the parts of the attack such as various meta-data, the network connections involved, and what actual data is sent across the wire within those connections. For file format vulnerabilities, the possibilities here are endless. Due to the file being a self-contained unit, it can be sent from attacker to victim over any number of transports such as HTTP, FTP, POP3, SMTP, IMAP, peer-to-peer applications, tunneled protocols, embedded within other files, and so forth. Many of these available transports also provide for multiple types of data encoding during transport, such as SMTP's Base64, Quoted-Printable, and UUEncoded variations.
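The effect of transport encoding on a byte-level signature can be sketched in a few lines of Python. The example below runs the same bytes through the three SMTP-style encodings named above using the standard base64, quopri, and binascii modules, and shows that a raw signature no longer appears on the wire until the data is decoded. PAYLOAD, SIGNATURE, and the function names are illustrative assumptions, not a description of how any particular device or strike works.

```python
import base64
import binascii
import quopri
import struct

# Illustrative bytes: a "strf" tag followed by biSize, biWidth, biHeight = -1.
PAYLOAD = b"strf" + struct.pack("<Iii", 40, 320, -1)
# A naive raw-byte signature for the negative height field (assumption).
SIGNATURE = struct.pack("<i", -1)


def transfer_encodings(data):
    """Return the same bytes under three common mail transfer encodings."""
    # b2a_uu accepts at most 45 input bytes per call, so encode line by line.
    uu_lines = [binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)]
    return {
        "base64": base64.encodebytes(data),
        "quoted-printable": quopri.encodestring(data),
        "uuencode": b"".join(uu_lines),
    }


def decode(name, blob):
    """Undo one of the encodings produced by transfer_encodings()."""
    if name == "base64":
        return base64.decodebytes(blob)
    if name == "quoted-printable":
        return quopri.decodestring(blob)
    if name == "uuencode":
        return b"".join(binascii.a2b_uu(line) for line in blob.splitlines() if line)
    raise ValueError("unknown encoding: " + name)


if __name__ == "__main__":
    for name, blob in transfer_encodings(PAYLOAD).items():
        raw_hit = SIGNATURE in blob
        decoded_hit = SIGNATURE in decode(name, blob)
        print(name, "raw match:", raw_hit, "decoded match:", decoded_hit)
```

A signature that is only ever compared against raw wire bytes misses all three encoded forms, which is why each transport and encoding pairing is worth its own test case.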
For the MS08-033 example given above, we have developed 8 different transport-and-encoding-based strikes, each of which plugs into the RIFF(AVI/MJPG)-generating back-end to dynamically generate the malicious file being transferred. When you do the math on all of the permutations of the bits of data that a network security device either MUST (to properly detect the badness) or SHOULD (to avoid false positives) include in a signature or filter for this vulnerability, not including everything else that is randomized by our RIFF-generating back-end, you end up with 1,065,151,889,408 (yes, that's just over one trillion) malicious, vulnerability-triggering attack permutations. These attacks can be produced via the eight MS08-033 strikes that will be included in our next StrikePack. Any network security device's filters or signatures must be able to match on all of these permutations, do it reliably, and not false positive on legitimate AVI/MJPG files in order to legitimately claim coverage for this vulnerability. A daunting task indeed.
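The post does not break that total down, but one decomposition that reproduces it exactly from numbers already mentioned is sketched below in Python. Treat the choice of factors as an assumption rather than the actual accounting: 31 working "biHeight" values (-1 through -31), 8 strikes, and every possible value of one 32-bit field such as the four-byte "fccHandler" FourCC.

```python
# The figure quoted in the post.
STATED_TOTAL = 1_065_151_889_408

# Assumed factors; the post does not say how the total was derived.
height_values = len(range(-31, 0))  # biHeight values -1 through -31
strikes = 8                         # transport-and-encoding strikes
fourcc_values = 2 ** 32             # all values of one 32-bit field

# The product of the assumed factors matches the stated total exactly.
assert height_values * strikes * fourcc_values == STATED_TOTAL
```

Whatever the real breakdown is, the arithmetic shows how quickly a few independent fields multiply into a space that cannot be enumerated signature by signature.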
