
Visualizing the Twitter Social Network

Back in July, BreakingPoint added support for the Twitter API to our Application Simulator. The method I used in developing this AppSim protocol was to reference the API documentation, write up a small client, and capture network packets. Looking at how API clients and the Twitter servers behave makes developing a realistic simulation of Twitter pretty simple. One question we commonly hear, especially concerning our Security and Application Simulator components, is, "How do you verify correctness?" I want to give one specific example of this by talking about a side project I worked on recently.

Once a month, I attend a local meeting of people interested in computer security. The format for the meetings allows anyone to do a talk typically between five and fifteen minutes in length. I always wait until a few days before the meetings to start thinking about prospective topics. Since I'd been working on some Trac tickets related to Twitter, I thought I might do a short topic related to mapping out the Twitter network. My original goal was to find all the cool security celebrities as a way to find all the cool links everyone is talking about.

Twitter is like most social networks, in that it tracks 'friend' relationships between users. I thought it would be interesting to visualize some of the friend/follower relationships on the site. I started to write yet-another-twitter-client. After about 10 minutes, I saw that I was reimplementing the same code I'd already written for AppSim, so I decided I might as well reuse it. This is one way to address the question of application correctness. If I can take the code in the BreakingPoint product that implements the Twitter API and really communicate with Twitter servers, then I would call our implementation correct. It's also a good way to make sure I don't have any bugs in our Twitter simulation.

I started implementing a directed graph in Ruby. About 15 seconds later I started looking for a library for doing directed graphs. I found RGL, the Ruby Graph Library. There's a ton of usefulness in that library, and I know I'm only scratching the surface of how I could use it. I hadn't even started thinking about visualization yet, but I found that RGL supports DOT files, the input format used by GraphViz. That convinced me further that RGL was going to be great to work with. I had about a day before the meeting, and didn't want to write any layout code. I just wanted something done quickly (and with the least amount of effort on my part).
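To give a feel for what a DOT file looks like, here is a minimal sketch in Python (not the Ruby/RGL code used for the talk). The `to_dot` helper and the sample handles are hypothetical; the only thing taken from GraphViz itself is the `digraph { "a" -> "b"; }` syntax.

```python
def to_dot(edges, name="twitter"):
    """Render (follower, followed) pairs as a GraphViz DOT digraph."""
    lines = [f"digraph {name} {{"]
    for src, dst in sorted(set(edges)):
        # Quote node names so every handle is a valid DOT identifier.
        lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical sample data: each pair reads "src follows dst".
sample_edges = [("alice", "BreakingPoint"), ("BreakingPoint", "bob")]
dot_text = to_dot(sample_edges)
```

Saving `dot_text` to a file such as `graph.dot` and running `dot -Tpng graph.dot -o graph.png` should hand the layout work to GraphViz.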

I wrapped the AppSim code that implements the Friends/Followers Twitter API calls, and started putting the list of Twitter users into a queue. I then dequeued the first user from the front of the queue, fetched her friends and followers, added them to the queue, and repeated.
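The crawl described above is essentially a breadth-first traversal. A rough Python sketch, assuming a hypothetical `fetch_relationships` function that stands in for the Friends/Followers API calls (here backed by a tiny made-up in-memory network rather than the real service):

```python
from collections import deque

# Hypothetical stand-in for the API: user -> (friends, followers).
FAKE_NETWORK = {
    "BreakingPoint": (["alice"], ["bob"]),
    "alice": (["bob"], ["BreakingPoint"]),
    "bob": (["BreakingPoint"], ["alice"]),
}


def fetch_relationships(user):
    """Return (friends, followers) for a user; unknown users have none."""
    return FAKE_NETWORK.get(user, ([], []))


def crawl(start, max_users=100):
    """Breadth-first crawl; returns a set of (follower, followed) edges."""
    edges = set()
    seen = {start}
    queue = deque([start])
    processed = 0
    while queue and processed < max_users:
        user = queue.popleft()  # take the user at the front of the queue
        processed += 1
        friends, followers = fetch_relationships(user)
        for friend in friends:  # friends first: user follows friend
            edges.add((user, friend))
        for follower in followers:  # then followers: follower follows user
            edges.add((follower, user))
        for other in friends + followers:
            if other not in seen:  # enqueue each user only once
                seen.add(other)
                queue.append(other)
    return edges


edges = crawl("BreakingPoint")
```

The `max_users` cap is an assumption added here so the sketch always terminates; the post does not say how the original crawl was bounded.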

This graph shows the social network from the perspective of the @BreakingPoint Twitter account. This graph adds new nodes in whatever order they are returned in the API calls.

Ok, so there are some users that show prominently, but it's all a jumble and hard to really extract anything useful.

My next method was to order insertion into the queue by the number of followers each friend had. This technique had an interesting effect: the user I start graphing at (again, @BreakingPoint) remains in the center of the graph. You'll notice that most of the edges are directed out of the @BreakingPoint user. This is because I process all the friends first, then all the followers. One attribute of this view is that people we follow who have a large following are prominent. It's a nice side effect of how GraphViz lays out the graph.
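One way to get this ordering is a priority queue keyed on follower count. The sketch below uses Python's `heapq`; it is only an illustration, and the original code may simply have sorted the list. The handles and counts are made up.

```python
import heapq


def order_by_followers(candidates):
    """Yield user names, most-followed first.

    candidates: iterable of (name, follower_count) pairs.
    """
    # heapq is a min-heap, so negate the count to pop the largest first.
    heap = [(-count, name) for name, count in candidates]
    heapq.heapify(heap)
    while heap:
        _neg_count, name = heapq.heappop(heap)
        yield name


# Hypothetical sample data.
ordered = list(order_by_followers([("alice", 120), ("big_news", 50000), ("bob", 900)]))
```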

The biggest problem with doing this is that you end up with many of the same users populating graphs, even when starting from different initial users. When you order by the number of followers, eventually someone in your network is following @CNN or @the_Onion, and graphs from different runs start looking very similar in terms of which users are prominent. Also, once you hit a user with a follower count in the tens of thousands, progress in mapping the network slows because you must retrieve followers in batches of 100, which is a requirement of the API. If anyone in the network you're graphing follows, say, @BarackObama, you should go to your local Alamo Drafthouse and catch a movie.
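The slowdown is easy to estimate: at 100 followers per call, the number of requests grows linearly with the follower count. A small sketch of the arithmetic (the function name is hypothetical, and no real API is called):

```python
BATCH_SIZE = 100  # followers returned per API call, as described above


def calls_needed(follower_count, batch_size=BATCH_SIZE):
    """Number of API calls required to page through every follower."""
    # Integer ceiling division: a partial final batch still costs one call.
    return (follower_count + batch_size - 1) // batch_size


calls_for_big_account = calls_needed(50_000)
```

So a single account with 50,000 followers costs 500 requests before the crawl can move on to the next user.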

It became obvious that filtering is key if you want to get any interesting results from the data.

Here is a graph generated by limiting the users included in the graph to those with between one hundred and one thousand followers. I have also modified the graph to show our biggest followers.
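A filter like this can be sketched in a few lines of Python. The names and counts below are invented, and treating both bounds as inclusive is an assumption; the post only says "between one hundred and one thousand".

```python
def filter_by_followers(follower_counts, low=100, high=1000):
    """Keep users whose follower count lies in [low, high] (inclusive)."""
    return {name for name, count in follower_counts.items() if low <= count <= high}


def filter_edges(edges, keep):
    """Drop every edge that touches a user outside the kept set."""
    return {(src, dst) for src, dst in edges if src in keep and dst in keep}


# Hypothetical sample data.
counts = {"alice": 120, "bob": 900, "big_news": 50000, "newbie": 3}
all_edges = {("alice", "bob"), ("alice", "big_news"), ("newbie", "bob")}

kept_users = filter_by_followers(counts)
kept_edges = filter_edges(all_edges, kept_users)
```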

An image like this gives you an idea of the most potentially influential followers you have. If you were a marketer like @KyleFlaherty, you could use this information to start trying to influence the biggest influencers that follow you. In the marketing world, word-of-mouth is a goldmine. This seems especially true in social networks where your friends are hand-selected.

This is just a first step into the topic of data visualization for me. I have a feeling I'm going to have to come up with something better than GraphViz for visualization. I've also had a request to turn this into a web application and make it available to a wider audience. I'd like to use the comments to gauge demand for this tool. If it looks like people would find this useful, I might just try to get it cleaned up and make a simple version available.

The number one thing people have said they'd want is interactivity. I think I'll go get working on that. I have tons of features I'd like to see, and I still don't have a talk for the upcoming meeting.

Update: the first thumbnail and linked image were corrected.

Posted by Todd Manning (2008/09/30 18:00:00 GMT+0)

More Musings on Oracle

Hi, I'm Tod Beardsley. You might remember me from other blogs such as DVLabs and Plan B Security, but now I'm here in the StrikeCenter. It's a fun gig that's about as close to "R&D" as I've gotten -- so far, it's almost exactly half research, half development.

For me, most of the research part so far has been figuring out how Oracle authentication works. If you've ever looked at the Oracle dissector for Wireshark, you've noticed it's pretty sparse. Apparently there are about four people outside of Oracle who do any work at all on the wire protocol, and the guy who wrote a custom parser isn't saying much due to "security factors." This is not surprising, because Oracle's authentication sequence takes forever, with a bunch of pre-authenticated data flying around before access is granted. The sequence goes something like this:

(Client) "Hey Oracle, can I see your database?"

(Server) "What?"

"Your database. Give it to me."

"Oh, sure."

"Great. Here's some encrypted and encoded data."

"Cool, I have some too. Here you go."

"Oh, and I'm a Windows PC."

"I'm a Linux server! We have so much in common!"

"Hmmm."

"Yes...."

"Did you want my machine name? Or my process ID's?"

"Yes, that's what I was waiting for. Here's a session key."

"Oh, okay. I'll use that to encrypt my password. By the way, here's a bunch more info about me."

"Your password? Oh yeah, I haven't authenticated you yet. Just a second."

...and so on.

Now, why there's so much traffic between an unauthenticated client and an expensive enterprise-class server is beyond me; Microsoft SQL Server authentication is a very normal and straightforward exchange of "Access please, here's my username and password." "Sure thing buddy!" But what do I know, I've never written an unbreakable database server, so all this extra cruft must make it extra secure, somehow.

Posted by Tod Beardsley (2008-04-04 14:56:16)

New Apps: TDS, TNS, FIXT, and FIX

Newly implemented for BreakingPoint's Application Simulator are four new protocols, all available as part of StrikePack 24931.

For database application simulations, we've added the TDS (Tabular Data Stream) and TNS (Transparent Network Substrate) protocols, used by Microsoft SQL Server and Oracle Database respectively. These protocols are used for both database authentication and database query requests and responses. TDS typically runs on port TCP/1433, and TNS typically runs on TCP/1521.

We've also added support for the Financial Information eXchange (FIX) protocol. FIX 5.0 consists of the FIX application protocol and the FIXT session protocol. The FIX Protocol is a series of messaging specifications for the electronic communication of trade-related messages between financial entities such as banks, broker-dealers, exchanges, industry utilities and associations, institutional investors, and information technology providers.

For the database protocols, BreakingPoint supports the following options:

TDS

  • Login: Username, password, server name, client name
  • Query: Use Database: Database name
  • Query: Select: SELECT modifier, column list, table name, WHERE comparison expression, ORDER BY expression

TNS

  • Login: Database username, database password, server name, database name, server OS, server banner, client username, client machine name, client program path, client program name
  • Query: Select: Column list, table name, WHERE comparison expression, ORDER BY expression

For FIX and FIXT, these configuration options are available:

FIXT

  • Heartbeat: Test request id
  • Test Request: Test request id
  • Resend Request: Begin sequence number, end sequence number
  • Reject: Reference sequence number, reference tag id, reference message type, session reject reason, message text
  • Sequence Reset: Gap fill flag, new sequence number
  • Logout: Text
  • Logon: Heartbeat interval, reset sequence number flag, next expected sequence number, maximum message size, test message indicator, default application version id
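For readers unfamiliar with the wire format, here is a rough Python sketch of how a FIXT Heartbeat carrying a test request id might be framed. It is not BreakingPoint's implementation. It assumes the standard tag=value encoding with SOH delimiters, where BodyLength (tag 9) counts the bytes after its own field up to the checksum field, and CheckSum (tag 10) is the byte sum of everything before it modulo 256. The `encode_fix` helper and the CLIENT/SERVER/TEST1 values are placeholders.

```python
SOH = "\x01"  # field delimiter used by the FIX tag=value encoding


def encode_fix(begin_string, msg_type, fields):
    """Frame a FIX message from (tag, value) pairs; adds tags 8, 9, 35 and 10."""
    body = f"35={msg_type}{SOH}" + "".join(f"{tag}={value}{SOH}" for tag, value in fields)
    # BodyLength counts the bytes after its own delimiter, up to the checksum field.
    head = f"8={begin_string}{SOH}9={len(body.encode('ascii'))}{SOH}"
    # CheckSum: sum of all preceding bytes modulo 256, rendered as three digits.
    checksum = sum((head + body).encode("ascii")) % 256
    return f"{head}{body}10={checksum:03d}{SOH}"


# Heartbeat (MsgType 0) answering a Test Request; all values are placeholders.
heartbeat = encode_fix("FIXT.1.1", "0", [
    (49, "CLIENT"),             # SenderCompID
    (56, "SERVER"),             # TargetCompID
    (34, 2),                    # MsgSeqNum
    (52, "20080402-16:10:30"),  # SendingTime (UTC)
    (112, "TEST1"),             # TestReqID, the "test request id" option
])
```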

FIX

  • Business Message Reject: Referenced sequence number, referenced message type, referenced business reject id, business reject reason, text
  • Network (Counterparty System) Status Request: Network request type, network request id
  • Network (Counterparty System) Status Response: Network status response type, network request id, network response id, last network response id
  • User Request: User request id, user request type, username, password, new password
  • User Response: User request id, username, user status, user status text

Finally, these new protocols are now incorporated in four new default BreakingPoint superflows:

  • BreakingPoint FIXT Session (FIXT)
  • BreakingPoint FIX Session (FIX and FIXT)
  • BreakingPoint MS-SQL Server (TDS)
  • BreakingPoint Oracle Database (TNS)

Posted by Tod Beardsley (2008-04-02 16:10:30)

SMB/CIFS AppSim Update

With the release of StrikePack 21889, the SMB/CIFS AppSim module has been improved over our first version, which was released last week. Among the improvements are a number of customization options that are now exposed in the UI. First, the user can now provide their own custom data file to be used as payload data during file transfers, and can specify the chunk size used for each request for a portion of the file being transferred. The user can also now configure session parameters such as client and server name, domain name, username, and password. Any customization options the user leaves unmodified are randomized, so that each traffic flow gets a unique set of session parameters.
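To illustrate how a chunk size shapes a transfer, here is a small Python sketch that splits a file of a given size into per-request (offset, length) pairs. It is only a model of the idea; the function name is hypothetical and it does not speak SMB.

```python
def chunk_requests(file_size, chunk_size):
    """Return the (offset, length) pair for each read request."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        # The final chunk may be shorter than chunk_size.
        (offset, min(chunk_size, file_size - offset))
        for offset in range(0, file_size, chunk_size)
    ]


# A 10-byte file read 4 bytes at a time takes three requests.
requests = chunk_requests(10, 4)
```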

Stay tuned for more upcoming improvements and expansions to the SMB/CIFS AppSim as well as the addition of new protocol modules as we continue to improve our Application Simulator component.

Posted by Dustin D. Trammell (2008-02-14 11:23:21)

SMB/CIFS Application Simulator

In our recent StrikePack 21396, SMB/CIFS application traffic simulation has been upgraded from standard SMB traffic to a more dynamic AppSim module. The initial release of the new module includes two SMB flow scenarios: an SMB NULL Session and an SMB/CIFS Client File Download. The former simulates an SMB NULL client connection to the server's IPC$ share. The latter simulates an authenticated client that connects to a shared resource on the server, performs a directory listing, then retrieves information about and downloads a file found there. In this first release of the AppSim module, as delivered by the StrikePack mentioned above, these two traffic flows use fairly static data, such as the filenames found in the directory listing of the shared resource and the data contained within the target file, although some customizations are exposed in the UI, such as client hostname, server hostname, and username. In a forthcoming update, however, many of the parameters for the Client File Download scenario will be randomized by default, and many more parameters will be exposed in the UI for customization by the user.

Posted by Dustin D. Trammell (2008-02-07 17:26:34)