True "State of the Internet" Report Impossible to Create?

Last week many blogs, including this one, recapped Akamai’s State of the Internet report. Several focused on where the United States ranked in “fast” broadband speeds. My initial thought went directly to equipment performance possibilities, of course, but the post also sparked a conversation with many folks about the validity of the report itself and the way in which the data is represented. During my discussions with people in a variety of industries it became clear that there is a high level of frustration with Internet statistical reports that claim to be comprehensive, or at least are covered by the media and blogs as if they were.

Three main worries came up in my conversations:

  1. The way in which the data was measured around broadband speeds.
  2. Geographical discrepancies not represented or being taken into account.
  3. Overall volume of data used for analysis.

Johan Terve of Aptilo, a WiMAX and Wi-Fi management company based in Sweden, kicked us off on the first topic: the measuring sticks used when analyzing broadband speeds. Mr. Terve and others I talked with believe the "grey areas" within the study are too vast, since it simply looks at connection speeds of less than 256 Kbps and more than 2 Mbps. Terve pointed out:

“It is equally true that there is a huge grey zone in the range between 2 Mbps to maybe 100 Mbps. Some 64% was above 2 Mbps, but there is no visibility on people having connections of 8, 24 or 100 Mbps.”

More clarity into these numbers might reveal that nearly half of that 64% were connected at 24 Mbps or faster; in theory, all 64% could even have been connecting at more than 100 Mbps. The point is that when we start to rank countries and cities by broadband speed, we need to eliminate as much grey as possible to get a representative analysis. The same can be said when we look at different geographies, whether country-wide or city-wide.
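To make the grey-zone problem concrete, here is a minimal Python sketch. The two samples are invented for illustration and are not drawn from Akamai's data; the function names and tier edges (8, 24 and 100 Mbps, borrowed from Mr. Terve's quote) are my own assumptions. Both samples report the same 64% above 2 Mbps, yet they describe very different broadband markets.

```python
from statistics import median


def share_above(speeds_mbps, threshold_mbps):
    """Fraction of connections strictly faster than the threshold."""
    return sum(1 for s in speeds_mbps if s > threshold_mbps) / len(speeds_mbps)


def share_at_least(speeds_mbps, threshold_mbps):
    """Fraction of connections at or above the threshold."""
    return sum(1 for s in speeds_mbps if s >= threshold_mbps) / len(speeds_mbps)


# Hypothetical samples of 100 connections each (speeds in Mbps).
# In both, 36 connections sit below 2 Mbps and 64 sit above it.
modest = [1.0] * 36 + [3.0] * 64
fast = [1.0] * 36 + [8.0] * 20 + [24.0] * 24 + [100.0] * 20

for name, sample in (("modest", modest), ("fast", fast)):
    headline = share_above(sample, 2)
    # Finer tiers expose what the single "above 2 Mbps" figure hides.
    tiers = {t: share_at_least(sample, t) for t in (8, 24, 100)}
    print(name, "above 2 Mbps:", headline, "median:", median(sample), "tiers:", tiers)
```

A single threshold cannot tell these two samples apart. A median, or a breakdown into a few more tiers, can.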

The infrastructure differences and geographic discrepancies between the U.S., Asia and Europe are vast, but are not factored into this report, or many others. Over at Nominum, a global provider of ENUM-based IP-application routing directory, DNS, and DHCP solutions, Bruce Van Nice was thinking about population densities. Mr. Van Nice brought up an important point:

“Although we have big cities (in the U.S.) the population densities are still less. Go to any city in Asia and it is densely packed high rises. Networking a high-rise is cheap and easy. Everyone can get 100 Mbps or even a Gbps easy, so this skews the numbers badly. It would be interesting to look at end-to-end performance numbers too, to see if proportionally the access bandwidth holds up across the network.”

Wiring Hong Kong for high-speed broadband is easier than wiring a major U.S. city, which has less high-rise living and is spread out across different neighborhoods. Wiring rural areas is even more of a challenge, so for countries whose populations are concentrated in dense major cities the numbers, as Mr. Van Nice pointed out, will be skewed. The number and location of data points matter for any study of Internet bandwidth or usage, as does the overall size of the data sample.

Ipoque, a provider of deep packet inspection (DPI) solutions for Internet traffic management and analysis, recently put out its own study of Internet traffic, based on 1.3 petabytes of traffic. The study focused more on the applications being used across the Internet (P2P, file hosting, etc.), but it led me to ask: how much traffic is enough for a proper study? In an email exchange, Ipoque CEO Klaus Mochalski pointed out that the company does not claim its survey is a comprehensive "state of the Internet", and that it is very specific that the amount of data it uses is best described as a "snapshot" (emphasis is mine):

“The time period of the measurements was shortened from an average of four weeks to two weeks. While two weeks are long enough to get a snapshot of the user behavior, even four weeks would have been too short to discern broader trends. So this has no impact on the validity of the results, but reduces the data collection and analysis effort. This means: to provide a representative survey we would not only have to gather data at minimum for a year but we would have to have many more measurement points around the world - apart from many other things, that would have to be organized. This is nearly impossible. The phrase "state of the Internet" is just a figure of speech we used for the press release. We openly talk about the things we can't do and what is missing in the survey.”

Let me repeat one part I found really fascinating: "This is nearly impossible". We cannot simply look at quarterly reports and discern a true "State of the Internet". This is not a slam against Akamai. They are providing the public with free data that is helpful, and have been doing so for what looks like three quarters. As we approach a full year of data, I would imagine Akamai will be able to take a deeper look at all of this information, or perhaps ask an independent research group to review it and perform in-depth analysis. Until then, however, we must continue to look at a variety of reports and information in order to properly represent the actual state of the Internet.

This last point became abundantly clear during my interviews when Ari Herzog and Andy Krzmarzick brought to my attention a presentation they recently made detailing the use of technology within the government. The presentation, given at the Advanced Learning Institute's 'Social Media for Government' conference, has many slides that offer a different view of broadband impact in the United States and throughout the world. This presentation, to be comprehensive, gathered data points from dozens of sources for a more realistic state of the Internet.

Most likely we will never see one comprehensive report with all the statistical analysis one might want on Internet usage. It remains in our power, however, as we discuss these reports and statistics, to gather all the data and opinions available in order to formulate the full story.

Tags: Deep Packet Inspection // Network Traffic Generation // Performance Measurement // Server Load Testing //

Comments

Andrew Krzmarzick

State of the Internet

Thanks for the mention in this post. Please keep me aware of developments as you think through these issues. While Ari and I have been focused on the government, there are obvious points of convergence where our work informs one another's explorations of the best way to "measure" interactions and activity on the Internet.

April 20, 2009, 8:29 AM