Akamai & Limelight Say Testing Methods Not Accurate In Microsoft Research Paper

Last week, I posted about a new technical research paper entitled "Measuring and Evaluating Large-Scale CDNs" that was put out by the Microsoft Research division and the Polytechnic Institute of NYU. The purpose of the study was to conduct extensive and thorough measurements to compare the network performance of Akamai and Limelight Networks.

Some noticed that I did not take any stance either way on the findings of the paper, mainly because I am not a network engineer and don’t pretend to know everything involved in properly testing a network. That being said, both Akamai and Limelight Networks responded to my requests to review the paper and provided me with their comments. Both agreed that there is a lot more to properly testing a network than the two aspects of CDN performance the paper looked at. Limelight has posted their response to the paper on their blog, and Akamai’s response is as follows.

Based on our internal review of the whitepaper, we believe that there are a number of stated conclusions that are incorrect. These include:

1. Akamai is less available
This conclusion is false. The researchers tested the responsiveness of individual servers and groups of servers independently of our mapping system – note that this is not the same as measuring the general availability of Akamai’s content delivery services. Because Akamai’s software algorithms will not direct traffic to unresponsive machines or locations, all their conclusion really points out is that a portion of our network is not in use at any given time. (This may be due to hardware failure, network problems, or software updates.) We believe that any measurement of availability must take into account Akamai’s load balancing, and testing specific IPs, as the researchers did, does not do that.

2. Akamai is harder to maintain
This conclusion is also false. While Akamai has more locations, and more machines, the power of the distributed model with automatic fault detection means that Akamai does not have to keep every machine or location up and running at all times.  It is incorrect to infer from the fact that some servers are down that Akamai’s maintenance costs are higher.

3. With marginal additional deployments, Limelight could approximate Akamai’s performance
We believe that this conclusion is also false. In our opinion, the research team’s performance testing methodology likely overstates Akamai’s latency numbers. This is because any of our server deployments in smaller ISPs that do not have an open-resolver nameserver would have been missed in their discovery process (a rough sketch of how that discovery method works appears after this list). It is important to note that these are also the locations where we get closest to the end users. If those locations had been discovered by their research, we would expect the average latency numbers derived from the measurements to be lower. If they are missing some of our lowest-latency deployments, then naturally the average, median, 90th and 95th percentiles will change for the better. Because these deployments are the best examples of our "deploy close to the end user" strategy, missing them affects our results more than it would Limelight’s. The networks most likely missed are either smaller local ISPs in the U.S. and EU, or providers in specific countries. These are exactly the places where we’d expect Akamai to have very low latency, but Limelight to have higher latency (especially in Asia). As such, we believe that the research team’s measurement method ultimately under-represents our country, "cluster", and server counts because it missed these more local deployments that do not have open-resolver nameservers.

4. After testing akamaiedge.net, they concluded that Akamai uses virtualization technology to provide customers with isolated environments (for dynamic content distribution)
This conclusion is false. The akamaiedge.net domain is used for Akamai’s secure content delivery (SSL) service, used by WAA and DSA. While these services do accelerate dynamic content, Akamai is not using virtualization technology to provide customers with isolated environments – ultimately, the research team reached an incorrect conclusion after observing how we handle hostname-to-IP mapping for secure content. The measurements done also concluded that akamaiedge.net servers were in a subset of locations as compared to the larger Akamai network – this is correct, as our SSL servers are hosted in extremely secure locations.

Furthermore, while the akamaiedge.net network is in fewer locations than the akamai.net network, it is still in more locations than Limelight’s entire network. In addition, the measurements done for this network also under-counted the number of servers and locations. Finally, the whitepaper did not provide figures on CDN delay for this network, only DNS delay.
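
To make the discovery concern in point 3 concrete, a rough sketch of the kind of open-resolver enumeration the paper relies on might look like the following. Everything here is an assumption for illustration – the resolver IPs, the hostname, and the use of the dnspython 2.x library are placeholders, not the paper's actual tooling. The idea is simply that each open recursive resolver acts as a measurement vantage point: resolving the same CDN-served hostname through it reveals the edge cluster that resolver's users are mapped to.

```python
# Illustrative sketch (not the paper's actual tooling): enumerate CDN edge IPs
# by resolving one customer hostname through many open recursive resolvers.
# Requires dnspython 2.x (pip install dnspython). Resolver IPs and hostname are placeholders.
import dns.resolver

OPEN_RESOLVERS = ["8.8.8.8", "1.1.1.1", "9.9.9.9"]   # stand-ins for a large open-resolver list
HOSTNAME = "a1921.g.akamai.net"                       # hypothetical CDN-served hostname

def discover_edges(hostname, resolvers):
    """Return the edge IPs seen when resolving the hostname through each open resolver."""
    edges = {}
    for resolver_ip in resolvers:
        r = dns.resolver.Resolver(configure=False)
        r.nameservers = [resolver_ip]
        r.lifetime = 3.0
        try:
            answer = r.resolve(hostname, "A")
            edges[resolver_ip] = sorted(rr.address for rr in answer)
        except Exception:
            # Resolver unreachable or not open to outside queries:
            # that vantage point, and any edges only served to it, is simply missed.
            edges[resolver_ip] = []
    return edges

if __name__ == "__main__":
    for resolver_ip, edge_ips in discover_edges(HOSTNAME, OPEN_RESOLVERS).items():
        print(resolver_ip, "->", edge_ips)
```

The `except` branch is the crux of Akamai's argument: an ISP whose resolver refuses outside queries never contributes a vantage point, so an edge cluster that is only handed out to that ISP's own users never shows up in the measured population.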

It is important to reinforce that the "per server" and "per cluster" uptime and availability measurements in the whitepaper that show Limelight as more "available" bypassed Akamai’s mapping system. As such, they count us as unavailable even when our mapping system would never have sent traffic to that location.
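
As a rough illustration of the distinction being drawn here, the sketch below (mine, under stated assumptions – the hostname and IPs are placeholders and the probe is a simple TCP connect) contrasts the two definitions of availability: probing a fixed list of previously discovered edge IPs versus re-resolving the hostname at probe time and testing whichever server the mapping system currently returns. A machine pulled out of rotation counts as "down" in the first measurement but never even appears in the second.

```python
# Sketch only: two ways to define "availability" for a DNS-mapped CDN.
# Hostname and IPs are placeholders (documentation ranges); probe is a TCP connect to port 80.
import socket

HOSTNAME = "www.example-cdn-customer.com"          # hypothetical CDN-fronted hostname
FIXED_EDGE_IPS = ["192.0.2.10", "192.0.2.20"]      # edge IPs discovered in an earlier crawl

def tcp_probe(ip, port=80, timeout=2.0):
    """Return True if a TCP connection to ip:port succeeds within the timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False

def availability_fixed_ips(ips):
    """Paper-style view: fraction of previously discovered edge IPs answering right now."""
    return sum(tcp_probe(ip) for ip in ips) / len(ips)

def availability_via_mapping(hostname):
    """Mapping-aware view: resolve at probe time and test the server the CDN actually hands out."""
    try:
        ip = socket.getaddrinfo(hostname, 80, proto=socket.IPPROTO_TCP)[0][4][0]
    except OSError:
        return 0.0   # DNS itself failed: the service really is unavailable from this vantage point
    return 1.0 if tcp_probe(ip) else 0.0

if __name__ == "__main__":
    print("fixed-IP availability:", availability_fixed_ips(FIXED_EDGE_IPS))
    print("mapping-aware availability:", availability_via_mapping(HOSTNAME))
```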

Having a more distributed model (as Akamai does) de-emphasizes the importance of any one location, so much so that we can have entire locations down without impacting performance. Similarly, the researchers don’t sufficiently consider the penalty associated with an unavailable Limelight cluster. One down location in Japan, when it is the only region in Japan, would ultimately have a much greater performance impact than having one of 20 locations in Japan become unavailable.

It is also important to reinforce that the research performed did not measure the general performance of Akamai’s services (as we would do for a customer trial), but rather DNS lookup delays and the delay to reach the server selected by Akamai’s mapping system – these are only two components of a full performance measurement. By unintentionally filtering out many of the best examples of our "deploy close to end user" strategy, the research team has grossly misrepresented our availability numbers and over-estimated our latency.
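
For readers unfamiliar with those two components, here is a minimal sketch of how one might time them from a single client. The hostname is a placeholder, and the TCP connect time stands in for "delay to the selected server"; an actual customer trial would measure far more than this (throughput, full object download times, behavior under load, and so on), which is the point being made above.

```python
# Sketch: timing the two delay components the paper measured, from one vantage point.
# Hostname is a placeholder; connect time approximates "delay to the selected server".
import socket
import time

HOSTNAME = "www.example-cdn-customer.com"   # hypothetical CDN-fronted hostname

def dns_delay(hostname):
    """Seconds spent resolving the hostname, plus the first returned IP."""
    start = time.perf_counter()
    ip = socket.getaddrinfo(hostname, 80, proto=socket.IPPROTO_TCP)[0][4][0]
    return time.perf_counter() - start, ip

def connect_delay(ip, port=80, timeout=3.0):
    """Seconds to complete a TCP handshake with the edge server the mapping chose."""
    start = time.perf_counter()
    with socket.create_connection((ip, port), timeout=timeout):
        return time.perf_counter() - start

if __name__ == "__main__":
    lookup_s, edge_ip = dns_delay(HOSTNAME)
    print(f"DNS lookup delay: {lookup_s * 1000:.1f} ms (edge {edge_ip})")
    print(f"TCP connect delay to edge: {connect_delay(edge_ip) * 1000:.1f} ms")
```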