Inside Apple’s Live Event Stream Failure, And Why It Happened: It Wasn’t A Capacity Issue
Apple’s live stream of the unveiling of the iPhone 6 and Watch was a disaster today right from the start, with many users like myself having problems trying to watch the event. While at first I assumed it must be a capacity issue pertaining to Akamai, a deeper look at the code on Apple’s page and some other elements from the event shows that decisions Apple made about its website, and problems with how it set up storage on Amazon’s S3 service, caused the biggest problems with the event.
Looking at the metadata from the event page, you could see that Apple was hosting content from the interactive element on the apple.com event page on Amazon’s S3 cloud storage service. From what I can tell, it looks like Apple set up the content in a single, poorly configured bucket on S3 with a cache hit ratio of little to nothing. Amazon didn’t reply to my request for more info, but it’s clear that Apple didn’t set up its S3 storage correctly, which caused huge performance issues when all the requests hit Amazon’s network in a single location.
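To make the cost of a low cache hit ratio concrete, here is a minimal sketch of how much traffic falls through to the origin when the edge cannot serve from cache. The request rate is a hypothetical figure chosen for illustration, not a measurement from the event, and the function name is my own.

```python
def origin_requests_per_second(edge_rps: float, cache_hit_ratio: float) -> float:
    """Requests per second that miss the edge cache and reach the origin."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return edge_rps * (1.0 - cache_hit_ratio)


if __name__ == "__main__":
    # Hypothetical edge load, for illustration only (not data from the event).
    EDGE_RPS = 1_000_000
    for ratio in (0.99, 0.50, 0.0):
        origin = origin_requests_per_second(EDGE_RPS, ratio)
        print(f"hit ratio {ratio:.0%}: about {origin:,.0f} requests/s reach the origin")
```

Under these assumptions, going from a 99% hit ratio to no caching at all multiplies the load on the origin by a factor of 100, and all of it lands on whatever single location the origin lives in.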
As for Akamai’s involvement in the event, they were the only CDN Apple used. Traceroutes from all over the planet (thanks to all who sent them in to me) showed that Apple relied solely on Akamai for the delivery. Without Akamai being able to cache Apple’s webpage, the performance of the videos took a huge hit. If Akamai can’t cache the website at the edge, then all requests have to go back to a central location, which defeats the whole purpose of using Akamai or any other CDN to begin with. Every CDN’s architecture is based on being able to cache content, which in this case Akamai clearly was not able to do. The chart below from third-party web performance provider Cedexis shows Akamai’s availability dropping to 98.5% in Eastern Europe during the event, which isn’t surprising if no caching is being used.
Updated Thursday Sept. 9th: From talking to transit providers & looking at DeepField data, Apple’s live video stream did 6-8Tbps at peak. World Cup peak on Akamai was 6.8Tbps. So the idea that this was a capacity issue isn’t accurate, and the event didn’t generate some of the numbers I see people citing, like “hundreds of millions” watching the stream.
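A rough sanity check on the “hundreds of millions” claim: divide the peak throughput by a viewer count and see what bitrate each viewer would have been getting. The sketch below takes the upper 8Tbps figure from above; the 200 million viewer count is an assumption, standing in for the low end of “hundreds of millions”, and it treats everyone as watching at the same moment.

```python
def implied_kbps_per_viewer(peak_tbps: float, concurrent_viewers: int) -> float:
    """Average bitrate per viewer, in kbps, if peak traffic were split evenly."""
    bits_per_second = peak_tbps * 1e12  # 1 Tbps = 10^12 bits per second
    return bits_per_second / concurrent_viewers / 1e3


if __name__ == "__main__":
    PEAK_TBPS = 8.0                    # upper end of the 6-8Tbps estimate
    ASSUMED_VIEWERS = 200_000_000      # assumption: low end of "hundreds of millions"
    kbps = implied_kbps_per_viewer(PEAK_TBPS, ASSUMED_VIEWERS)
    print(f"Implied average bitrate: {kbps:.0f} kbps per viewer")
```

With these inputs the implied average is about 40 kbps per viewer, far below what a watchable video stream needs, so the peak traffic figures and an audience of that size don’t fit together.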
Updated Thursday Sept. 9th: While some in the comments section want to argue with me that problems with the Apple.com webpage didn’t impact the video, here is another post from someone who explains, in much better detail than I can, many of the problems Apple had with its website that contributed to the live stream issues. See: Learning from Apple’s livestream perf fiasco