Thursday, September 26th Webinar: Investing in Your Enterprise Video Strategy

Thursday, September 26th at 11am ET, Frost & Sullivan will be hosting a webinar entitled “Investing in Your Enterprise Video Strategy – Don’t Get Left Behind”. Hear the best practices on how to articulate the value of the investment to gain internal agreement and how enterprise video can deliver real business results and complement the digital transformation agenda. Learn about:

  • The opportunity cost of not actively developing and investing in an enterprise video strategy
  • The return on investment and total economic impact of video in the enterprise
  • Top-line benefits and bottom-line efficiencies companies are unlocking through their deployments
  • The role of video in enabling your company’s digital transformation initiative
  • The pitfalls of further delaying the execution of a comprehensive video strategy

Register for free and hear from two Frost & Sullivan analysts on this important topic.

CenturyLink Acquires Streamroot, Will Use Mesh Technology to Extend CDN Capacity Globally

Last week, CenturyLink announced they had acquired privately held Streamroot, a CDN provider with an underlying P2P technology. Streamroot had raised $6M to date and the company is expected to do sub-$5M in revenue for this year. Based on what Streamroot’s valuation would have been for a Series A round, CenturyLink valued the company between $20M-$30M.

CenturyLink’s acquisition is motivated by improving performance for viewers of live events and day/date content releases, reducing buffering and failed starts during peak hours and for catch-up TV. Streamroot leverages a software-based mesh network and a deterministic data science methodology, providing device awareness to ensure that consumers have only optimal experiences without creating any data privacy concerns. In short, this extends the capabilities of CenturyLink’s existing CDN network to offer the promise of better performance in hard-to-reach places and during volume spikes when delivery is most difficult. Of course, we have heard these value propositions before. However, the vision that CenturyLink and Streamroot share, and what’s under the hood, is admittedly more nuanced.

While most previous P2P attempts have failed at scale, Streamroot has actually delivered premium content to a million+ simultaneous viewers with national broadcasters like TF1, Canal Plus, and RTVE in Europe and LATAM during the World Cup, Copa América, and others. To date, most vendors that have focused on P2P have tended to roll out “cheap” CDN alternatives, but Streamroot was quick to identify that “cheap” wasn’t what the market wanted. They had to deliver visibility into material QoS improvements while handling complex workflows including proprietary ABR, multiple SSAI providers and DRM considerations.

CenturyLink said what makes the combination of the companies stand apart is their belief in using data from the device to improve content delivery at both a micro and a macro level. At the micro level, Streamroot products take into account each device’s instantaneous conditions to select the best delivery source; at the macro level, they gather data on the network topology of ISPs globally and on device behavior depending on OS, version, content type, etc. In other words, this isn’t just “peer-to-peer.” It’s about device telemetry; it’s about adapting delivery to each individual user and content and network; it’s about broadcasters being able to customize a technology solution to their unique use case and all the variables that come along with it: encoding ladder, ABR, ISP peering, networks, topology, and more.

Thinking holistically about delivery has allowed Streamroot and CenturyLink to find a common avenue to improve content delivery. And this offers applications far beyond just P2P for video. Client-side load balancing, file download, ISP-aware routing and devices localizing to the best possible cloud edge serving locations, based on topology and performance, could all be use cases. They have the potential of using device-side technology to provide the best user experience, helping ISPs route traffic flows more efficiently and helping content providers, web applications and edge cloud services localize connections.

Client-side integration is never a pleasant subject but we might as well take it as a given in a multi-CDN, multi-OS, multi-hardware world. With Streamroot, the company says pre-integrations on pretty much every major web player make web deployment possible in a matter of minutes. Streamroot has also worked to make its mobile SDKs light years simpler than the average advertising – whether SSAI or CSAI – or DRM deployment. It essentially boils down to roadmap prioritization, as the actual man-hours to integrate the solution on any device can be counted on two hands. And when benchmarking that against the quality improvement, the company says the ROI calculation is pretty simple.

I’m very interested in seeing where CenturyLink is going with this acquisition and with its media delivery services in the short term. The CenturyLink team is making smart moves to incorporate technologies that may very well redefine content localization in an otherwise lackluster market, especially in Europe. Right now the Streamroot technology works just for video content, both live and on-demand, but CenturyLink said they plan to invest in the technology, add more developers, and add download functionality in the future. It’s a good move on their part to pick up some proven technology and a good engineering team in Streamroot, and overlay it on top of their network, supported by CenturyLink’s sales and marketing team.

How CDN Switching Blind Spots Lead To Rebuffering

Reducing video rebuffering can be difficult. One solution that many people are talking about these days is moving to a multi-CDN architecture, a topic I’ve written a lot about. But will going multi-CDN magically reduce your rebuffering and drive away all of your streaming ills? The answer, of course, is complicated. Going multi-CDN can provide several benefits for customers, such as better geographic coverage and, possibly, better economics. Adding live switching logic between the CDNs goes a step further and enables load balancing and redundancy in case of problems. But what are the problems that customers are likely to encounter? Let’s examine some common CDN problems and their impact.

Catastrophe and Chaos
Once in a blue moon, a CDN will experience a major outage that affects a large geographic area. These outages are so extreme that they shut down a large portion of the Internet for a non-trivial amount of time. Recent examples of such outages:

Detectability: Very easy to detect as any metric that you care to measure will explode. Your alerts will fire or, if you don’t have alerts, you’ll get messages and phone calls from users.
Solution: A good CDN switching engine will attempt to re-route users to a working CDN within the limits of the outage.

PoP Flop
Occasionally, a CDN might experience a local issue in one of its PoPs (Points of Presence), which means all of the users that are routed through that specific PoP will have problems fetching video segments, and will most likely experience rebuffering.

Detectability: Medium or hard depending on the percentage of users that are affected. A large portion of the traffic would skew the metrics enough to create an anomaly, while a small portion might get swallowed within the geographic granularity of the monitoring system.

Solution: A good CDN will re-route the impacted traffic to a different PoP within its network which will cover for the faulty PoP with best-effort performance. A good CDN switching engine will eliminate the faulty CDN altogether from that region and just use a non-faulty one.

Smaller scale outages are so hard to detect that you might never know they ever happened. Have your users experienced such an outage? The answer is most likely yes, since all CDNs see dozens of them as part of their daily monitoring efforts. In the industry, we call these events “blind spots”.

The Blind Spots of CDN Switching
There are 3 main blind spots that server-side CDN switching engines do not address very well:

  • #1. The DNS Propagation Problem. “We already know there’s a problem, but we have to wait at least 5-10 minutes for DNS to propagate.” A common CDN switching implementation is based on DNS resolution. The DNS resolver incorporates switching logic that responds with the best CDN at that given moment. If one of the CDNs in the portfolio experiences an outage or degradation, the DNS resolver will start responding with a different (healthy) CDN for the affected region. The blind spot of DNS is its propagation time. From the moment the switching logic decides to change CDNs, it might take several minutes (or longer) until the majority of traffic is actually transitioned. Moreover, while most ISPs will obey the TTL (DNS response lifetime) defined by the DNS resolver, some will not, causing the faulty CDN to remain the assigned CDN for the users behind that ISP. Rebuffering on existing sessions is inevitable, at least until the DNS TTL expires in the user’s browser.
  • #2. The Data Problem. “We select CDNs based on a synthetic test file, but real video delivery is much slower.” Any switching solution must implement a data feed that reflects the performance of the CDNs in different regions and from different ISPs. A common approach for gathering such data is to use test objects that are stored on all of the CDNs in the portfolio. The test objects are downloaded to users’ browsers, which then report back the performance that was observed. Oftentimes, for various reasons, the test objects don’t represent the actual performance of the video resources. For example, imagine that the connection between the origin and the edge server is congested – the test object will not be impacted since it’s already warmed in the cache of the edge server and does not need to use the congested middle mile connection to the origin. This performance gap will cause a CDN to be erroneously selected as the best one even though the reality differs. It’s also possible that the test objects and the actual video resources do not share the same CDN bucket configuration. If the video resources bucket is misconfigured, some users might get unoptimized or even faulty responses, which will never be detected because the test objects bucket functions properly. This kind of disconnect between synthetic performance measurements and actual delivery performance often produces degraded performance that goes undetected for a very long time.
  • #3. The Granularity Problem. “We select CDNs based on overall performance in each region, but this stream is performing poorly for a subset of the users.” A typical CDN Switching flow might be:
    • measure CDN performance across different regions
    • report results to the server
    • server chooses the “best” CDN
    • users are assigned to the best CDN for their region

    Unfortunately, not all regions have fresh performance data all the time and so a fallback logic is usually applied. When there isn’t enough data in a specific region, data from its greater containing region will be used instead. It’s possible that a region with a small number of users gets swallowed up by a larger fallback region, in which case an outage might not be detected at all because the affected users comprise a small portion of total traffic that is not enough to “move the needle”.

    This is a data granularity problem. A broadcaster might have 100k users that are spread across 15 countries, 1,000 unique regions, 5,000 ISPs and a host of other parameters. Taken together, these parameters split the user base into millions of tiny segments, none of which will have enough data to support meaningful switching decisions, not to mention the load this would create on the switching system. For this reason, server-side switching is inherently limited to a coarser grouping that is technically and mathematically viable. This reality creates a blind spot when it comes to smaller regions that might get hit by a local, undetectable outage.
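To make the fallback behavior concrete, here is a minimal Python sketch of region-level selection that falls back to the greater containing region when a region has too little data. Everything in it is an assumption for illustration: the `pick_cdn` function, the `min_samples` threshold, the region and CDN names, and the TTFB numbers are hypothetical and don't describe any particular vendor's switching engine.

```python
def mean(values):
    return sum(values) / len(values)

def pick_cdn(samples, region, parent_of, min_samples=50):
    """Pick the CDN with the lowest mean TTFB for a region.

    samples:   region -> cdn -> list of TTFB measurements in ms
    parent_of: region -> greater containing region (None at the top)
    Returns (chosen_cdn, region_whose_data_was_used).
    """
    current = region
    while current is not None:
        per_cdn = samples.get(current, {})
        total = sum(len(values) for values in per_cdn.values())
        if total >= min_samples:
            best = min(per_cdn, key=lambda cdn: mean(per_cdn[cdn]))
            return best, current
        # Not enough data here: fall back to the greater containing region.
        current = parent_of.get(current)
    raise ValueError("no region has enough data")

# Synthetic data: cdn_a is failing in "small-town" but healthy elsewhere.
parent_of = {"small-town": "state", "state": None}
samples = {
    "small-town": {"cdn_a": [6700] * 10, "cdn_b": [950] * 10},
    # The state-level data includes the small town's samples.
    "state": {"cdn_a": [850] * 1000 + [6700] * 10, "cdn_b": [950] * 1010},
}

server_side_choice = pick_cdn(samples, "small-town", parent_of)
fine_grained_choice = pick_cdn(samples, "small-town", parent_of, min_samples=10)
```

With these numbers, the small town only has 20 samples, so the default threshold pushes the decision up to the state level, where the failing CDN still looks best on average. Lowering the threshold would catch this one outage, but as described above, it isn't viable across millions of tiny segments.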

Real World Example of The Granularity Problem
In the last week of August, an outage occurred in the U.S. that demonstrates the granularity problem. An outage at a CDN I won’t name caused a significant drop in request performance that, in turn, led to rebuffering. Thanks to Peer5 for sharing the screen grabs (below) from their monitoring tool showing that at 7:20 AM, an increase in Time-To-First-Byte (TTFB) was observed from an average of 850ms to a peak of 6700ms. When comparing the 95th percentile TTFB of the affected area to its greater containing region, it’s clear that the affected area didn’t constitute enough data to move the overall metric.

95th Percentile TTFB – Affected area vs Greater Region

The greater containing region (blue) doesn’t show any anomalies throughout the outage. (less is better)

Rebuffer Time as %

Rebuffering spikes render the playback unwatchable. (less is better)

Allocated CDNs Over Time

While this chart might seem dull, it illustrates the fact that throughout the entirety of the outage, no CDN switching took place for the given region.
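The dilution effect behind these charts can be reproduced with a few lines of Python. This is only a sketch with synthetic numbers: the 850ms and 6700ms values come from the figures quoted above, but the sample counts (an affected area making up 1% of the region's traffic) and the nearest-rank `percentile` helper are my own assumptions, not data from the monitoring tool.

```python
def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]

# Synthetic TTFB samples in milliseconds.
healthy_users = [850] * 9_900   # users outside the affected area
affected_area = [6700] * 100    # users inside the affected area (1% of traffic)

greater_region = healthy_users + affected_area

p95_affected = percentile(affected_area, 95)
p95_greater = percentile(greater_region, 95)
```

Because the affected users sit above the 99th percentile of the greater region, the region's 95th percentile stays at the healthy value and no switching threshold is ever crossed.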

Enter: Per User CDN Switching
Video playback is a very fragile thing. A user might have just a couple of seconds of content buffered ahead and any slowdown in fetching segments can easily consume that buffer and freeze the playback. For this reason, vendors in the market are coming up with ways to fix the problem. For instance, Peer5 created a client-side switching feature which constantly monitors the playback experience for each individual user and is able to react to poorly performing CDNs within a split second (literally, milliseconds) and prevent rebuffering from ever happening. This means that even an outage that only affects one user will be accounted for. The charts below show the performance during the outage described above with and without a client-side switching feature.
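For readers who want a feel for what per-request switching means, here is a simplified Python sketch. To be clear, this is not Peer5's implementation, which I haven't seen. The `ClientSideSwitcher` class, the `ttfb_budget_ms` setting and the `simulated_fetch` function are all hypothetical, and a real player would abort a slow request mid-flight and periodically re-test a demoted CDN.

```python
class ClientSideSwitcher:
    """Per-user CDN failover, decided on every segment request."""

    def __init__(self, cdns, ttfb_budget_ms=1000):
        # Preference order, e.g. the server-side assignment first.
        self.cdns = list(cdns)
        self.ttfb_budget_ms = ttfb_budget_ms

    def fetch_segment(self, segment, fetch):
        """Try CDNs in preference order until one answers within budget.

        fetch(cdn, segment, timeout_ms) must return the observed TTFB in ms
        or raise TimeoutError when the budget is exceeded.
        """
        for cdn in list(self.cdns):
            try:
                ttfb_ms = fetch(cdn, segment, self.ttfb_budget_ms)
            except TimeoutError:
                # Demote the slow CDN so later segments try it last.
                self.cdns.remove(cdn)
                self.cdns.append(cdn)
                continue
            return cdn, ttfb_ms
        raise RuntimeError("all CDNs failed for " + segment)

def simulated_fetch(cdn, segment, timeout_ms):
    """Stand-in for a real HTTP request: cdn_a is degraded for this user."""
    ttfb_ms = {"cdn_a": 6700, "cdn_b": 900}[cdn]
    if ttfb_ms > timeout_ms:
        raise TimeoutError(f"{cdn} exceeded {timeout_ms} ms for {segment}")
    return ttfb_ms

switcher = ClientSideSwitcher(["cdn_a", "cdn_b"])
first = switcher.fetch_segment("seg-001.ts", simulated_fetch)
second = switcher.fetch_segment("seg-002.ts", simulated_fetch)
```

The first segment fails over from the degraded CDN to the healthy one, and the second segment goes straight to the healthy CDN, with no DNS change or regional data required.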

95th Percentile TTFB – Affected area vs Greater Region

The TTFB of the client-side switching group (green) was affected as well but much less than the other group. (less is better)

Rebuffer Time as %

The client-side switching group (green) experiences almost no interruption in playback. (less is better)

As seen in the graph above, users that relied solely on server-side switching (red line) were impacted significantly, compared to users with client-side switching. Server-side CDN switching was not granular enough to detect the local outage and the assigned CDN for that region remained the same even though some users experienced terrible performance degradation. The client-side switching, with its per-user granularity, was able to change the mixture of CDNs within the region and avoid the issue in real-time. The rebuffering was reduced from 11.2% to 0.2% for client-side switching enabled users, and the overall region rebuffering was reduced by 70%, from 1% to 0.3%.
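As a quick sanity check on those percentages, the relative reduction is simply the drop divided by the starting value. The `relative_reduction` helper below is just illustrative arithmetic on the figures quoted above.

```python
def relative_reduction(before_pct, after_pct):
    """Relative drop, in percent, between two rebuffering rates."""
    return (before_pct - after_pct) / before_pct * 100

# Overall region: 1% rebuffering down to 0.3%.
region_reduction = relative_reduction(1.0, 0.3)

# Client-side switching enabled users: 11.2% down to 0.2%.
enabled_reduction = relative_reduction(11.2, 0.2)
```

The region-wide figure works out to the 70% quoted above, and the same calculation puts the reduction for the client-side switching group at roughly 98%.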

Summary
When CDNs experience outages, users will encounter rebuffering. There are multiple types of outages: some will fly under the radar completely undetected, while others will make you notice them immediately. Different layers of redundancy and different levels of granularity try to address the various outages an online delivery pipeline might experience. A combination of several such redundancy tools is likely to achieve the best UX. Employing server-side switching alongside client-side switching allows customers to:

  • Reduce rebuffering by monitoring video playback constantly for all users
  • Allow existing sessions to respond to outages very quickly by switching CDNs on a per-request level
  • Improve bitrate and quality by increasing the granularity of CDN selection to a per-user level

There are lots of ways to solve the video buffering problem depending on what type of video you are delivering (live vs. VOD), the platform or devices you are delivering it to and the user experience you are looking to achieve. What’s your take on the best ways to reduce video buffering? Feel free to leave your thoughts in the comments section.

Streaming Summit Program and First Set of Speakers Announced: Hear from CBS, Quibi, NBC, YouTube, Twitter, Amazon, WarnerMedia/HBO

I’m pleased to announce the first set of speakers for my Streaming Summit, at NAB Show New York, taking place Oct 16-17. The program schedule has also been added to the website and when completed, we’ll have over 100 speakers, across two days of the show. Newly added speakers include executives from CBS Interactive, Quibi, NBC Sports, YouTube, Twitter, Amazon Fire TV, WarnerMedia/HBO – with lots more on the way! Register before Sept 12th using code “early” for a discount on your ticket. You can see the entire agenda on the schedule page. #streamingsummit #nabshowny

Podcast: Disney+, Quibi, HBO Max. What Happens When Content Owners Go Direct

Thanks to Beamr, Mark Donnigan and Dror Gill for having me on their “Video Insiders” podcast to talk about Disney+, Quibi, HBO Max, Hulu, ViacomCBS, and what the forthcoming D2C launches mean for incumbents, including Netflix and Pay TV operators. Hear my thoughts on content aggregation and ideas for measuring success in OTT, along with the technical plans and platform choices being made by these developing services. This is a frank and honest, real-time and real-world conversation about the groundswell of direct-to-consumer OTT services which will be unleashed over the next few quarters.

You can listen to the podcast here: https://thevideoinsiders.simplecast.com/episodes/episode24