Scaling Video Delivery: How Netflix and Hulu Leveraged a Similar Architecture to Serve Millions of Users Worldwide

Netflix and Hulu built their video delivery platforms on cloud-based infrastructure and third-party CDNs, which lets them serve millions of users worldwide while maintaining a high-quality user experience. A 2014 paper titled “Measurement Study of Netflix, Hulu, and a Tale of Three CDNs” examines the similarities between the architectures Netflix and Hulu used to deliver video at scale. Both services used a static CDN selection strategy to decide which CDN a client should use. The authors propose a more adaptive CDN selection strategy that could provide a better quality of service to users. However, they note that business constraints can override the goal of providing the most technically optimal experience.

This underscores an important principle when designing a system: business requirements often supersede providing an “optimal” technical experience. While a technically optimal solution may be desirable, business considerations such as cost, time, and practicality tend to take precedence in decision-making and design. A technically optimal solution may be too costly or complex at scale, so the system architect must weigh the possible solutions against business requirements to make the best decision for the overall customer experience.

System Design Components

  • Content Delivery Networks (CDN)

  • Manifest File

  • DNS

  • Streaming protocol: adaptive streaming over HTTP (for example DASH) or RTMP (Real-Time Messaging Protocol)

  • Frontend server

  • Backend server

  • Client with video playback

  • Logging

System Design Principles

Leverage third-party cloud providers and CDNs to dynamically scale to millions of users and serve content as close to the user as possible.
Hulu and Netflix used the same architecture: third-party CDNs and cloud providers hosted critical infrastructure so that capacity could scale up and down with changing bandwidth and user load. This is now standard practice in the industry, but at the time of the study Netflix and Hulu were pioneers in migrating to the cloud. Netflix's migration to AWS was a gradual process that began in 2008 and was completed in 2015. Both Netflix and Hulu had their own data centers, but the paper shows that most of their critical infrastructure was offloaded to third parties: AWS in Netflix's case and Akamai in Hulu's. They also used the same three CDN providers to host their content.

Business constraints can and most likely will supersede the priority to provide customers with the most “technically” optimal experience.
The core motivation for this paper was to devise a better CDN selection strategy that both Netflix and Hulu could use to provide a “better” quality of service when streaming video content from CDNs. The paper’s authors propose that the client perform a more responsive CDN selection, based on changing network conditions and the quality of service each CDN provides.

At the time, both Netflix and Hulu assigned CDNs statically on the backend and allowed clients to fall back to other CDNs. However, the clients did not use any sophisticated logic to switch between CDNs when bandwidth conditions changed. Instead, a client would stick to the same CDN until bandwidth dropped to almost zero and only then switch to another provider.
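The sticky fallback behavior described above can be sketched in a few lines. This is an illustration only, not either company's client code: the class name, the CDN ordering, and the `MIN_USABLE_KBPS` threshold are assumptions, since the text only says that clients switched when bandwidth was close to zero.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical threshold: the text only says clients switched when
# bandwidth was close to zero, so this value is an assumption.
MIN_USABLE_KBPS = 100.0


@dataclass
class StickyCdnSelector:
    """Stays on the current CDN until throughput collapses."""

    ranked_cdns: List[str]  # static order handed out by the backend
    current: int = 0

    def active_cdn(self) -> str:
        return self.ranked_cdns[self.current]

    def on_bandwidth_sample(self, kbps: float) -> str:
        # Only a near-zero sample moves the client to the next CDN in the
        # static ranking; a merely degraded connection does not.
        has_fallback = self.current < len(self.ranked_cdns) - 1
        if kbps < MIN_USABLE_KBPS and has_fallback:
            self.current += 1
        return self.active_cdn()


selector = StickyCdnSelector(["level3", "akamai", "limelight"])
print(selector.on_bandwidth_sample(3000.0))  # healthy: stays on the first CDN
print(selector.on_bandwidth_sample(400.0))   # degraded, but still no switch
print(selector.on_bandwidth_sample(20.0))    # near zero: falls back
```

Note how the second sample, a large drop in throughput, does not trigger a switch. That gap is what the authors' proposal tries to close.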

The authors discovered that Hulu’s CDN selection followed a fixed distribution in which Level3 was preferred 47% of the time, independent of real-time network conditions. Netflix deployed a static user-based CDN selection strategy that did not change over a period of a few days. The authors note that these selection strategies are most likely based on pricing and business arrangements. For example, Hulu could have had a pricing agreement or quota that favored Level3 over the other CDNs.
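The two static strategies can be contrasted with a small sketch. Only the 47% share for Level3 comes from the text above; the split of the remaining share, the hashing scheme, and the function names are assumptions made for illustration, not the services' actual logic.

```python
import hashlib
import random
from collections import Counter

# Only the 47% share for Level3 comes from the text; the even split of the
# remaining 53% between the other two CDNs is a made-up placeholder.
HULU_STYLE_WEIGHTS = {"level3": 0.47, "akamai": 0.265, "limelight": 0.265}
CDNS = ["akamai", "limelight", "level3"]


def pick_cdn_per_video(rng: random.Random) -> str:
    """Draws a preferred CDN for one playback from fixed weights."""
    names = list(HULU_STYLE_WEIGHTS)
    weights = list(HULU_STYLE_WEIGHTS.values())
    return rng.choices(names, weights=weights, k=1)[0]


def pick_cdn_per_user(user_id: str) -> str:
    """Maps a user to one CDN deterministically (hypothetical scheme)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return CDNS[int.from_bytes(digest[:4], "big") % len(CDNS)]


rng = random.Random(42)
counts = Counter(pick_cdn_per_video(rng) for _ in range(10_000))
print(counts)  # the level3 share should land near 47%
print(pick_cdn_per_user("user-1") == pick_cdn_per_user("user-1"))  # True
```

Neither function looks at network measurements, which is the point: the preference is fixed ahead of time, per video in the first case and per user in the second.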

Static assignment on the backend is better than dynamic, complex frontend logic.
The authors propose a better CDN selection strategy in which clients would either pick a CDN based on measured network conditions or use all three CDNs at the same time to download three chunks in parallel. They also define the problem as a formula that can take business constraints as input to produce the “optimal” CDN assignment. Adding this logic to every client that Netflix and Hulu support creates a lot of complexity on the frontend that cannot be changed or modified easily. Statically assigning users to particular CDNs on the backend, whether based on business arrangements or to distribute load more evenly across all CDNs, is more straightforward to reason about and to modify as conditions change.
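The parallel-download idea can be illustrated with a short sketch, assuming a simple round-robin split of chunks across the three CDNs. The function names and the `fake_fetch` stand-in are hypothetical, no network request is made, and the paper's actual scheduling may differ.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

CDNS = ["akamai", "limelight", "level3"]


def plan_chunks(num_chunks: int, cdns: List[str]) -> Dict[int, str]:
    """Assigns chunk i to CDN i mod len(cdns) (round-robin)."""
    return {i: cdns[i % len(cdns)] for i in range(num_chunks)}


def download_all(
    plan: Dict[int, str], fetch: Callable[[str, int], bytes]
) -> List[bytes]:
    """Fetches chunks concurrently and returns them in playback order."""
    workers = max(1, len(set(plan.values())))  # up to one in-flight chunk per CDN
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {i: pool.submit(fetch, cdn, i) for i, cdn in plan.items()}
        return [futures[i].result() for i in sorted(futures)]


def fake_fetch(cdn: str, index: int) -> bytes:
    # Stand-in for an HTTP request to a CDN; returns a labeled payload.
    return f"{cdn}:chunk-{index}".encode()


plan = plan_chunks(6, CDNS)
for chunk in download_all(plan, fake_fetch):
    print(chunk.decode())
```

Even this toy version needs ordering, concurrency, and error handling on the client, which hints at why shipping such logic to every supported device is costly.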

When you need to support hundreds of clients with different hardware and network capabilities, push the complex logic to the backend as much as possible. Keep the frontend clients as simple as possible, limited to rendering the UI or presentation layer, and give the backend the ability to update them dynamically with new configurations or capabilities. Although Netflix and Hulu could have implemented an optimal CDN selection algorithm on the frontend, doing so adds a lot of complexity and makes the clients difficult to update when the algorithm changes or a bug is discovered.
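A minimal sketch of this split, assuming a JSON configuration pushed from the backend: the `CONTRACT_PRIORITY` table, the field names, and both functions are hypothetical and only illustrate where the decision logic lives.

```python
import json
from typing import Dict, List, Set

# Hypothetical business table: a lower number means a more favorable contract.
CONTRACT_PRIORITY: Dict[str, int] = {"level3": 1, "akamai": 2, "limelight": 3}


def build_client_config(disabled: Set[str]) -> str:
    """Backend side: rank CDNs by contract priority, skipping disabled ones."""
    ranking = sorted(
        (cdn for cdn in CONTRACT_PRIORITY if cdn not in disabled),
        key=CONTRACT_PRIORITY.__getitem__,
    )
    return json.dumps({"config_version": 2, "cdn_ranking": ranking})


def client_cdn_order(config_json: str) -> List[str]:
    """Client side: no selection logic, it only reads the ranking it was given."""
    return json.loads(config_json)["cdn_ranking"]


print(client_cdn_order(build_client_config(disabled=set())))
print(client_cdn_order(build_client_config(disabled={"level3"})))
```

Changing the policy, for example after a contract renegotiation or a CDN outage, means editing the backend table; no client release is needed.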

System Architecture

  • www.hulu.com

    • Client gets HTML pages for the video from Hulu’s frontend web server hosted by Akamai

  • s.hulu.com

    • Client then contacts s.hulu.com to get a manifest file describing the server location, available bit-rates, etc.

  • t.hulu.com

    • Client sends log reports to t.hulu.com hosted by Hulu

  • Three CDNs

    • Both Netflix and Hulu leveraged the same 3 CDN providers to host their video content: Akamai, Limelight, and Level3.

User Flow

  • User visits www.hulu.com and receives the HTML page from the frontend server

  • Client requests a manifest file from s.hulu.com to get metadata about the video content

  • Client is assigned a CDN based on the manifest file and uses DNS to select a server IP address

  • Hulu uses encrypted RTMP (Real-Time Messaging Protocol) to deliver movies to desktop browsers. For mobile devices, HuluPlus uses adaptive streaming over HTTP. Hulu advertisements are single .FLV files, each downloaded in a single HTTP transaction.

  • CDN selection changes on every video.

  • Hulu player sends reports to backend server t.hulu.com with detailed performance information such as video bit-rate, current video playback position, total amount of memory the client is using, the current bandwidth at the client machine, number of buffer under-runs, and number of dropped frames.
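The data exchanged in this flow can be modeled roughly as follows. The field names, units, and JSON encoding are assumptions for illustration; the text lists what the manifest and the reports contain but not their wire format. The bit-rate helper shows one simple way a client could use the manifest's bit-rate list, not Hulu's actual adaptation logic.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Manifest:
    """What the client learns from s.hulu.com (hypothetical field names)."""

    preferred_cdn: str
    bitrates_kbps: List[int]


@dataclass
class PlaybackReport:
    """Metrics the player reports to t.hulu.com (hypothetical field names)."""

    video_bitrate_kbps: int
    playback_position_s: float
    client_memory_mb: float
    client_bandwidth_kbps: float
    buffer_underruns: int
    dropped_frames: int


def highest_sustainable_bitrate(manifest: Manifest, bandwidth_kbps: float) -> int:
    """Picks the highest listed bit-rate that fits the measured bandwidth."""
    fitting = [b for b in manifest.bitrates_kbps if b <= bandwidth_kbps]
    return max(fitting) if fitting else min(manifest.bitrates_kbps)


manifest = Manifest(preferred_cdn="level3", bitrates_kbps=[480, 700, 1000])
bitrate = highest_sustainable_bitrate(manifest, bandwidth_kbps=850.0)
report = PlaybackReport(bitrate, 12.5, 210.0, 850.0, 0, 3)
print(json.dumps(asdict(report)))
```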

Key Terms

CDN

A Content Delivery Network (CDN) is a system of distributed servers that deliver content to end-users based on their geographic location, helping to reduce latency and improve the user experience.

RTMP

RTMP (Real-Time Messaging Protocol) is a protocol for streaming audio, video, and data over the internet between a client and a server. It was originally developed by Macromedia, which Adobe later acquired, for streaming media between Flash Player and a server, and has since been adopted by other streaming platforms.

RTMP operates on top of TCP and uses port 1935 by default to communicate between the client and server. It is a low-latency protocol that is often used for live video streaming and interactive applications that require real-time communication.
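As a small illustration of the port convention, the sketch below resolves the TCP endpoint for an RTMP URL and falls back to 1935 when the URL gives no port. The `rtmp_endpoint` helper and the example host are hypothetical; no connection is opened.

```python
from typing import Tuple
from urllib.parse import urlsplit

RTMP_DEFAULT_PORT = 1935


def rtmp_endpoint(url: str) -> Tuple[str, int]:
    """Returns the (host, port) a client would open a TCP connection to."""
    parts = urlsplit(url)
    if parts.scheme != "rtmp" or not parts.hostname:
        raise ValueError(f"not an rtmp:// URL: {url!r}")
    # urlsplit reports port as None when the URL does not specify one.
    return parts.hostname, parts.port or RTMP_DEFAULT_PORT


print(rtmp_endpoint("rtmp://media.example.com/live/stream"))
print(rtmp_endpoint("rtmp://media.example.com:1936/live/stream"))
```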

RTMP has been largely replaced by other protocols like HLS and DASH for video streaming, but it is still used in some applications, especially for live streaming events.