What happens when you watch stuff on Netflix?

Netflix

While watching a movie on Netflix, have you ever wondered how your favorite video streaming app operates? How does it earn money? How does it stream video without interruption?

This article focuses on this very topic.

Netflix is one of the largest on-demand video streaming providers. To start, let's look at some Netflix statistics at the time of writing.
  • Netflix's net revenue is $20.16 billion.
  • Netflix has more than 182 million subscribers, including 69 million in the US.
  • Netflix operates in more than 200 countries.
  • Netflix plays more than 1 billion hours of video each week.
  • Netflix hosts almost 35,000 hours of content.
  • Netflix accounts for over 37% of peak internet traffic in the United States.
  • Netflix ranks #197 in Fortune 500.
Source

What we can conclude from the above

Netflix is a giant in video streaming. It has a lot of money. It has a huge subscriber base. It has a rich library of video content.

Netflix earns its revenue from subscriptions. Subscribers pay monthly, but they can cancel at any time, so Netflix has to keep working to retain them. So, when you think of “Netflix and chill”, it is you who chills, not Netflix.

Let’s see what Netflix does behind the scenes.

Netflix has 3 main components:
  • Backend
  • Client
  • Content Delivery Network (CDN)
Backend: This runs entirely on AWS. It includes processing video into different formats and resolutions, and it handles requests from client devices such as phones, smart TVs, and laptops.

Client: This is the app installed on a user's device to watch on-demand videos.

Content Delivery Network: Netflix uses Open Connect (OC), a system of distributed servers, for global content delivery. Open Connect Appliances (OCAs) are spread all over the globe and hold most of Netflix's content, already pre-processed and ready to deliver.

I will dig deeper into this later in the article.

How Netflix recommends

Netflix collects a huge amount of user data: which videos were watched, when and where they were watched, the preferred language, how many times each video has been watched, and so on. By processing this big data and analyzing user preferences, Netflix recommends movies, TV series, and other videos.

Netflix is a data-driven company. It also uses the collected data to improve the user experience. It uses AI/machine learning to learn which genres you like, which actors you follow, and so on. Recommendations are also based on ratings and on what other users have liked.
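
To make the "what other users have liked" idea concrete, here is a toy sketch. It is not Netflix's actual algorithm; the function name, the user names, and the scoring rule (weight each unseen title by how much the other user's history overlaps with yours) are all assumptions made for illustration.

```python
from collections import Counter

def recommend(watch_history, user, top_n=2):
    """Suggest unseen titles liked by users with overlapping taste (toy example)."""
    seen = watch_history[user]
    scores = Counter()
    for other, titles in watch_history.items():
        if other == user:
            continue
        overlap = len(seen & titles)  # how similar is this user to us?
        if overlap == 0:
            continue
        for title in titles - seen:   # only score titles we have not watched
            scores[title] += overlap
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked][:top_n]

# Hypothetical viewing data.
history = {
    "alice": {"Stranger Things", "Dark"},
    "bob": {"Stranger Things", "Dark", "Black Mirror"},
    "carol": {"Dark", "The Crown"},
    "dave": {"Narcos"},
}
print(recommend(history, "alice"))
```

Here bob shares two titles with alice and carol shares one, so bob's unseen title ranks above carol's.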

Onboarding of new content

The video you watch is not the exact file that Netflix receives from production houses and studios. Netflix converts the original file into different formats to support the various devices and platforms it runs on. This process is known as transcoding or encoding.


The original movie file that Netflix receives is many terabytes in size, which makes it impractical to deliver as is. To handle this, Netflix divides the file into many small chunks and processes them in parallel.

It transcodes them into different formats (e.g. mp4, avi) and into different quality levels (e.g. 4K, 1080p, 720p) so that it can deliver uninterrupted service even on slow network connections. This process results in many files, which can be roughly estimated as
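
The chunk-and-process-in-parallel idea can be sketched roughly as follows. This is only an illustration: the chunk boundaries are in seconds, `encode_chunk` is a hypothetical placeholder that does no real encoding, and the thread pool stands in for whatever fleet of workers Netflix actually uses.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(total_seconds, chunk_seconds):
    """Return (start, end) pairs, in seconds, that cover the whole video."""
    return [
        (start, min(start + chunk_seconds, total_seconds))
        for start in range(0, total_seconds, chunk_seconds)
    ]

def encode_chunk(chunk):
    """Hypothetical placeholder for the real (expensive) encoding step."""
    start, end = chunk
    return f"encoded[{start}-{end}]"

def encode_video(total_seconds, chunk_seconds, workers=4):
    """Encode all chunks in parallel and return the results in playback order."""
    chunks = split_into_chunks(total_seconds, chunk_seconds)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input chunks.
        return list(pool.map(encode_chunk, chunks))

print(encode_video(total_seconds=10, chunk_seconds=4))
```

Because each chunk is independent, adding more workers shortens the total processing time without changing the result.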
Total number of files = number of formats * number of resolutions 

For the show The Crown, Netflix stores around 1200 files for this very purpose.
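
As a tiny worked example of the estimate above, the snippet below enumerates one output file per (format, resolution) pair. The file naming scheme and the specific lists are assumptions for illustration; the real count also depends on factors this rough formula ignores.

```python
from itertools import product

def list_output_files(title, formats, resolutions):
    """One output file per (format, resolution) combination."""
    return [f"{title}_{res}.{fmt}" for fmt, res in product(formats, resolutions)]

formats = ["mp4", "avi"]
resolutions = ["4K", "1080p", "720p"]

files = list_output_files("the_crown", formats, resolutions)
print(len(files))  # number of formats * number of resolutions
print(files)
```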

Did you play a movie just by seeing its catchy image?

Yes, this is what Netflix does best.

Say you want to watch a series on Netflix. You might be lured by a catchy image for some title and play it. These images are called header images.

The header image is intended to intrigue you and draw you into choosing a video. The thinking behind this is that the more compelling the header image, the more likely you are to watch the video. This is one of the strategies Netflix follows to retain its subscribers.

The header image is mostly not the same for everyone. Netflix has many candidate images for a video, and it shows them to subscribers at random. It counts how many times the video was played while each header image was shown. Based on these counts, one image is selected and made the permanent header image.

Stranger Things

For example, Stranger Things had these header images. The winner was the center image, which got 1000 plays. It was then made the permanent header image.

As different people have different tastes, Netflix doesn’t stop there. It plans to show different header images (say, different stills from a movie) to different users according to their interests, selecting the image most likely to appeal to each user. That is how Netflix learns from the user data it captures.
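
The counting step described above can be sketched in a few lines. This is a simplification under my own assumptions: each play is logged as the label of the header image that was on screen, and the image with the most plays wins. The labels are made up.

```python
from collections import Counter

def pick_winner(play_events):
    """Return (image, plays) for the header image that led to the most plays."""
    plays = Counter(play_events)
    return plays.most_common(1)[0]

# Hypothetical log: which header image was shown each time the video was played.
events = ["left", "center", "center", "right", "center"]
print(pick_winner(events))
```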

How Netflix operates

By now we know Netflix has a huge customer base. But have you ever wondered how Netflix is able to cater to so many requests? Yes, you guessed it correctly. It uses the cloud. Netflix operates in two different clouds: AWS and Open Connect.

Before moving to a cloud-based approach, Netflix built a couple of datacentres of its own. To cater to such a huge number of requests, it chose a vertical scaling strategy, which means it built big programs that ran on huge machines. This approach is called building a monolith. It had its shortcomings, because a single machine was doing every task. This created a single point of failure, and Netflix experienced a service outage.

To overcome this, Netflix chose AWS. AWS provided highly reliable databases, storage, and redundant datacentres. This gave Netflix the kind of service it had been looking for. It no longer had to worry about a single point of failure, power outages, or a lack of computing power. AWS did the heavy lifting, so Netflix could focus on its forte: delivering streaming video.

AWS proved to be a cheap and reliable option

AWS is involved in everything other than delivering a video. That means scalable computing, scalable storage, business logic, scalable distributed databases, big data processing and analytics, recommendations, transcoding, and hundreds of other functions.

Netflix operates out of three AWS regions: one in Northern Virginia, one in Portland, Oregon, and one in Dublin, Ireland. Within each region, Netflix operates in three different availability zones.

The advantage of having three regions is that any region can fail, and the other regions will step in to handle all the members in the failed region. When a region fails, Netflix calls this evacuating a region.

Let’s use an example. Say you’re watching a new Money Heist episode in London, England. Because Dublin is the closest region to London, chances are your Netflix device is connected to the Dublin region.
What happens if the entire Dublin region fails? Does that mean Netflix should stop working for you? Of course not!

After detecting the failure, Netflix redirects you to Virginia. Your device now talks to the Virginia region instead of Dublin. You might not even notice there was a failure.
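
Here is a rough sketch of the failover decision in the example above. The preference order and the function name are hypothetical; I am only assuming that each client has an ordered list of regions and is sent to the first one that is healthy.

```python
# Hypothetical preference order for a client in London: closest region first.
LONDON_PREFERENCE = ["dublin", "virginia", "oregon"]

def choose_region(preferred_regions, healthy_regions):
    """Return the first preferred region that is currently healthy."""
    for region in preferred_regions:
        if region in healthy_regions:
            return region
    raise RuntimeError("no healthy region available")

# Normal day: every region is up, so London talks to Dublin.
print(choose_region(LONDON_PREFERENCE, {"dublin", "virginia", "oregon"}))

# Dublin fails: the same client is redirected to Virginia.
print(choose_region(LONDON_PREFERENCE, {"virginia", "oregon"}))
```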

Scalability

Netflix employs scalable computing using EC2 (Elastic Compute Cloud), scalable storage using S3 (Simple Storage Service), and scalable distributed databases such as DynamoDB and Cassandra.
Netflix can serve a huge number of user requests by adding more computing power in the form of EC2 instances. It pays only for what it uses. When request volume is low, it reduces the computing power.
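
The scale-up and scale-down behaviour boils down to simple arithmetic. The sketch below is not how AWS auto scaling is configured; the function and the numbers (requests per second, capacity per instance) are assumptions chosen to show the idea.

```python
import math

def instances_needed(requests_per_second, capacity_per_instance, min_instances=1):
    """How many instances are needed to serve the current load."""
    needed = math.ceil(requests_per_second / capacity_per_instance)
    # Never scale below a minimum, even when there is no traffic.
    return max(min_instances, needed)

# Hypothetical numbers: each instance handles 100 requests per second.
print(instances_needed(950, 100))  # peak traffic
print(instances_needed(120, 100))  # quiet hours
```

Running fewer instances during quiet hours is what makes the pay-for-what-you-use model pay off.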

Similarly, it uses S3 to store all of its content. AWS does the heavy lifting of storage, and Netflix just keeps adding new video content to it.

Content Delivery Network (CDN)

Caching

Caching is the process of storing video content on devices distributed all over the globe. The video cached on these devices is pre-processed, validated, and ready to be delivered. These devices are called Open Connect Appliances.

The basic idea behind a CDN is caching. When a user wants to play a video, Netflix finds the nearest computer (here, an OCA) with the video on it and streams it to the device from there. If the video is not found on a particular computer, the request is routed to the next computer that has the video.
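
The "nearest computer that has the video" lookup can be sketched like this. The data layout, the OCA names, and the use of distance in kilometres as the measure of "nearest" are all assumptions for illustration.

```python
def pick_oca(ocas, video_id):
    """Return the name of the nearest OCA that holds the video, or None."""
    for oca in sorted(ocas, key=lambda o: o["distance_km"]):
        if video_id in oca["videos"]:
            return oca["name"]
    return None  # no OCA has it

# Hypothetical OCAs as seen from one user's location.
ocas = [
    {"name": "oca-far", "distance_km": 900, "videos": {"v1", "v2"}},
    {"name": "oca-near", "distance_km": 20, "videos": {"v1"}},
]

print(pick_oca(ocas, "v1"))  # the nearest OCA has it
print(pick_oca(ocas, "v2"))  # falls through to the next OCA
```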

The biggest benefits of CDN:
  • Cheap
  • Reliable
  • Fast
  • Better quality
  • More scalable
  • Fault-tolerant


Open Connect Appliances (OCA)

Netflix has its own server hardware for video storage, which it calls Open Connect Appliances. Each OCA is a fast server, highly optimized for delivering large files, with lots and lots of hard disks or flash drives for storing video.



Each OCA has a capacity of about 280 TB, enough to hold most of the Netflix content. Because OCAs are distributed across the globe, video streaming is a lot faster: the video streams from a nearby OCA rather than from a central server on some other continent.

Netflix loves expansion and wants to be able to play the same content all over the world at the same time. This is only possible when there are enough OCAs spread across the world, with enough content, to serve the requests.

For example, the release of Daredevil Season 2 in 2016 was the first time Netflix released all episodes of a show, on all devices, in all countries, at the same time.

Every night, when traffic is low, the content on each OCA is updated. An OCA asks a service in AWS which videos it should have. The service responds with a list of videos based on the predictions made by Netflix. Generally, around 90% of all user requests are served by OCAs around the globe.
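
The nightly update can be thought of as comparing two sets: what the OCA holds now and what the service says it should hold. The sketch below is my own simplification; in particular, evicting videos that are no longer on the list is an assumption, and the video ids are made up.

```python
def plan_nightly_update(current_videos, wanted_videos):
    """Work out which videos to download and which to evict."""
    to_add = sorted(wanted_videos - current_videos)
    to_remove = sorted(current_videos - wanted_videos)  # assumption: stale videos are evicted
    return to_add, to_remove

# Hypothetical state of one OCA and the list returned by the service.
current = {"v1", "v2", "v3"}
wanted = {"v2", "v3", "v4", "v5"}

print(plan_nightly_update(current, wanted))
```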


A Trick that Netflix played

Netflix placed its OCAs in the datacentres of ISPs.



Netflix uses its popularity data to predict which videos members will probably want to watch tomorrow from each OCA placed in an ISP datacentre. It then copies the predicted videos to one or more OCAs at each location.
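
A very naive version of this prediction is to assume that what was popular at a location today will be popular there tomorrow. Netflix's real prediction is certainly more sophisticated; the function, the play log format, and the `slots` parameter below are all assumptions.

```python
from collections import Counter

def predict_top_videos(play_log, slots):
    """For each location, pick the `slots` most played videos as tomorrow's guess."""
    per_location = {}
    for location, video in play_log:
        per_location.setdefault(location, Counter())[video] += 1
    return {
        location: [
            video
            # Most plays first; ties broken by video id for a stable result.
            for video, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:slots]
        ]
        for location, counts in per_location.items()
    }

# Hypothetical play log of (location, video id) pairs.
log = [("london", "a"), ("london", "b"), ("london", "a"), ("paris", "c")]
print(predict_top_videos(log, slots=1))
```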

Why ISPs

Netflix wanted its content to be delivered quickly anywhere across the globe. For this, it chose the Internet Service Providers themselves. Every ISP has its own network, which provides internet service to people and connects them to the internet.

Say we do a Google search. The request first travels through the local ISP's network, passes through some intermediate servers, and is finally routed to Google’s network, which then responds. This process takes a while. Netflix avoided this delay in video distribution by placing OCAs with the ISPs. This enables Netflix to deliver video quickly to nearby areas over the local ISP network, without the traffic ever leaving for the wider internet.

Summary

  • Netflix has 3 main components: backend, client, and CDN.
  • Netflix uses two clouds to provide its services: AWS and Open Connect.
  • Netflix recommends new movies/videos by analyzing and learning from the collected user data.
  • It adds new content by validating the source content and transcoding it into different formats.
  • Netflix uses header images to make subscribers more interested in selecting a video.
  • Netflix uses a CDN to distribute videos across the globe.
  • Netflix uses OCAs, which act as local video storage that helps provide continuous video streaming.
  • Netflix placed OCAs with ISPs to speed up the delivery of requested videos.

There is a whole lot more to cover, but to keep this post short I will rest my pen now. So, the next time you start playing a movie on Netflix, just think of what Netflix did so that you could CHILL.

One question I would like to ask before I take my leave:
Why does Amazon help Netflix with AWS when Amazon (Prime Video) is also a competitor of Netflix in online video streaming?
