Design YouTube
Quick notes
- Video streaming from the CDN to the client uses HLS
- Consider using Alexu System Design Interview/14. Design youtube/DAG Scheduler Pattern instead of Alexu System Design Interview/1. Scale from zero to millions of users/Message queue to have more control over the worker group
- Videos are served incrementally via the CDN
High-level design overview
We basically follow this concept
This is because streaming video transfers a large amount of data, so caching is very important. As a result, we can use a CDN to save cost and deliver better performance.
General flow
It's important to keep in mind that what the user uploads is in a raw format. It needs to be encoded before it can be served on the streaming website.
Therefore we need batch job processing.
- The user uploads the video metadata through our normal REST service, which stores it in a regular DB
- After the user has finished uploading the metadata, the metadata server returns an upload link: a URL for uploading the file to the Raw video blob storage. (See more: The right way to UPLOAD a file using REST > 2. Upload the metadata first and then upload the file)
- After the file is uploaded to the Blob Storage, the Transcoding server can take the data and start our transcoding queue. Since the transcoding job is heavy, we adopt a worker pattern.
  - To know when a new file has been uploaded to our Blob Storage, we can use a notification system. For example, S3 can send event notifications to SQS.
- The Transcoding server receives the status after all the workers have finished, and uploads the transcoded video to the Transcoded video blob storage.
- After that it's cached in the CDN
- The user can then watch the video from the CDN (CDN Consideration)
  - The reason why we use a CDN here is to guarantee a better connection at the edge server for the users.
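A minimal sketch of the notification step, assuming the Transcoding server reads S3 event notifications from an SQS queue. The function name `uploaded_objects` is made up for these notes, and the message shape (`Records`, `eventName`, `s3.bucket.name`, `s3.object.key`) follows the S3 event notification format as I understand it, so check it against the AWS docs before relying on it.

```python
import json
from typing import List, Tuple
from urllib.parse import unquote_plus


def uploaded_objects(message_body: str) -> List[Tuple[str, str]]:
    """Extract (bucket, key) pairs for newly created objects from one message body."""
    event = json.loads(message_body)
    found = []
    for record in event.get("Records", []):
        # Only react to uploads, not deletes or other event types.
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        s3 = record["s3"]
        # Object keys arrive URL-encoded in the notification, so decode them.
        found.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return found
```

Each returned pair could then be turned into one transcoding job for the worker group.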
Popular streaming protocols
- Alexu System Design Interview/14. Design youtube/DASH
- Apple HLS (a good pick for iOS compatibility)
- Microsoft Smooth streaming
- Adobe HTTP Dynamic Streaming (HDS)
- RTSP (low-latency streaming)
- RTMP (high quality streaming)
Design deep dive
Transcoding
- Transcoding is needed to ensure the videos are compatible between devices and browsers.
- We also want to transcode it into different resolutions:
  - Users with high network bandwidth will be offered a higher resolution
  - Users with poor network bandwidth will be offered a lower resolution
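To make the resolution choice concrete, here is a small sketch that maps a measured bandwidth to a rendition. The ladder values and the name `pick_rendition` are assumptions for illustration only; real services tune these thresholds themselves.

```python
# Hypothetical bitrate ladder: (label, minimum bandwidth in kbit/s needed to play it).
# The numbers are illustrative assumptions, not values from these notes.
RENDITIONS = [
    ("1080p", 5000),
    ("720p", 2500),
    ("480p", 1000),
    ("240p", 0),
]


def pick_rendition(bandwidth_kbps: float) -> str:
    """Return the highest rendition whose minimum bandwidth the client meets."""
    for label, min_kbps in RENDITIONS:  # ordered from highest to lowest quality
        if bandwidth_kbps >= min_kbps:
            return label
    return RENDITIONS[-1][0]  # fall back to the lowest rendition
```

For example, `pick_rendition(3000)` selects `"720p"` with this ladder.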
A video file normally has 2 parts:
- Container:
  - Stores metadata, video files, audio
  - In the format of .mov, .mp4, …
- Codecs:
  - Compression and decompression algorithms that reduce the video size while preserving its quality
  - Example: H.264, VP9, HEVC
Given the above considerations, including transcoding, let's revise our design as follows:
Pre-processing
Processing a video involves many tasks, for example:
- Generating the thumbnail
- Video transcoding
- Watermarking
- Audio encoding
- Metadata processing etc.
It's more efficient if we can process the video chunk by chunk. As a result, our pre-processor will:
- Split video into smaller chunks
- Create DAG Config files
- Do some caching of the finished parts for retry purposes.
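A rough sketch of the splitting and retry-cache steps above. It splits by byte range purely for simplicity (a real splitter would more likely cut on keyframe boundaries), and the names `split_into_chunks` and `process_with_cache` are invented for this example.

```python
from typing import Callable, Dict, List, Tuple

Chunk = Tuple[int, int]  # (start_byte, end_byte), end exclusive


def split_into_chunks(total_size: int, chunk_size: int) -> List[Chunk]:
    """Cut a file of total_size bytes into consecutive byte ranges."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [(start, min(start + chunk_size, total_size))
            for start in range(0, total_size, chunk_size)]


def process_with_cache(chunks: List[Chunk],
                       process: Callable[[Chunk], object],
                       finished: Dict[Chunk, object]) -> List[object]:
    """Process every chunk, reusing results cached by an earlier attempt."""
    for chunk in chunks:
        if chunk not in finished:  # skip work that survived a previous failure
            finished[chunk] = process(chunk)
    return [finished[chunk] for chunk in chunks]
```

If `process` raises halfway through, calling `process_with_cache` again with the same `finished` dict only redoes the chunks that had not completed.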
DAG Scheduler
Takes the list of tasks described by the DAG Config file and prioritises them.
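One way to picture the scheduler is as a topological sort that groups tasks into stages, where every task in a stage can run in parallel. The sketch below uses Python's standard `graphlib` (3.9+); the task names and dependencies in `DAG_CONFIG` are assumptions made up for illustration, not the actual config format.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical DAG config: task -> set of tasks it depends on.
DAG_CONFIG: Dict[str, Set[str]] = {
    "transcode_video": {"split"},
    "encode_audio": {"split"},
    "generate_thumbnail": {"split"},
    "add_watermark": {"transcode_video"},
    "merge": {"add_watermark", "encode_audio"},
}


def schedule_stages(dag: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into stages; tasks within a stage have no unmet dependencies."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the config is not a DAG
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make output stable
        stages.append(ready)
        sorter.done(*ready)
    return stages
```

With this config, `split` runs first, the three independent tasks run together in the second stage, then `add_watermark`, then `merge`.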
Resource Manager
Pulls a task from the Task Queue and a worker from the Worker Queue, then selects the most suitable worker for the task.
Doing this makes sure that we're not wasting any resources and that all the workers are as busy as possible.
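A minimal sketch of the matching step. What counts as the "most suited" worker is not defined in these notes, so this example assumes it means the most specialised idle worker that supports the task kind, which keeps generalist workers free for other tasks. `Task`, `Worker` and `assign` are hypothetical names.

```python
import heapq
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple


@dataclass(order=True)
class Task:
    priority: int                     # lower number = more urgent
    name: str = field(compare=False)
    kind: str = field(compare=False)  # e.g. "transcode", "thumbnail"


@dataclass
class Worker:
    name: str
    kinds: FrozenSet[str]             # task kinds this worker can run


def assign(task_queue: List[Task], worker_queue: List[Worker]) -> List[Tuple[str, str]]:
    """Pair the most urgent tasks with idle workers; unmatched tasks stay queued."""
    heapq.heapify(task_queue)
    assignments, waiting = [], []
    while task_queue and worker_queue:
        task = heapq.heappop(task_queue)
        candidates = [w for w in worker_queue if task.kind in w.kinds]
        if not candidates:
            waiting.append(task)      # no idle worker can run this kind yet
            continue
        best = min(candidates, key=lambda w: len(w.kinds))  # assumed suitability rule
        worker_queue.remove(best)
        assignments.append((task.name, best.name))
    for task in waiting:
        heapq.heappush(task_queue, task)
    return assignments
```

Tasks that no idle worker can handle are pushed back onto the queue, so a later call can pick them up once a suitable worker is free.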
When all the jobs are finished, the worker responsible for merging all the chunks uploads the result to the Transcoded video blob storage.
Optimisation
Parallel video uploading
We can consider chunked uploading where possible, so that a client uploads multiple chunks to the Raw video blob storage instead of one big file.
Doing this can enable Resumable Upload
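A small sketch of how chunked upload and resuming fit together: the client only sends the chunks the server has not yet received, and sends them in parallel. `upload_missing` and the `upload_chunk(index, data)` callback are hypothetical; a real client would call the blob storage API there.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Set


def upload_missing(chunks: Dict[int, bytes],
                   received: Set[int],
                   upload_chunk: Callable[[int, bytes], None],
                   max_workers: int = 4) -> Set[int]:
    """Upload in parallel only the chunks that are not in `received`."""
    pending = [index for index in sorted(chunks) if index not in received]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces completion and re-raises the first upload error, if any.
        list(pool.map(lambda index: upload_chunk(index, chunks[index]), pending))
    return received | set(pending)
```

After an interrupted upload, the client would ask the server which chunk indices it already holds and pass them as `received`.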
Speed optimisation
Use CDN locations closer to the user to improve the upload speed
Parallel system everywhere
We can parallelise the system chunk by chunk. For example:
As soon as a chunk has been uploaded, we can start processing it immediately instead of waiting for the whole video to be uploaded.
Pre-signed upload URL
If we follow The right way to UPLOAD a file using REST > 2. Upload the metadata first and then upload the file, we can generate a pre-signed upload URL so that the client can talk to AWS S3 directly in a more secure way.
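To illustrate the idea behind a pre-signed URL, here is a simplified sketch using an HMAC over the object key and an expiry time. This is not the AWS signing scheme; `SECRET_KEY`, `presign_upload_url`, `verify_upload_url` and the example host are all assumptions for illustration. With S3 itself, an SDK call such as boto3's `generate_presigned_url` would normally do this for us.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"  # hypothetical server-side secret, never sent to clients


def _sign(object_key: str, expires: int) -> str:
    message = f"PUT\n{object_key}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def presign_upload_url(base_url: str, object_key: str, expires_in: int,
                       now: Optional[float] = None) -> str:
    """Return a URL that authorises one upload target until it expires."""
    expires = int(time.time() if now is None else now) + expires_in
    query = urlencode({"expires": expires, "signature": _sign(object_key, expires)})
    return f"{base_url}/{object_key}?{query}"


def verify_upload_url(url: str, now: Optional[float] = None) -> bool:
    """Storage-side check: signature matches and the link has not expired."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    if (time.time() if now is None else now) > expires:
        return False
    expected = _sign(parsed.path.lstrip("/"), expires)
    return hmac.compare_digest(expected, signature)
```

The client never sees the secret: it only receives a short-lived URL that is valid for one object key, so changing the key or using the link after expiry fails verification.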
Other optimisations
- DRM (Digital rights management) protection
- Cache only popular videos in the CDN; serve other videos from our own servers (cost-saving)
- Retry when an error occurs (reschedule the DAG, retry transcoding, re-generate DAG diagram, …)
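For the retry point, a generic sketch of retrying a failed step with exponential backoff. The helper name `retry` and its default values are assumptions; the notes do not say which retry policy to use.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], attempts: int = 3, base_delay: float = 0.5,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying on any exception with doubling delays."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise ValueError("attempts must be at least 1")
```

A transcoding step could be wrapped as `retry(lambda: transcode(chunk))`, where `transcode` stands in for whatever function actually runs the job.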