nginx


netflix

Netflix Live Origin, by Xiaomei Liu, Joseph Lynch, Chris Newton | Netflix TechBlog | Dec 2025

The Netflix Live Origin is a specialized, multi-tenant microservice that bridges the gap between cloud-based live streaming pipelines and the Open Connect content delivery network. Operating as an intelligent broker, it manages content selection across redundant regional pipelines so that only valid, high-quality segments are distributed to client devices. This architecture lets Netflix achieve high resilience and stream integrity through server-side failover and deterministic segment selection.

### Multi-Pipeline and Multi-Region Awareness

* The origin server mitigates common live streaming defects such as missing segments, timing discontinuities, and short segments with missing video or audio samples.
* It leverages independent, redundant streaming pipelines across different AWS regions for high availability; if one pipeline fails or produces a defective segment, the origin selects a valid candidate from an alternate path.
* Epoch locking at the cloud encoder level lets the origin select segments interchangeably from any pipeline.
* Lightweight media inspection at the packager level generates metadata, which the origin uses to perform deterministic candidate selection (a selection sketch follows this summary).

### Stream Distribution and Protocol Integration

* The service runs on AWS EC2 instances and uses standard HTTP protocol features for communication.
* Upstream packagers push segments to specific URLs with HTTP PUT requests, while the downstream Open Connect network retrieves them via GET requests.
* The architecture is optimized for a manifest design that uses segment templates and constant segment durations, which reduces the need for frequent manifest refreshes.

### Open Connect Streaming Optimization

* While Netflix's Open Connect Appliances (OCAs) were originally optimized for VOD, the Live Origin extends nginx proxy-caching functionality to meet live-specific requirements.
* OCAs receive Live Event Configuration data, including Availability Start Times and initial segment numbers, to determine the legitimate range of segments for an event (see the range-check sketch below).
* This predictive modeling lets the CDN immediately reject requests for objects outside the valid range, reducing unnecessary traffic and load on the origin.

By decoupling the live streaming pipeline from the distribution network through this specialized origin layer, Netflix maintains a high level of fault tolerance and stream stability. This approach minimizes client-side complexity by handling failovers and segment selection on the server side, ensuring a seamless experience for viewers of live events.
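As a rough illustration of the deterministic, metadata-driven candidate selection described above, here is a minimal Python sketch. The type and function names, metadata fields, and duration tolerance are assumptions for illustration, not Netflix's actual implementation; it relies only on the idea that epoch locking makes segment numbers line up across pipelines.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class SegmentCandidate:
    pipeline: str        # hypothetical label, e.g. "us-east-1/a"
    segment_number: int  # epoch-locked index, identical across pipelines
    duration_ms: int     # duration reported by packager-side media inspection
    has_video: bool      # lightweight inspection flags
    has_audio: bool

def select_candidate(candidates: List[SegmentCandidate],
                     target_duration_ms: int,
                     tolerance_ms: int = 100) -> Optional[SegmentCandidate]:
    """Pick one valid segment for a given index among redundant pipelines."""
    valid = [
        c for c in candidates
        if c.has_video and c.has_audio                                # drop segments missing samples
        and abs(c.duration_ms - target_duration_ms) <= tolerance_ms  # drop short segments
    ]
    if not valid:
        return None  # no pipeline produced a usable segment for this index
    # Deterministic tie-break: every origin instance picks the same winner.
    return min(valid, key=lambda c: (abs(c.duration_ms - target_duration_ms), c.pipeline))
```

If, for example, one pipeline emits a 1.9 s segment while another emits a full 2.0 s segment for the same index, the function consistently returns the full-length candidate.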
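The predictive validity check on the Open Connect side reduces to simple arithmetic over the Live Event Configuration. The sketch below is an assumed formulation rather than the OCA code: given the Availability Start Time, a constant segment duration, and the initial segment number, it computes the highest segment index that can legitimately exist yet and rejects anything outside that range before the request reaches the origin.

```python
import time
from typing import Optional

def latest_valid_segment(availability_start: float,
                         segment_duration_s: float,
                         initial_segment_number: int,
                         now: Optional[float] = None) -> int:
    """Highest segment index that can exist yet, assuming constant segment durations."""
    now = time.time() if now is None else now
    elapsed = max(0.0, now - availability_start)
    return initial_segment_number + int(elapsed // segment_duration_s)

def is_request_in_range(requested_segment: int,
                        availability_start: float,
                        segment_duration_s: float,
                        initial_segment_number: int) -> bool:
    """Reject requests for segments outside the legitimate range for the event."""
    if requested_segment < initial_segment_number:
        return False
    return requested_segment <= latest_valid_segment(
        availability_start, segment_duration_s, initial_segment_number)
```

A request for segment 10,000 of an event that started two minutes ago with 2-second segments falls outside this window and can be refused immediately, which is the load-shedding behavior the post attributes to the Live Event Configuration.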

line

Flexible Multi-site Architecture Designed

LINE NEXT optimized its web server infrastructure by transitioning from fragmented, manual Nginx setups to a centralized, native Nginx multi-site architecture. By consolidating global configuration and automating the deployment pipeline with Ansible, the team reduced service launch lead times by over 80% while regaining access to advanced features such as GeoIP and real client IP tracking. This evolution allows the infrastructure to scale to more than 100 subdomains across diverse global services with high reliability and minimal manual overhead.

## Evolution of Nginx Infrastructure

* **PMC-based Structure**: The initial phase relied on a Project Management Console using `rsync` via SSH; this created security risks and led to fragmented, siloed configurations that were difficult to maintain.
* **Ingress Nginx Structure**: To improve speed, the team moved to Kubernetes-based Ingress using Helm charts, which automated domain and certificate settings but limited the use of native Nginx modules and complicated the retrieval of real client IP addresses.
* **Native Nginx Multi-site Structure**: The current hybrid approach uses native Nginx managed by Ansible, combining the speed of configuration-driven setups with the flexibility to use advanced modules such as GeoIP and Loki for log collection.

## Configuration Integration and Multi-site Management

* **Master Configuration Extraction**: Common directives such as timeouts, keep-alive settings, and log formats were extracted into a master Nginx configuration file to eliminate redundancy across services (a configuration sketch follows this summary).
* **Hierarchical Directory Structure**: Inspired by Apache, the team adopted a `sites-available` structure in which individual `server` blocks for different services (alpha, beta, production) are managed in separate files.
* **Operational Efficiency**: This integrated structure allows a single Nginx instance to serve multiple sites simultaneously, significantly reducing the time required to add and deploy new service domains.

## Automated Deployment with Ansible

* **Standardized Workflow**: The team replaced manual processes with Ansible playbooks that handle everything from cloning the latest configuration from Git to extracting environment-specific files (see the playbook sketch below).
* **Safety and Validation**: The automated pipeline includes mandatory Nginx syntax verification (`nginx -t`) and process status checks to ensure stability before a deployment is finalized.
* **Rolling Deployments**: To minimize service impact, updates are pushed sequentially across servers; the process automatically halts if an error is detected at any stage of the rollout.

To manage a rapidly expanding portfolio of global services, infrastructure teams should move toward a "configuration-as-code" model that separates common master settings from service-specific logic. Pairing automation tools like Ansible with a native Nginx multi-site structure balances rapid deployment against the granular control required for complex logging and security requirements.
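To make the "master configuration plus sites-available" layout concrete, here is a minimal Nginx sketch. All paths, domain names, and values are placeholders rather than LINE NEXT's actual configuration; TLS and GeoIP settings are omitted for brevity.

```nginx
# /etc/nginx/nginx.conf -- master configuration holding only shared directives
events {
    worker_connections 1024;
}

http {
    # Common settings extracted once instead of being repeated per service
    keepalive_timeout 65;
    send_timeout      30;
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" "$http_user_agent"';
    access_log /var/log/nginx/access.log main;

    # Each service contributes its own server block via a per-site file
    include /etc/nginx/sites-enabled/*.conf;
}
```

```nginx
# /etc/nginx/sites-available/service-a.conf -- one site, one file
# (symlinked into sites-enabled when the service goes live)
server {
    listen      80;
    server_name service-a.example.com;

    location / {
        proxy_set_header X-Real-IP       $remote_addr;               # preserve the real client IP
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_pass http://127.0.0.1:8080;                            # placeholder upstream
    }
}
```

Adding a new subdomain then becomes a matter of dropping one more file into `sites-available` and enabling it, rather than editing a monolithic configuration.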
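The deployment flow maps naturally onto an Ansible playbook. The following is a rough sketch under assumed paths, host groups, and repository URLs; the modules used (git, copy, command, service) are standard Ansible, but the structure is illustrative rather than LINE NEXT's actual playbook.

```yaml
# deploy-nginx.yml -- rolling rollout with syntax and health checks
- hosts: nginx_servers        # placeholder inventory group
  become: true
  serial: 1                   # one server at a time to minimize service impact
  any_errors_fatal: true      # halt the whole rollout if any step fails
  vars:
    deploy_env: production    # placeholder environment selector
  tasks:
    - name: Clone the latest Nginx configuration from Git
      ansible.builtin.git:
        repo: "https://example.com/infra/nginx-config.git"   # placeholder repo
        dest: /srv/nginx-config
        version: main

    - name: Extract environment-specific site files
      ansible.builtin.copy:
        src: "/srv/nginx-config/{{ deploy_env }}/"
        dest: /etc/nginx/sites-available/
        remote_src: true
      # enabling (symlinking into sites-enabled) is omitted in this sketch

    - name: Verify Nginx syntax before applying anything
      ansible.builtin.command: nginx -t
      changed_when: false

    - name: Reload Nginx with the new configuration
      ansible.builtin.service:
        name: nginx
        state: reloaded

    - name: Confirm the Nginx process is still healthy
      ansible.builtin.command: systemctl is-active nginx
      changed_when: false
```

Because `serial: 1` and `any_errors_fatal: true` are set, a failed `nginx -t` or health check on one server stops the rollout before the change reaches the rest of the fleet.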