graph-database

1 posts

netflix

How and Why Netflix Built a Real-Time Distributed Graph: Part 1 — Ingesting and Processing Data Streams at Internet Scale | by Netflix Technology Blog | Netflix TechBlog (opens in new tab)

Netflix has developed a Real-Time Distributed Graph (RDG) to unify member interaction data across its expanding business verticals, including streaming, live events, and mobile gaming. By transitioning from siloed microservice data to a graph-based model, the company can perform low-latency, relationship-centric queries that were previously hindered by expensive manual joins and data fragmentation. The resulting system enables Netflix to track user journeys across various devices and platforms in real-time, providing a foundation for deeper personalization and pattern detection. ### Challenges of Data Isolation in Microservices * While Netflix’s microservices architecture facilitates independent scaling and service decomposition, it inherently leads to data isolation where each service manages its own storage. * Data scientists and engineers previously had to "stitch" together disparate data from various databases and the central data warehouse, which was a slow and manual process. * The RDG moves away from table-based models to a relationship-centric model, allowing for efficient "hops" across nodes without the need for complex denormalization. * This flexibility allows the system to adapt to new business entities (like live sports or games) without requiring massive schema re-architectures. ### Real-Time Ingestion and Normalization * The ingestion layer is designed to capture events from diverse upstream sources, including Change Data Capture (CDC) from databases and request/response logs. * Netflix utilizes its internal data pipeline, Keystone, to funnel these high-volume event streams into the processing framework. * The system must handle "Internet scale" data, ensuring that events from millions of members are captured as they happen to maintain an up-to-date view of the graph. ### Stream Processing with Apache Flink * Netflix uses Apache Flink as the core stream processing engine to handle the transformation of raw events into graph entities. * Incoming data undergoes normalization to ensure a standardized format, regardless of which microservice or business vertical the data originated from. * The pipeline performs data enrichment, joining incoming streams with auxiliary metadata to provide a comprehensive context for each interaction. * The final step of the processing layer involves mapping these enriched events into a graph structure of nodes (entities) and edges (relationships), which are then emitted to the system's storage layer. ### Practical Conclusion Organizations operating with a highly decoupled microservices architecture should consider a graph-based ingestion strategy to overcome the limitations of data silos. By leveraging stream processing tools like Apache Flink to build a real-time graph, engineering teams can provide stakeholders with the ability to discover hidden relationships and cross-domain insights that are often lost in traditional data warehouses.