LY Corporation developed a centralized control plane on top of Central Dogma to manage service-to-service communication across its heterogeneous infrastructure of physical machines, virtual machines, and Kubernetes clusters. By adopting the industry-standard xDS protocol, the new system resolves the interoperability and scaling limitations of its legacy platform and adds a GitOps-based workflow for configuration changes. The architecture lets the company connect thousands of services with high reliability and fine-grained traffic control.
## Limitations of the Legacy System
The previous control plane environment faced several architectural bottlenecks that hindered developer productivity and system flexibility:
* **Tight Coupling:** The system was heavily dependent on a specific internal project management tool (PMC), making it difficult to support modern containerized environments like Kubernetes.
* **Proprietary Schemas:** Communication relied on custom message schemas, which created interoperability issues between different clients and versions.
* **Lack of Dynamic Registration:** The legacy setup could not handle dynamic endpoint registration effectively, so it behaved more like a static registry than a service mesh control plane.
* **Limited Traffic Control:** It lacked the ability to perform complex routing tasks, such as canary releases or advanced client-side load balancing, across diverse infrastructures.
## Central Dogma as a Control Plane
To solve these issues, the team leveraged Central Dogma, a Git-based repository service for textual configuration, to act as the foundation for a new control plane:
* **xDS Protocol Integration:** The new control plane implements the industry-standard xDS protocol, making it compatible with Envoy and other xDS-capable data plane proxies.
* **GitOps Workflow:** By utilizing Central Dogma’s mirroring features, developers can manage service configurations and traffic policies safely through Pull Requests in external Git repositories.
* **High Reliability:** The system inherits Central Dogma’s native strengths, including multi-datacenter replication, high availability, and a robust authorization system.
* **Schema Evolution:** The control plane automatically transforms legacy metadata into standard xDS resources, allowing for a smooth transition from old infrastructure to the new service mesh.
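
To make the schema-evolution step more concrete, here is a minimal sketch of turning a legacy service record into an xDS `ClusterLoadAssignment`, written as a plain dictionary in proto-JSON form. The legacy record shape (`service`, `hosts`, `ip`, `port`, `zone`) and the function name `legacy_to_cla` are assumptions made for illustration; the source does not describe the actual legacy schema or how the control plane performs the conversion. Only the xDS field names and the type URL follow the Envoy v3 endpoint API.

```python
from collections import defaultdict

# Type URL of the Envoy v3 endpoint resource.
CLA_TYPE = "type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment"


def legacy_to_cla(record):
    """Convert a (hypothetical) legacy record into a ClusterLoadAssignment dict."""
    # Group hosts by zone so each zone becomes one LocalityLbEndpoints entry.
    by_zone = defaultdict(list)
    for host in record["hosts"]:
        by_zone[host.get("zone", "")].append({
            "endpoint": {
                "address": {
                    "socket_address": {
                        "address": host["ip"],
                        "port_value": host["port"],
                    }
                }
            }
        })
    return {
        "@type": CLA_TYPE,
        "cluster_name": record["service"],
        # Sort zones so the output is deterministic and diffs cleanly in Git.
        "endpoints": [
            {"locality": {"zone": zone}, "lb_endpoints": endpoints}
            for zone, endpoints in sorted(by_zone.items())
        ],
    }


# Hypothetical legacy record, used only to exercise the sketch.
legacy = {
    "service": "orders",
    "hosts": [
        {"ip": "10.0.0.1", "port": 8080, "zone": "zone-a"},
        {"ip": "10.0.1.1", "port": 8080, "zone": "zone-b"},
        {"ip": "10.0.0.2", "port": 8080, "zone": "zone-a"},
    ],
}
cla = legacy_to_cla(legacy)
```

Because the result is plain text once serialized, a resource like this could be stored and reviewed in a Git-backed configuration repository in the same way as any other configuration file.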
## Dynamic Service Discovery and Registration
The architecture provides automated ways to manage service endpoints across different environments:
* **Kubernetes Endpoint Plugin:** A dedicated plugin watches for changes in Kubernetes services and automatically updates the xDS resource tree in Central Dogma.
* **Automated API Registration:** The system provides gRPC and HTTP APIs (e.g., `RegisterLocalityLbEndpoint`) that allow services to register themselves dynamically during the startup process.
* **Advanced Traffic Features:** The new control plane supports sophisticated features like zone-aware routing, circuit breakers, automatic retries, and "slow start" mechanisms for new endpoints.
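
The self-registration flow above can be sketched with a small in-memory registry. This is not the real `RegisterLocalityLbEndpoint` API, whose request and response shapes are not described in the source; the `Endpoint` and `EndpointRegistry` names, the method signatures, and the version counter are all hypothetical. The sketch only illustrates the idea that registration is keyed by cluster and locality, and that a repeated registration should not produce a new configuration version.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Endpoint:
    address: str
    port: int


@dataclass
class EndpointRegistry:
    """Hypothetical in-memory stand-in for a control plane's endpoint store."""
    # cluster name -> zone -> set of endpoints
    clusters: dict = field(default_factory=dict)
    # Bumped on every real change, as a stand-in for a new config revision.
    version: int = 0

    def register(self, cluster, zone, endpoint):
        """Add an endpoint; return False if it was already registered."""
        zone_endpoints = self.clusters.setdefault(cluster, {}).setdefault(zone, set())
        if endpoint in zone_endpoints:
            return False  # idempotent: a restart that re-registers changes nothing
        zone_endpoints.add(endpoint)
        self.version += 1
        return True

    def deregister(self, cluster, zone, endpoint):
        """Remove an endpoint; return False if it was not registered."""
        zone_endpoints = self.clusters.get(cluster, {}).get(zone, set())
        if endpoint not in zone_endpoints:
            return False
        zone_endpoints.remove(endpoint)
        self.version += 1
        return True


# A service instance registering itself at startup and leaving at shutdown.
registry = EndpointRegistry()
me = Endpoint("10.0.0.1", 8080)
first = registry.register("orders", "zone-a", me)
second = registry.register("orders", "zone-a", me)
version_after_register = registry.version
removed = registry.deregister("orders", "zone-a", me)
```

Keeping registration idempotent means a service that restarts and registers again does not trigger a needless update to every subscribed client.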
## Evolution Toward Sidecar-less Service Mesh
A major focus of the project is improving the developer experience by reducing the operational overhead of the data plane:
* **Sidecar-less Options:** The team is working toward providing service mesh benefits without requiring a sidecar proxy for every pod, which reduces resource consumption and simplifies debugging.
* **Unified Control:** Central Dogma acts as a single source of truth for both proxy-based and proxyless service mesh configurations, ensuring consistent policy enforcement across the entire organization.
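
In a proxyless setup, the client library applies the traffic policy itself instead of delegating it to a sidecar. The sketch below shows one deliberately simplified policy under that assumption: prefer endpoints in the caller's own zone and fall back to all zones when the local zone is empty, with round-robin selection. The names `candidates` and `RoundRobinPicker` are hypothetical, and this is not how LY Corporation's client or Envoy's zone-aware routing is implemented (Envoy's version balances traffic by percentage across zones rather than using a strict local-first rule).

```python
import itertools


def candidates(endpoints_by_zone, local_zone):
    """Return local-zone endpoints if any exist, otherwise every endpoint."""
    local = endpoints_by_zone.get(local_zone, [])
    if local:
        return list(local)
    # Fallback: flatten all zones in a deterministic order.
    return [ep for zone in sorted(endpoints_by_zone) for ep in endpoints_by_zone[zone]]


class RoundRobinPicker:
    """Cycle through the candidate endpoints in order."""

    def __init__(self, endpoints_by_zone, local_zone):
        pool = candidates(endpoints_by_zone, local_zone)
        if not pool:
            raise ValueError("no endpoints available")
        self._cycle = itertools.cycle(pool)

    def pick(self):
        return next(self._cycle)


# Hypothetical endpoint data, as a client might receive it from the control plane.
endpoints = {
    "zone-a": ["10.0.0.1:8080", "10.0.0.2:8080"],
    "zone-b": ["10.0.1.1:8080"],
}
picker = RoundRobinPicker(endpoints, local_zone="zone-a")
picks = [picker.pick() for _ in range(3)]
fallback = candidates(endpoints, local_zone="zone-c")
```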
For organizations managing large-scale, heterogeneous infrastructure, transitioning to an xDS-compliant control plane backed by a reliable Git-based configuration store is highly recommended. This approach balances the need for high-speed dynamic updates with the safety and auditability of GitOps, ultimately allowing for a more scalable and developer-friendly service mesh.