grpc

2 posts

line

Replacing the Payment System DB Handling

The LINE Billing Platform successfully migrated its large-scale payment database from Nbase-T to Vitess to handle high-traffic global transactions. While initially exploring gRPC for its performance reputation, the team transitioned to the MySQL protocol to ensure stability and reduce CPU overhead within their Java-based environment. This implementation demonstrates how Vitess can manage complex sharding requirements while maintaining high availability through automated recovery tools.

### Protocol Selection and Implementation

- The team initially attempted to use the gRPC protocol but encountered `http2: frame too large` errors and significant CPU overhead during performance testing.
- Manual mapping of query results to Java objects proved cumbersome with the Vitess gRPC client, leading to a shift toward the more mature and recommended MySQL protocol.
- Using the MySQL protocol allowed the team to leverage standard database drivers while benefiting from Vitess's routing capabilities via VTGate.

### Keyspace Architecture and Data Routing

- The system utilizes a dual-keyspace strategy: a "Global Keyspace" for unsharded metadata and a "Service Keyspace" for sharded transaction data.
- The Global Keyspace manages sharding keys using a "sequence" table type to ensure unique, auto-incrementing identifiers across the platform.
- The Service Keyspace is partitioned into $N$ shards using a hash-based Vindex, which distributes coin balances and transaction history.
- VTGate automatically routes queries to the correct shard by analyzing the sharding key in the `WHERE` clause or `INSERT` statement, minimizing cross-shard overhead.

### MySQL Compatibility and Transaction Logic

- Vitess maintains `REPEATABLE READ` isolation for single-shard transactions, while multi-shard transactions default to `READ COMMITTED`.
- Advanced features like Two-Phase Commit (2PC) are available for handling distributed transactions across multiple shards.
- Query execution plans are analyzed using `VEXPLAIN` and `VTEXPLAIN`, often managed through the VTAdmin web interface for better visibility.
- Certain limitations apply, such as temporary tables only being supported in unsharded keyspaces and specific unsupported SQL cases documented in the Vitess core.

### Automated Operations and Monitoring

- The team employs VTOrc (based on Orchestrator) to automatically detect and repair database failures, such as unreachable primaries or replication stops.
- Monitoring is centralized via Prometheus, which scrapes metrics from VTOrc, VTGate, and VTTablet components at dedicated ports (e.g., 16000).
- Real-time alerts are routed through Slack and email, using `tablet_alias` to specifically identify which MySQL node or VTTablet is experiencing issues.
- A web-based recovery dashboard provides a history of automated fixes, allowing operators to track the health of the cluster over time.

For organizations migrating high-traffic legacy systems to a cloud-native sharding solution, prioritizing the MySQL protocol over gRPC is recommended for better compatibility with existing application frameworks and reduced operational complexity.
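The dual-keyspace idea from this summary (a sequence that issues unique sharding keys, and a hash that maps each key to one of $N$ shards) can be illustrated with a small Python sketch. Everything here is a hypothetical stand-in: the `Sequence` class, `keyspace_id`, and `shard_for` are illustrative names, and SHA-256 is used only as a placeholder hash, since Vitess's built-in `hash` Vindex uses its own scheme. It is not the LINE team's code and does not call any Vitess API.

```python
import hashlib
from itertools import count


class Sequence:
    """Hypothetical stand-in for a Global Keyspace sequence table:
    hands out unique, monotonically increasing identifiers."""

    def __init__(self, start: int = 1) -> None:
        self._counter = count(start)

    def next_id(self) -> int:
        return next(self._counter)


def keyspace_id(sharding_key: int) -> bytes:
    """Map a sharding key to an 8-byte keyspace ID.
    Assumption: SHA-256 is a placeholder, not Vitess's actual hash."""
    return hashlib.sha256(sharding_key.to_bytes(8, "big")).digest()[:8]


def shard_for(sharding_key: int, num_shards: int) -> int:
    """Split the 64-bit keyspace-ID space into num_shards equal,
    contiguous ranges and return the index of the owning range."""
    value = int.from_bytes(keyspace_id(sharding_key), "big")
    return (value * num_shards) >> 64


def route(sharding_keys, num_shards: int) -> dict:
    """Group keys by target shard, roughly what a router does
    before sending each group to a single shard."""
    plan = {}
    for key in sharding_keys:
        plan.setdefault(shard_for(key, num_shards), []).append(key)
    return plan


if __name__ == "__main__":
    seq = Sequence()
    user_ids = [seq.next_id() for _ in range(10)]
    for shard, keys in sorted(route(user_ids, num_shards=4).items()):
        print(f"shard {shard}: {keys}")
```

Because the shard is derived from the key alone, any query that carries the sharding key can be sent to exactly one shard, which is the property the summary credits with minimizing cross-shard overhead.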

line

Introducing a case study of

LY Corporation’s ABC Studio developed a specialized retail Merchant system by leveraging Domain-Driven Design (DDD) to overcome the functional limitations of a legacy food-delivery infrastructure. The project demonstrates that the primary value of DDD lies not just in technical implementation, but in aligning organizational structures and team responsibilities with domain boundaries. By focusing on the roles and responsibilities of the system rather than just the code, the team created a scalable platform capable of supporting diverse consumer interfaces.

### Redefining the Retail Domain

* The legacy system treated retail items like restaurant entries, creating friction for specialized retail services; the new system was built to be a standalone platform.
* The team narrowed the domain focus to five core areas: Shop, Item, Category, Inventory, and Order.
* Sales-specific logic, such as coupons and promotions, was delegated to external "Consumer Platforms," allowing the Merchant system to serve as a high-performance information provider.

### Clean Architecture and Modular Composition

* The system utilizes Clean Architecture to ensure domain entities remain independent of external frameworks, which also provided a manageable learning curve for new team members.
* Services are split into two distinct modules: "API" modules for receiving external requests and "Engine" modules for processing business logic.
* Communication between these modules is handled asynchronously via gRPC and Apache Kafka, using the Decaton library to increase throughput while maintaining a low partition count.
* The architecture prioritizes eventual consistency, allowing for high responsiveness and scalability across the platform.

### Global Collaboration and Conway’s Law

* Development was split between teams in Korea (Core Domain) and Japan (System Integration and BFF), requiring a shared understanding of domain boundaries.
* Architectural Decision Records (ADR) were implemented to document critical decisions and prevent "knowledge drift" during long-term collaboration.
* The organizational structure was intentionally designed to mirror the system architecture, with specific teams (Core, Link, BFF, and Merchant Link) assigned to distinct domain layers.
* This alignment, reflecting Conway’s Law, ensures that changes to external consumer platforms have minimal impact on the stable core domain logic.

Successful DDD adoption requires moving beyond technical patterns like hexagonal architecture and focusing on establishing a shared understanding of roles across the organization. By structuring teams to match domain boundaries, companies can build resilient systems where the core business logic remains protected even as the external service ecosystem evolves.
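The API/Engine split with asynchronous hand-off described in this summary can be sketched in Python as follows. This is a minimal illustration under stated assumptions: an in-memory `queue.Queue` stands in for Kafka, and every name (`Inventory`, `AdjustInventory`, `InMemoryBus`, `InventoryApi`, `InventoryEngine`) is hypothetical and not taken from the ABC Studio codebase.

```python
from dataclasses import dataclass
from queue import Queue
from typing import Protocol


@dataclass
class Inventory:
    """Domain entity: plain Python, no framework or transport imports."""
    item_id: str
    quantity: int = 0

    def adjust(self, delta: int) -> None:
        if self.quantity + delta < 0:
            raise ValueError("inventory cannot go negative")
        self.quantity += delta


@dataclass(frozen=True)
class AdjustInventory:
    """Message passed from the API module to the Engine module."""
    item_id: str
    delta: int


class CommandBus(Protocol):
    """Port the API module depends on; the transport is a detail."""
    def publish(self, command: AdjustInventory) -> None: ...


class InMemoryBus:
    """Assumption: in-memory stand-in for a Kafka topic."""
    def __init__(self) -> None:
        self.queue = Queue()

    def publish(self, command: AdjustInventory) -> None:
        self.queue.put(command)


class InventoryApi:
    """'API' module: accepts the request and returns immediately."""
    def __init__(self, bus: CommandBus) -> None:
        self._bus = bus

    def request_adjustment(self, item_id: str, delta: int) -> str:
        self._bus.publish(AdjustInventory(item_id, delta))
        return "accepted"  # state converges once the Engine catches up


class InventoryEngine:
    """'Engine' module: applies business logic to queued messages."""
    def __init__(self, bus: InMemoryBus) -> None:
        self._bus = bus
        self.inventories = {}

    def drain(self) -> int:
        processed = 0
        while not self._bus.queue.empty():
            cmd = self._bus.queue.get()
            inv = self.inventories.setdefault(cmd.item_id, Inventory(cmd.item_id))
            inv.adjust(cmd.delta)
            processed += 1
        return processed
```

Between `request_adjustment` returning and `drain` running, a reader of the Engine's state sees the old quantity. That window is the eventual consistency the summary describes, traded for a fast response from the API module.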