레거시 정산 개편기: 신규 시스템 투입 여정부터 대규모 배치 운영 노하우까지 (opens in new tab)
Toss Payments recently overhauled its 20-year-old legacy settlement system to overcome deep-seated technical debt and prepare for massive transaction growth. By shifting from monolithic SQL queries and aggregated data to a granular, object-oriented architecture, the team significantly improved system maintainability, traceability, and batch processing performance. The transition focused on breaking down complex dependencies and ensuring that every transaction is verifiable and reproducible. ### Replacing Monolithic SQL with Object-Oriented Logic * The legacy system relied on a "giant common query" filled with nested `DECODE`, `CASE WHEN`, and complex joins, making it nearly impossible to identify the impact of small changes. * The team applied a "Divide and Conquer" strategy, splitting the massive query into distinct domains and refined sub-functions. * Business logic was moved from the database layer into Kotlin-based objects (e.g., `SettlementFeeCalculator`), making business rules explicit and easier to test. * This modular approach allowed for "Incremental Migration," where specific features (like exchange rate conversions) could be upgraded to the new system independently. ### Improving Traceability through Granular Data Modeling * The old system stored data in an aggregated state (Sum), which prevented developers from tracing errors back to specific transactions or reusing data for different reporting needs. * The new architecture manages data at the minimum transaction unit (1:1), ensuring that every settlement result corresponds to a specific transaction. * "Setting Snapshots" were introduced to store the exact contract conditions (fee rates, VAT status) at the time of calculation, allowing the system to reconstruct the context of past settlements. * A state-based processing model was implemented to enable selective retries for failed transactions, significantly reducing recovery time compared to the previous "all-or-nothing" transaction approach. ### Optimizing High-Resolution Data and Query Performance * Managing data at the transaction level led to an explosion in data volume, necessitating specialized database strategies. * The team implemented date-based Range Partitioning and composite indexing on settlement dates to maintain high query speeds despite the increased scale. * To balance write performance and read needs, they created "Query-specific tables" that offload the processing burden from the main batch system. * Complex administrative queries were delegated to a separate high-performance data serving platform, maintaining a clean separation between core settlement logic and flexible data analysis. ### Resolving Batch Performance and I/O Bottlenecks * The legacy batch system struggled with long processing times that scaled poorly with transaction growth due to heavy I/O and single-threaded processing. * I/O was minimized by caching merchant contract information in memory at the start of a batch step, eliminating millions of redundant database lookups. * The team optimized the `ItemProcessor` in Spring Batch by implementing bulk lookups (using a Wrapper structure) to handle multiple records at once rather than querying the database for every individual item. This modernization demonstrates that scaling a financial system requires moving beyond "convenient" aggregations toward a granular, state-driven architecture. By decoupling business logic from the database and prioritizing data traceability, Toss Payments has built a foundation capable of handling the next generation of transaction volumes.