Toss Payments recently overhauled its 20-year-old legacy settlement system to overcome deep-seated technical debt and prepare for massive transaction growth. By shifting from monolithic SQL queries and aggregated data to a granular, object-oriented architecture, the team significantly improved system maintainability, traceability, and batch processing performance. The transition focused on breaking down complex dependencies and ensuring that every transaction is verifiable and reproducible.
### Replacing Monolithic SQL with Object-Oriented Logic
* The legacy system relied on a "giant common query" filled with nested `DECODE`, `CASE WHEN`, and complex joins, making it nearly impossible to identify the impact of small changes.
* The team applied a "Divide and Conquer" strategy, splitting the massive query into distinct domains and smaller, well-defined sub-functions.
* Business logic was moved from the database layer into Kotlin-based objects (e.g., `SettlementFeeCalculator`), making business rules explicit and easier to test.
* This modular approach allowed for "Incremental Migration," where specific features (like exchange rate conversions) could be upgraded to the new system independently.
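The article names `SettlementFeeCalculator` but shows no code. A minimal sketch of what such an object might look like, assuming illustrative field names, rates, and whole-KRW rounding (none of these details are from the source); the point is that each branch of the old `DECODE`/`CASE WHEN` logic becomes an explicit, unit-testable rule:

```kotlin
import java.math.BigDecimal
import java.math.RoundingMode

// Hypothetical contract terms; feeRate and vatRate values are assumptions.
data class FeePolicy(
    val feeRate: BigDecimal,     // e.g. 0.029 = 2.9%
    val vatApplicable: Boolean,  // whether VAT is charged on top of the fee
    val vatRate: BigDecimal = BigDecimal("0.10"),
)

class SettlementFeeCalculator(private val policy: FeePolicy) {
    // Replaces nested DECODE/CASE WHEN branches with explicit, testable rules.
    fun calculate(amount: BigDecimal): BigDecimal {
        val fee = amount.multiply(policy.feeRate)
        val vat = if (policy.vatApplicable) fee.multiply(policy.vatRate) else BigDecimal.ZERO
        return fee.add(vat).setScale(0, RoundingMode.HALF_UP) // settle in whole currency units
    }
}

fun main() {
    val calculator = SettlementFeeCalculator(FeePolicy(BigDecimal("0.029"), vatApplicable = true))
    println(calculator.calculate(BigDecimal("100000"))) // fee 2900 + VAT 290 = 3190
}
```

Because the rule lives in a plain Kotlin class rather than inside a giant query, a fee change is a one-line diff with an accompanying unit test instead of a modification to shared SQL.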
### Improving Traceability through Granular Data Modeling
* The old system stored data only in pre-aggregated form (sums), which prevented developers from tracing errors back to specific transactions or reusing the data for different reporting needs.
* The new architecture manages data at the individual transaction level, so every settlement result maps 1:1 to its source transaction.
* "Setting Snapshots" were introduced to store the exact contract conditions (fee rates, VAT status) at the time of calculation, allowing the system to reconstruct the context of past settlements.
* A state-based processing model was implemented to enable selective retries for failed transactions, significantly reducing recovery time compared to the previous "all-or-nothing" transaction approach.
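The snapshot and state ideas above can be sketched together in one model. This is a hypothetical shape, not the team's actual schema; the record names, status values, and retry helper are all assumptions:

```kotlin
import java.math.BigDecimal

enum class SettlementStatus { PENDING, COMPLETED, FAILED }

// Snapshot of the contract terms as they were at calculation time, so a past
// settlement can be reconstructed even after the merchant's contract changes.
data class SettingSnapshot(val feeRate: BigDecimal, val vatApplicable: Boolean)

data class SettlementRecord(
    val transactionId: String,  // 1:1 with the source transaction
    val amount: BigDecimal,
    val snapshot: SettingSnapshot,
    var status: SettlementStatus = SettlementStatus.PENDING,
)

// Selective retry: reprocess only FAILED rows instead of rolling back and
// re-running the whole batch ("all-or-nothing").
fun retryFailed(records: List<SettlementRecord>, process: (SettlementRecord) -> Unit) {
    records.filter { it.status == SettlementStatus.FAILED }.forEach { record ->
        runCatching { process(record) }
            .onSuccess { record.status = SettlementStatus.COMPLETED }
    }
}
```

With per-transaction state, a recovery run touches only the failed subset, which is what shrinks recovery time relative to replaying an entire aggregated batch.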
### Optimizing High-Resolution Data and Query Performance
* Managing data at the transaction level led to an explosion in data volume, necessitating specialized database strategies.
* The team implemented date-based Range Partitioning and composite indexing on settlement dates to maintain high query speeds despite the increased scale.
* To balance write performance and read needs, they created "Query-specific tables" that offload the processing burden from the main batch system.
* Complex administrative queries were delegated to a separate high-performance data serving platform, maintaining a clean separation between core settlement logic and flexible data analysis.
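The "query-specific table" idea can be illustrated with a small CQRS-style sketch, assuming a denormalized daily read model keyed the way the composite index on (settlement date, merchant) might be; the class and field names are invented for illustration:

```kotlin
import java.math.BigDecimal
import java.time.LocalDate

// One transaction-level row from the main settlement table.
data class SettlementRow(val settlementDate: LocalDate, val merchantId: String, val amount: BigDecimal)

class QuerySpecificTable {
    // Keyed by (settlementDate, merchantId), mirroring a composite index:
    // reads hit this small projection instead of scanning the large main table.
    private val dailyTotals = mutableMapOf<Pair<LocalDate, String>, BigDecimal>()

    // Called on the write path after the batch persists the main row.
    fun accumulate(row: SettlementRow) {
        dailyTotals.merge(row.settlementDate to row.merchantId, row.amount, BigDecimal::add)
    }

    fun dailyTotal(date: LocalDate, merchantId: String): BigDecimal =
        dailyTotals[date to merchantId] ?: BigDecimal.ZERO
}
```

The trade-off is a second write per settlement in exchange for reads that never contend with the batch workload on the main table.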
### Resolving Batch Performance and I/O Bottlenecks
* The legacy batch system struggled with long processing times that scaled poorly with transaction growth due to heavy I/O and single-threaded processing.
* I/O was minimized by caching merchant contract information in memory at the start of a batch step, eliminating millions of redundant database lookups.
* The team optimized the `ItemProcessor` in Spring Batch by implementing bulk lookups (using a Wrapper structure) to handle multiple records at once rather than querying the database for every individual item.
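The caching and wrapper ideas above can be combined in one sketch. To keep it self-contained, a minimal `ItemProcessor` stand-in is declared here in place of the real Spring Batch interface; the merchant/contract types and the chunk wrapper are assumptions, not the team's actual code:

```kotlin
import java.math.BigDecimal

// Framework-free stand-in mimicking Spring Batch's ItemProcessor<I, O> contract.
fun interface ItemProcessor<I, O> { fun process(item: I): O }

data class Transaction(val merchantId: String, val amount: BigDecimal)
data class MerchantContract(val merchantId: String, val feeRate: BigDecimal)
data class Settlement(val merchantId: String, val fee: BigDecimal)

// Wrapper: a chunk of transactions handled together, so one bulk lookup
// replaces one database query per item.
data class TransactionChunk(val items: List<Transaction>)

class ChunkSettlementProcessor(
    // Contract data cached in memory at the start of the batch step,
    // eliminating a database round trip for every transaction.
    private val contractCache: Map<String, MerchantContract>,
) : ItemProcessor<TransactionChunk, List<Settlement>> {
    override fun process(item: TransactionChunk): List<Settlement> =
        item.items.map { tx ->
            val contract = contractCache.getValue(tx.merchantId)
            Settlement(tx.merchantId, tx.amount.multiply(contract.feeRate))
        }
}
```

The I/O saving comes from the shape of the loop: for a chunk of N transactions there are zero per-item lookups, only the single upfront cache load per step.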
This modernization demonstrates that scaling a financial system requires moving beyond "convenient" aggregations toward a granular, state-driven architecture. By decoupling business logic from the database and prioritizing data traceability, Toss Payments has built a foundation capable of handling the next generation of transaction volumes.