토스 / kotlin

2 posts

toss

LLM을 이용한 서비스 취약점 분석 자동화 #2 (opens in new tab)

*이 글은 연구 개발망에서 진행된 내용을 바탕으로 합니다. 안녕하세요. 토스 Security Researcher 표상영입니다. 지난 글에서는 LLM을 이용해 서비스 취약점 분석을 자동화하면서 마주했던 문제점과 그에 대한 해결책들을 간단히 소개드렸습니다. 이전 글을 작성한 시점부터 벌써 3개월이 지났는데요. 불과 몇 달 사이에 AI의 취약점 분석 능력은 정말 높은 수준으로 올라왔습니다. 이렇게 가파른 기술 발전 속도에 따라, AI를 대하는 저의 자세와 생각도 많이 바뀌게 되었어요. 이번 글에서는…

toss

레거시 정산 개편기: 신규 시스템 투입 여정부터 대규모 배치 운영 노하우까지 (opens in new tab)

Toss Payments recently overhauled its 20-year-old legacy settlement system to overcome deep-seated technical debt and prepare for massive transaction growth. By shifting from monolithic SQL queries and aggregated data to a granular, object-oriented architecture, the team significantly improved system maintainability, traceability, and batch processing performance. The transition focused on breaking down complex dependencies and ensuring that every transaction is verifiable and reproducible. ### Replacing Monolithic SQL with Object-Oriented Logic * The legacy system relied on a "giant common query" filled with nested `DECODE`, `CASE WHEN`, and complex joins, making it nearly impossible to identify the impact of small changes. * The team applied a "Divide and Conquer" strategy, splitting the massive query into distinct domains and refined sub-functions. * Business logic was moved from the database layer into Kotlin-based objects (e.g., `SettlementFeeCalculator`), making business rules explicit and easier to test. * This modular approach allowed for "Incremental Migration," where specific features (like exchange rate conversions) could be upgraded to the new system independently. ### Improving Traceability through Granular Data Modeling * The old system stored data in an aggregated state (Sum), which prevented developers from tracing errors back to specific transactions or reusing data for different reporting needs. * The new architecture manages data at the minimum transaction unit (1:1), ensuring that every settlement result corresponds to a specific transaction. * "Setting Snapshots" were introduced to store the exact contract conditions (fee rates, VAT status) at the time of calculation, allowing the system to reconstruct the context of past settlements. * A state-based processing model was implemented to enable selective retries for failed transactions, significantly reducing recovery time compared to the previous "all-or-nothing" transaction approach. ### Optimizing High-Resolution Data and Query Performance * Managing data at the transaction level led to an explosion in data volume, necessitating specialized database strategies. * The team implemented date-based Range Partitioning and composite indexing on settlement dates to maintain high query speeds despite the increased scale. * To balance write performance and read needs, they created "Query-specific tables" that offload the processing burden from the main batch system. * Complex administrative queries were delegated to a separate high-performance data serving platform, maintaining a clean separation between core settlement logic and flexible data analysis. ### Resolving Batch Performance and I/O Bottlenecks * The legacy batch system struggled with long processing times that scaled poorly with transaction growth due to heavy I/O and single-threaded processing. * I/O was minimized by caching merchant contract information in memory at the start of a batch step, eliminating millions of redundant database lookups. * The team optimized the `ItemProcessor` in Spring Batch by implementing bulk lookups (using a Wrapper structure) to handle multiple records at once rather than querying the database for every individual item. This modernization demonstrates that scaling a financial system requires moving beyond "convenient" aggregations toward a granular, state-driven architecture. By decoupling business logic from the database and prioritizing data traceability, Toss Payments has built a foundation capable of handling the next generation of transaction volumes.