test-automation

7 posts

toss

Toss Income QA Platform

Toss's QA team developed an internal "QA Platform" to solve the high barrier to entry associated with using Swagger for manual testing and data setup. By transforming complex, multi-step API calls into a simple, button-based GUI, the team successfully empowered non-QA members to perform self-verification. This shift effectively moved quality assurance from a final-stage bottleneck to a continuous, integrated part of the development process, significantly increasing product delivery speed.

### Lowering the Barrier to Test APIs

* Existing Swagger documentation was functionally complete but difficult for developers or planners to use due to the need for manual JSON editing and sequential API execution.
* The QA Platform does not create new APIs; instead, it provides a GUI layer over existing Swagger Test APIs to make them accessible without technical documentation.
* The system offers two distinct interfaces: "Normal Mode" for simplified, one-click testing and "Swagger Mode" for granular control over request bodies and parameters.

### From Manual Clicks to Automation and Management

* Phase 1 focused on visual accessibility, allowing users to trigger complex data states via buttons rather than manual API orchestration.
* Phase 2 integrates existing automation scripts into the platform, removing the need for local environment setups and allowing anyone to execute automated test suites.
* The final phase aims to transition into a comprehensive Test Management System (TMS) tailored to the team's specific workflow, reducing reliance on third-party external tools.

### Redefining Quality as a Design Choice

* By reducing the time and mental effort required to run a test, verification became a frequent, daily habit for the entire product team rather than a chore for the QA department.
* Lowering the "cost" of testing replaced guesswork with data-driven confidence, allowing the team to move faster during development.
* This initiative reflects a philosophical shift where quality is no longer viewed as a final checklist item but as a core structural element designed into the development lifecycle.

The primary takeaway for engineering teams is that the speed of a product is often limited by the friction of its testing process. By building internal tools that democratize testing capabilities, making them available to anyone regardless of their technical role, organizations can eliminate verification delays and foster a culture where quality is a shared responsibility.
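The "Normal Mode" idea described above, one button standing in for several sequential Swagger Test API calls, can be sketched as a preset: an ordered list of requests with pre-filled JSON bodies, run by a single function. This is a minimal illustration only. The endpoint paths, payload fields, and the `ApiCaller` type are hypothetical and not taken from Toss's platform, and the caller is synchronous for brevity where a real HTTP client would be asynchronous.

```typescript
// Hypothetical sketch: a "Normal Mode" button modeled as an ordered preset of API calls.
// Endpoint paths and payload fields are illustrative assumptions.

type JsonBody = Record<string, unknown>;

interface ApiStep {
  method: "GET" | "POST" | "PUT" | "DELETE";
  path: string;
  // Builds the request body from values returned by earlier steps.
  body: (context: JsonBody) => JsonBody;
}

interface ButtonPreset {
  label: string;
  steps: ApiStep[];
}

// The transport is injected so a preset can run against any client, or a fake in tests.
type ApiCaller = (method: string, path: string, body: JsonBody) => JsonBody;

function runPreset(preset: ButtonPreset, call: ApiCaller): JsonBody {
  let context: JsonBody = {};
  for (const step of preset.steps) {
    const response = call(step.method, step.path, step.body(context));
    // Merge each response into the context so later steps can reuse IDs.
    context = { ...context, ...response };
  }
  return context;
}

// One button: create a test user, then attach a pending refund to that user.
const createRefundReadyUser: ButtonPreset = {
  label: "Create user with pending refund",
  steps: [
    { method: "POST", path: "/test/users", body: () => ({ type: "individual" }) },
    {
      method: "POST",
      path: "/test/refunds",
      body: (ctx) => ({ userNo: ctx["userNo"], amount: 10000 }),
    },
  ],
};
```

Under this sketch, a "Swagger Mode" equivalent would expose the same steps but let the user edit each body before it is sent.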

toss

Tax Refund Automation: AI

At Toss Income, QA Manager Suho Jung successfully automated complex E2E testing for diverse tax refund services by leveraging AI as specialized virtual team members. By shifting from manual coding to a "human-as-orchestrator" model, a single person achieved the productivity of a four-to-five-person automation team within just five months. This approach overcame the inherent brittleness of testing long, React-based flows that are subject to frequent policy changes and external system dependencies.

### Challenges in Tax Service Automation

The complexity of tax refund services presented unique hurdles that made traditional manual automation unsustainable:

* **Multi-Step Dependencies:** Each refund flow averages 15–20 steps involving internal systems, authentication providers, and HomeTax scraping servers, where a single timing glitch can fail the entire test.
* **Frequent UI and Policy Shifts:** Minor UI updates or new tax laws required total scenario reconfigurations, making hard-coded tests obsolete almost immediately.
* **Environmental Instability:** Issues such as "Target closed" errors during scraping, differing domain environments, and React-specific hydration delays caused constant test flakiness.

### Building an AI-Driven QA Team

Rather than using AI as a simple autocomplete tool, the project assigned specific "personas" to different AI models to handle distinct parts of the lifecycle:

* **SDET Agent (Claude Sonnet 4.5):** Acted as the lead developer, responsible for designing the Page Object Model (POM) architecture, writing test logic, and creating utility functions.
* **Documentation Specialist:** Automatically generated daily retrospectives and updated technical guides by analyzing daily git commits.
* **Git Master:** Managed commit history and PR descriptions to ensure high-quality documentation of the project’s evolution.
* **Pair Programmers (Cursor & Codex):** Handled real-time troubleshooting, type errors, and comparative analysis of different test scripts.

### Technical Solutions for React and Policy Logic

The team implemented several sophisticated technical strategies to ensure test stability:

* **React Interaction Readiness:** To solve "Element is not clickable" errors, they developed a strategy that waits not just for visibility, but for event handlers to bind to the DOM (Hydration).
* **Safe Interaction Fallbacks:** A standard `click` utility was created that attempts a Playwright click, then a native keyboard 'Enter' press, and finally a JS dispatch to ensure interactions succeed even during UI transitions.
* **Dynamic Consent Flow Utility:** A specialized system was built to automatically detect and handle varying "Terms of Service" agreements across different sub-services (Tax Secretary, Hidden Refund, etc.) through a single unified function.
* **Test Isolation:** Automated scripts were used to prevent `userNo` (test ID) collisions, ensuring 35+ complex scenarios could run in parallel without data interference.

### Integrated Feedback and Reporting

The automation was integrated directly into internal communication channels to create a tight feedback loop:

* **Messenger Notifications:** Every test run sends a report including execution time, test IDs, and environment data to the team's messenger.
* **Automated Failure Analysis:** When a test fails, the AI automatically posts the error log, the specific failed step, a tracking EventID, and a screenshot as a thread reply for immediate debugging.
* **Human-AI Collaboration:** This structure shifted the QA's role from writing code to discussing failures and policy changes within the messenger threads.

The success of this 5-month experiment suggests that for high-complexity environments, the future of QA lies in "AI Orchestration." Instead of focusing on writing selectors, QA engineers should focus on defining problems and managing the AI agents that build the architecture.
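The "Safe Interaction Fallbacks" item above describes trying a Playwright click, then an Enter key press, then a JS dispatch. A minimal sketch of such a fallback chain follows. It is written against a small hypothetical `Clickable` interface rather than Playwright's real types, so it runs without a browser. With Playwright, the three methods could plausibly wrap `locator.click()`, `locator.press("Enter")`, and `locator.dispatchEvent("click")`, but that mapping is an assumption and not the author's confirmed implementation.

```typescript
// Hypothetical minimal surface for illustration; this is not Playwright's Locator type.
interface Clickable {
  click(): Promise<void>;
  pressEnter(): Promise<void>;
  dispatchClick(): Promise<void>;
}

interface ClickResult {
  strategy: string; // name of the strategy that finally succeeded
  errors: string[]; // messages from the strategies that failed before it
}

// Tries each strategy in order and returns as soon as one succeeds.
async function safeClick(target: Clickable): Promise<ClickResult> {
  const strategies: Array<[string, () => Promise<void>]> = [
    ["click", () => target.click()],
    ["keyboard-enter", () => target.pressEnter()],
    ["js-dispatch", () => target.dispatchClick()],
  ];
  const errors: string[] = [];
  for (const [name, attempt] of strategies) {
    try {
      await attempt();
      return { strategy: name, errors };
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      errors.push(`${name}: ${message}`);
    }
  }
  // Every strategy failed: surface all collected errors for the failure report.
  throw new Error(`All click strategies failed: ${errors.join("; ")}`);
}
```

Returning the name of the winning strategy is a design choice of this sketch: it lets a test report show how often the fallbacks are actually needed, which can point at hydration problems worth fixing at the source.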

woowahan

Test Automation with AI:

This blog post explores how a development team at Woowahan Tech successfully automated the creation of 100 unit tests in just 30 minutes by combining a custom IntelliJ plugin with Amazon Q. The author argues that while full AI automation often fails in complex multi-module environments, a hybrid approach using "compile-guaranteed templates" ensures high success rates and maintains operational stability. This strategy allows developers to bypass repetitive setup tasks while leveraging AI for logic implementation within a strictly defined, valid structure.

### Evaluating AI Assistants for Testing

* The team compared various AI tools including GitHub Copilot, Cursor, and Amazon Q to determine which best fit their existing IntelliJ-based workflow.
* Amazon Q was selected for its superior understanding of the entire project context and its ability to integrate seamlessly as a plugin without requiring a switch to a new IDE.
* Initial manual use of AI assistants highlighted repetitive patterns: developers had to constantly specify team conventions (Kotest FunSpec, MockK) and manually fix build errors in 15% of the generated code.
* On average, it took 10 minutes per class to generate and refine tests manually, prompting the team to seek a more automated solution via a custom plugin.

### The Pitfalls of Full Automation

* The first version of the custom plugin attempted to generate complete test files by gathering class metadata through PSI (Program Structure Interface) and sending it to the Gemini API.
* Pilot tests revealed a 90% compilation failure rate, as the AI frequently generated incorrect imports, hallucinated non-existent fields, or used mismatched data types.
* A critical issue was the "loss of existing tests," where the AI-generated output would completely overwrite previous work rather than appending to it.
* In complex multi-module projects, the AI struggled to identify the correct classes when multiple modules contained identical class names, leading to significant manual correction time.

### Shifting to Compile-Guaranteed Templates

* To overcome the limitations of full automation, the team pivoted to a "template first" approach where the plugin generates a valid, compilable shell for the test.
* The plugin handles the complex infrastructure of the test file, including correct imports, MockK setups, and empty test stubs for every method in the target class.
* This approach reduces the AI's "hallucination surface" by providing it with a predefined structure, allowing tools like Amazon Q to focus solely on filling in the implementation details.
* By automating the 1-minute setup and letting the AI handle the 2-minute implementation phase, the team achieved a 97% success rate across 100 test cases.

### Practical Conclusion

For teams looking to improve test coverage in large-scale repositories, the most effective strategy is to use IDE plugins to automate context gathering and boilerplate generation. By providing the AI with a structurally sound template, developers can eliminate compilation errors and significantly reduce the time spent on manual refinement, ensuring that even complex edge cases are covered with minimal effort.
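The "template first" idea can be illustrated with a small generator that turns class metadata into a Kotest `FunSpec` shell containing imports, MockK mocks, and one empty stub per method. The sketch below is an assumption-laden illustration, not the team's plugin: the real plugin reads metadata through IntelliJ PSI, whereas here it is a hand-written object, and the `ClassMeta` shape, the `sut` variable name, and the example class names are all hypothetical. It is written in TypeScript only to keep the examples on this page in one language.

```typescript
// Sketch: render a Kotest FunSpec shell (Kotlin source text) from class metadata.
// The metadata shape is hypothetical; the post's plugin obtains it from IntelliJ PSI.

interface Dependency {
  name: string;       // constructor parameter name, e.g. "orderRepository"
  type: string;       // simple class name, e.g. "OrderRepository"
  importPath: string; // fully qualified name, resolved by tooling rather than by the AI
}

interface ClassMeta {
  packageName: string;
  className: string;
  dependencies: Dependency[];
  methods: string[];
}

function renderTestTemplate(meta: ClassMeta): string {
  // Imports come from resolved metadata, so the AI never has to guess them.
  const imports = [
    "io.kotest.core.spec.style.FunSpec",
    "io.mockk.mockk",
    ...meta.dependencies.map((d) => d.importPath),
  ].sort();
  const mocks = meta.dependencies.map((d) => `    val ${d.name} = mockk<${d.type}>()`);
  const ctorArgs = meta.dependencies.map((d) => d.name).join(", ");
  // One empty stub per method: the only part left for the AI assistant to fill in.
  const stubs = meta.methods.map(
    (m) => `    test("${m}") {\n        // TODO: implementation left for the AI assistant\n    }`,
  );
  return [
    `package ${meta.packageName}`,
    "",
    ...imports.map((i) => `import ${i}`),
    "",
    `class ${meta.className}Test : FunSpec({`,
    ...mocks,
    `    val sut = ${meta.className}(${ctorArgs})`,
    "",
    stubs.join("\n\n"),
    "})",
    "",
  ].join("\n");
}
```

Because the shell is produced from resolved metadata, the assistant only edits the stub bodies, which is the narrower "hallucination surface" the post describes. Whether the generated Kotlin compiles in a given project still depends on the metadata being accurate.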

toss

Toss Income Tax Refund Service: An

Toss Income’s QA team transitioned from traditional manual testing and rigid class-based Page Object Models (POM) to a stateless Functional POM to keep pace with rapid deployment cycles. This shift allowed them to manage complex tax refund logic and frequent UI changes with high reliability and minimal maintenance overhead. By treating automation as a modular assembly of functions, they successfully reduced verification times from four hours to twenty minutes while significantly increasing test coverage.

### Transitioning to Functional POM

* Replaced stateful classes and complex inheritance with stateless functions that receive a `page` object as input and return the updated `page` as output.
* Adopted a clear naming convention (e.g., `gotoLoginPage`, `enterPhonePin`, `verifyRefundAmount`) to ensure that test cases read like human-readable scenarios.
* Centralized UI selectors and interaction logic within these functions, allowing developers to update a single point of truth when UI text or button labels change.

### Modularizing the User Journey

* Segmented the complex tax refund process into four distinct modules: Login/Terms, Deduction Checks, Refund/Payment Info, and Reporting.
* Developed independent, reusable functions for specific data inputs, such as medical or credit card deductions, which can be assembled like "Lego blocks" to create new test scenarios rapidly.
* Decoupled business logic from UI interactions, enabling the team to create diverse test cases by simply varying parameters like amounts or dates.

### Robust Interaction and Page Management

* Implemented a 4-step "Robust Click Strategy" to eliminate flakiness caused by React rendering timings, sequentially trying an Enter key press, a standard click, a forced click, and finally a direct JavaScript execution.
* Created a `waitForNetworkIdleSafely` utility that prevents test failures during polling or background network activity by prioritizing UI anchors over strict network idleness.
* Standardized page transition handling with a `getLatestNonScrapePage` utility, ensuring the `currentPage` object always points to the most recent active tab or redirect window.

### Integration and Performance Outcomes

* Achieved a 600% increase in test coverage, expanding from 5 core scenarios to 35 comprehensive automated flows.
* Reduced the time required to respond to UI changes by 98%, as modifications are now localized to a single POM function rather than dozens of test files.
* Established a 24/7 automated validation system that provides immediate feedback on functional correctness, data integrity (tax amount accuracy), and performance metrics via dedicated communication channels.

For engineering teams operating in high-velocity environments, adopting a stateless, functional approach to test automation is a highly effective way to reduce technical debt. By focusing on modularity and implementing fallback strategies for UI interactions, teams can transform QA from a final bottleneck into a continuous, data-driven validation layer that supports rapid experimentation.
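A stateless Functional POM of the kind described above can be sketched as plain functions that take a page and return it, composed into a scenario. The sketch reuses the function names mentioned in the post (`gotoLoginPage`, `enterPhonePin`, `verifyRefundAmount`), but their bodies, the selectors, the curried parameter style, and the `PageLike` interface are assumptions made for illustration. `PageLike` is loosely modeled on a few Playwright `Page` methods so the example runs without a browser.

```typescript
// Hypothetical minimal page surface, loosely modeled on Playwright's Page.
interface PageLike {
  goto(url: string): Promise<unknown>;
  fill(selector: string, value: string): Promise<unknown>;
  textContent(selector: string): Promise<string | null>;
}

// Each step is stateless: it receives a page and returns the page for the next step.
type Step = (page: PageLike) => Promise<PageLike>;

// Selectors live in one place, so a UI change means a single edit (values are placeholders).
const selectors = {
  phonePin: "#pin",
  refundAmount: "[data-testid=refund-amount]",
};

const gotoLoginPage = (baseUrl: string): Step => async (page) => {
  await page.goto(`${baseUrl}/login`);
  return page;
};

const enterPhonePin = (pin: string): Step => async (page) => {
  await page.fill(selectors.phonePin, pin);
  return page;
};

const verifyRefundAmount = (expected: string): Step => async (page) => {
  const actual = await page.textContent(selectors.refundAmount);
  if (actual !== expected) {
    throw new Error(`Refund amount mismatch: expected ${expected}, got ${actual}`);
  }
  return page;
};

// Runs steps in order, threading the returned page into the next step.
async function runScenario(page: PageLike, steps: Step[]): Promise<PageLike> {
  let current = page;
  for (const step of steps) {
    current = await step(current);
  }
  return current;
}
```

A scenario then reads as a list, for example `runScenario(page, [gotoLoginPage(url), enterPhonePin("1234"), verifyRefundAmount("120,000")])`, and new cases come from varying the parameters rather than writing new classes.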

toss

Frontend Code That Lasts

Toss Payments evolved its Payment SDK to solve the inherent complexities of integrating payment systems, where developers must navigate UI implementation, security flows, and exception handling. By transitioning from V1 to V2, the team moved beyond simply providing a library to building a robust, architecture-driven system that ensures stability and scalability across diverse merchant environments. The core conclusion is that a successful SDK must be treated as a critical infrastructure layer, relying on modular design and deep observability to handle the unpredictable nature of third-party runtimes.

## The Unique Challenges of SDK Development

* SDK code lives within the merchant's runtime environment, meaning it shares the same lifecycle and performance constraints as the merchant’s own code.
* Internal logging can inadvertently create bottlenecks; for instance, adding network logs to a frequently called method can lead to "self-DDoS" scenarios that crash the merchant's payment page.
* Type safety is a major hurdle, as merchants may pass unexpected data types (e.g., a number instead of a string), causing fatal runtime errors like `startsWith is not a function`.
* The SDK acts as a bridge for technical communication, requiring it to function as both an API consumer for internal systems and an API provider for external developers.

## Ensuring Stability through Observability

* To manage the unpredictable ways merchants use the SDK, Toss implemented over 300 unit tests and 500 E2E integration tests based on real-world use cases.
* The team utilizes a "Global Trace ID" to track a single payment journey across both the frontend and backend, allowing for seamless debugging across the entire system.
* A custom Monitoring CLI was developed to compare payment success rates before and after deployments, categorized by merchant and runtime environment (e.g., PC Chrome vs. Android WebView).
* This observability infrastructure enables the team to quickly identify edge-case failures (such as a specific merchant's checkout failing only on mobile WebViews) that are often missed by standard QA processes.

## Scaling with Modular Architecture

* To avoid "if-statement hell" caused by merchant-specific requirements (e.g., fixing installment months or custom validation for a specific store), Toss moved to a "Lego-block" architecture.
* The SDK is organized into three distinct layers based on the "reason for change" principle:
  * **Public Interface Layer:** Manages the contract with the merchant, validating inputs and translating them into internal domain models.
  * **Domain Layer:** Encapsulates core business logic and payment policies, keeping them isolated from external changes.
  * **External Service Layer:** Handles dependencies like Server APIs and Web APIs, ensuring technical shifts don't leak into the business logic.
* This separation allows the team to implement custom merchant logic by swapping specific blocks without modifying the core codebase, reducing the risk of regressions and lowering maintenance costs.

For developers building SDKs or integration tools, the shift from monolithic logic to a layered, observable architecture is essential. Prioritizing the separation of domain logic from public interfaces and investing in environment-specific monitoring allows for a highly flexible product that remains stable even as the client-side environment grows increasingly complex.
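The type-safety problem and the Public Interface Layer's role can be illustrated together: if the boundary accepts `unknown` and validates before building a domain model, a numeric `orderId` is rejected with a readable message instead of surfacing later as `startsWith is not a function`. The following is a minimal sketch under stated assumptions. The `PaymentRequest` fields, the `test_` prefix rule, and both function names are hypothetical and not part of the Toss Payments SDK.

```typescript
// Hypothetical domain model; field names are illustrative assumptions.
interface PaymentRequest {
  orderId: string;
  amount: number;
}

type ParseResult =
  | { ok: true; value: PaymentRequest }
  | { ok: false; errors: string[] };

// Public interface layer: accept `unknown`, because merchant code may pass anything at runtime.
function parsePaymentRequest(input: unknown): ParseResult {
  if (typeof input !== "object" || input === null) {
    return { ok: false, errors: ["request must be an object"] };
  }
  const raw = input as Record<string, unknown>;
  const orderId = raw["orderId"];
  const amount = raw["amount"];
  const errors: string[] = [];
  if (typeof orderId !== "string") {
    errors.push(`orderId must be a string, got ${typeof orderId}`);
  }
  if (typeof amount !== "number" || !Number.isFinite(amount) || amount <= 0) {
    errors.push("amount must be a positive finite number");
  }
  if (typeof orderId === "string" && typeof amount === "number" && errors.length === 0) {
    // Translate the validated input into the internal domain model.
    return { ok: true, value: { orderId, amount } };
  }
  return { ok: false, errors };
}

// Domain layer: only ever sees validated data, so string methods are safe to call.
function isTestOrder(request: PaymentRequest): boolean {
  return request.orderId.startsWith("test_");
}
```

Returning a result object instead of throwing is a choice made for this sketch; since the SDK runs inside the merchant's page, it keeps a bad input from turning into an uncaught exception there.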

line

Into the Passionate Energy of the

The PD1 AI Hackathon 2025 served as a strategic initiative by LY Corporation to embed innovative artificial intelligence directly into the LINE messaging ecosystem. Over 60 developers collaborated during an intensive 48-hour session to transition AI from a theoretical concept into practical features for messaging, content, and internal development workflows. The event successfully produced several high-utility prototypes that demonstrate how AI can enhance user safety, creative expression, and technical productivity.

## Transforming Voice Communication through NextVoIP

* The "NextVoIP" project utilized Speech-to-Text (STT) technology to convert 1:1 and group call audio into real-time data for AI analysis.
* The system was designed to provide life security features by detecting potential emergency situations or accidents through conversation monitoring.
* AI acted as a communication assistant by suggesting relevant content and conversation topics to help maintain a seamless flow during calls.
* Features were implemented to allow callers to enjoy shared digital content together, enriched by AI-driven recommendations.

## Creative Expression with MELODY LINE

* This project focused on the intersection of technology and art by converting chat conversations into unique musical compositions.
* The system analyzed the context and emotional sentiment of messages to automatically generate melodies that matched the atmosphere of the chat.
* The implementation showcased the potential for generative AI to provide a multi-sensory experience within a standard messaging interface.

## AI-Driven QA and Test Automation

* The grand prize-winning project, "IPD," addressed the bottleneck of repetitive manual testing by automating the entire Quality Assurance lifecycle.
* AI was utilized to automatically generate and manage complex test cases, significantly reducing the manual effort required for mobile app validation.
* The system included automated test execution and a diagnostic feature that identifies the root cause of failures when a test results in an error.
* The project was specifically lauded for its immediate "production-ready" status, offering a direct path to improving development speed and software reliability.

The results of this hackathon suggest that the most immediate value for AI in large-scale messaging platforms lies in two areas: enhancing user experience through contextual awareness and streamlining internal engineering via automated QA. Organizations should look toward integrating AI-driven testing tools to reduce technical debt while exploring real-time audio and text analysis to provide proactive security and engagement features for users.