woowahan Dec 4, 2025

Test Automation with AI

ai kotlin ai-agent test-automation amazon-q intellij-plugin kotest mockk unit-testing

This blog post explores how a development team at Woowahan Tech successfully automated the creation of 100 unit tests in just 30 minutes by combining a custom IntelliJ plugin with Amazon Q. The author argues that while full AI automation often fails in complex multi-module environments, a hybrid approach using "compile-guaranteed templates" ensures high success rates and maintains operational stability. This strategy allows developers to bypass repetitive setup tasks while leveraging AI for logic implementation within a strictly defined, valid structure.

Evaluating AI Assistants for Testing

The team compared various AI tools including GitHub Copilot, Cursor, and Amazon Q to determine which best fit their existing IntelliJ-based workflow.
Amazon Q was selected for its superior understanding of the entire project context and its ability to integrate seamlessly as a plugin without requiring a switch to a new IDE.
Initial manual use of AI assistants highlighted repetitive patterns: developers had to constantly specify team conventions (Kotest FunSpec, MockK) and manually fix build errors in 15% of the generated code.
On average, it took 10 minutes per class to generate and refine tests manually, prompting the team to seek a more automated solution via a custom plugin.

The Pitfalls of Full Automation

The first version of the custom plugin attempted to generate complete test files by gathering class metadata through PSI (Program Structure Interface) and sending it to the Gemini API.
Pilot tests revealed a 90% compilation failure rate, as the AI frequently generated incorrect imports, hallucinated non-existent fields, or used mismatched data types.
A critical issue was the "loss of existing tests," where the AI-generated output would completely overwrite previous work rather than appending to it.
In complex multi-module projects, the AI struggled to identify the correct classes when multiple modules contained identical class names, leading to significant manual correction time.
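
Two of the failures above, overwritten tests and ambiguous class names, are mechanical and can be guarded against outside the model. The following is a minimal sketch in plain Kotlin, not the team's implementation: `ClassRef`, `resolveClass`, and `appendMissingTests` are hypothetical names, the project index is an in-memory list rather than IntelliJ PSI, and the merge relies on simple string matching instead of real parsing.

```kotlin
// Hypothetical model of a class found in the project index; not an IntelliJ API.
data class ClassRef(val module: String, val fqName: String) {
    val simpleName: String get() = fqName.substringAfterLast('.')
}

// Resolve a simple class name, preferring the module of the class under test.
// Returns null when the name is unknown or still ambiguous, so the caller can
// ask the developer instead of guessing.
fun resolveClass(simpleName: String, currentModule: String, index: List<ClassRef>): ClassRef? {
    val candidates = index.filter { it.simpleName == simpleName }
    if (candidates.size == 1) return candidates.single()
    return candidates.singleOrNull { it.module == currentModule }
}

// Matches Kotest-style test declarations such as: test("name") {
private val testName = Regex("""test\("([^"]+)"\)""")

// Append only the generated test blocks whose names are not already present,
// leaving every existing test untouched. Assumes the last '}' in the file
// closes the spec body (a simplification).
fun appendMissingTests(existing: String, generated: Map<String, String>): String {
    val present = testName.findAll(existing).map { it.groupValues[1] }.toSet()
    val missing = generated.filterKeys { it !in present }.values
    if (missing.isEmpty()) return existing
    val closing = existing.lastIndexOf('}')
    require(closing >= 0) { "existing file has no closing brace" }
    val addition = missing.joinToString("\n\n") { it.prependIndent("    ") }
    return existing.substring(0, closing).trimEnd() +
        "\n\n" + addition + "\n" + existing.substring(closing)
}
```

Calling `appendMissingTests` with a file that already contains `test("a")` and generated blocks for `"a"` and `"b"` adds only the `"b"` block before the closing `})`. Returning `null` from `resolveClass` on ambiguity is a deliberate choice in this sketch: a wrong import costs more correction time than a prompt to the developer.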

Shifting to Compile-Guaranteed Templates

To overcome the limitations of full automation, the team pivoted to a "template first" approach where the plugin generates a valid, compilable shell for the test.
The plugin handles the complex infrastructure of the test file, including correct imports, MockK setups, and empty test stubs for every method in the target class.
This approach reduces the AI's "hallucination surface" by providing it with a predefined structure, allowing tools like Amazon Q to focus solely on filling in the implementation details.
By automating the 1-minute setup and letting the AI handle the 2-minute implementation phase, the team achieved a 97% success rate across 100 test cases.
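
To make the "template first" idea concrete, here is a rough sketch of what the template step could look like, written as plain Kotlin string generation. It is an illustration under assumptions, not the plugin's actual code: `TargetClass`, `Dependency`, and `renderTestTemplate` are hypothetical, the metadata would in practice come from the IDE rather than hand-built objects, and naming each test after a method and the subject `sut` are conventions chosen for the example.

```kotlin
// Hypothetical metadata the plugin would collect from the IDE (for example via PSI).
data class Dependency(val name: String, val type: String, val importFqName: String)

data class TargetClass(
    val packageName: String,
    val name: String,
    val dependencies: List<Dependency>,
    val methods: List<String>,
)

// Render a Kotest FunSpec shell: imports, MockK mocks, and one empty test per method.
fun renderTestTemplate(target: TargetClass): String {
    val imports = (listOf("io.kotest.core.spec.style.FunSpec", "io.mockk.mockk") +
        target.dependencies.map { it.importFqName }).distinct().sorted()
    val mocks = target.dependencies.joinToString("\n") {
        "    val ${it.name} = mockk<${it.type}>()"
    }
    val ctorArgs = target.dependencies.joinToString(", ") { it.name }
    val tests = target.methods.joinToString("\n\n") { method ->
        "    test(\"$method\") {\n        // TODO: body left for the AI assistant\n    }"
    }
    return buildString {
        appendLine("package ${target.packageName}")
        appendLine()
        imports.forEach { appendLine("import $it") }
        appendLine()
        appendLine("class ${target.name}Test : FunSpec({")
        if (mocks.isNotEmpty()) appendLine(mocks)
        appendLine("    val sut = ${target.name}($ctorArgs)")
        appendLine()
        appendLine(tests)
        appendLine("})")
    }
}
```

For a hypothetical `OrderService` with one `OrderRepository` dependency and the methods `placeOrder` and `cancelOrder`, the rendered file contains the two fixed imports plus the repository import, a `mockk<OrderRepository>()` declaration, the `sut` construction, and two empty `test("...")` blocks. Because imports and types are taken from resolved metadata rather than written by the model, the assistant only has to fill in the test bodies.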

Practical Conclusion

For teams looking to improve test coverage in large-scale repositories, the most effective strategy is to use IDE plugins to automate context gathering and boilerplate generation. By giving the AI a structurally sound template, developers can avoid most compilation errors and significantly reduce the time spent on manual refinement, which makes it practical to cover even complex edge cases.