woowahan

Test Automation with AI: (opens in new tab)

This blog post explores how a development team at Woowahan Tech successfully automated the creation of 100 unit tests in just 30 minutes by combining a custom IntelliJ plugin with Amazon Q. The author argues that while full AI automation often fails in complex multi-module environments, a hybrid approach using "compile-guaranteed templates" ensures high success rates and maintains operational stability. This strategy allows developers to bypass repetitive setup tasks while leveraging AI for logic implementation within a strictly defined, valid structure.

Evaluating AI Assistants for Testing

  • The team compared various AI tools including GitHub Copilot, Cursor, and Amazon Q to determine which best fit their existing IntelliJ-based workflow.
  • Amazon Q was selected for its superior understanding of the entire project context and its ability to integrate seamlessly as a plugin without requiring a switch to a new IDE.
  • Initial manual use of AI assistants highlighted repetitive patterns: developers had to constantly specify team conventions (Kotest FunSpec, MockK) and manually fix build errors in 15% of the generated code.
  • On average, it took 10 minutes per class to generate and refine tests manually, prompting the team to seek a more automated solution via a custom plugin.

The Pitfalls of Full Automation

  • The first version of the custom plugin attempted to generate complete test files by gathering class metadata through PSI (Program Structure Interface) and sending it to the Gemini API.
  • Pilot tests revealed a 90% compilation failure rate, as the AI frequently generated incorrect imports, hallucinated non-existent fields, or used mismatched data types.
  • A critical issue was the "loss of existing tests," where the AI-generated output would completely overwrite previous work rather than appending to it.
  • In complex multi-module projects, the AI struggled to identify the correct classes when multiple modules contained identical class names, leading to significant manual correction time.

Shifting to Compile-Guaranteed Templates

  • To overcome the limitations of full automation, the team pivoted to a "template first" approach where the plugin generates a valid, compilable shell for the test.
  • The plugin handles the complex infrastructure of the test file, including correct imports, MockK setups, and empty test stubs for every method in the target class.
  • This approach reduces the AI's "hallucination surface" by providing it with a predefined structure, allowing tools like Amazon Q to focus solely on filling in the implementation details.
  • By automating the 1-minute setup and letting the AI handle the 2-minute implementation phase, the team achieved a 97% success rate across 100 test cases.

Practical Conclusion

For teams looking to improve test coverage in large-scale repositories, the most effective strategy is to use IDE plugins to automate context gathering and boilerplate generation. By providing the AI with a structurally sound template, developers can eliminate compilation errors and significantly reduce the time spent on manual refinement, ensuring that even complex edge cases are covered with minimal effort.