vulnerability-analysis

2 posts

toss

Automating Service Vulnerability Analysis Using

Toss has developed a high-precision automated vulnerability analysis system by integrating Large Language Models (LLMs) with traditional security testing tools. By evolving their architecture from a simple prompt-based approach to a multi-agent system utilizing open-source models and static analysis, the team achieved over 95% accuracy in threat detection. This project demonstrates that moving beyond a technical proof-of-concept requires solving real-world constraints such as context window limits, output consistency, and long-term financial sustainability.

### Navigating Large Codebases with MCP

* Initial attempts to use RAG (Retrieval Augmented Generation) and repository compression tools failed because the LLM could not maintain complex code relationships within token limits.
* The team implemented a "SourceCode Browse MCP" (Model Context Protocol) which allows the LLM agent to dynamically query the codebase.
* By indexing the code, the agent can perform specific tool calls to find function definitions or variable usages only when necessary, effectively bypassing context window restrictions.

### Ensuring Consistency via SAST Integration

* Testing revealed that standalone LLMs produced inconsistent results, often missing known vulnerabilities or generating hallucinations across different runs.
* To solve this, the team integrated Semgrep, a Static Application Security Testing (SAST) tool, to identify all potential "Source-to-Sink" paths.
* Semgrep was chosen over CodeQL due to its lighter resource footprint and faster execution, acting as a structured roadmap that ensures the LLM analyzes every suspicious input path without omission.

### Optimizing Costs with Multi-Agent Architectures

* Analyzing every possible code path identified by SAST tools was prohibitively expensive due to high token consumption.
* The workflow was divided among three specialized agents: a Discovery Agent to filter out irrelevant paths, an Analysis Agent to perform deep logic checks, and a Verification Agent to confirm findings.
* This "sieve" strategy ensured that the most resource-intensive analysis was only performed on high-probability vulnerabilities, significantly reducing operational costs.

### Transitioning to Open Models for Sustainability

* Scaling the system to hundreds of services and daily PRs made proprietary cloud models financially unviable.
* After benchmarking models like Llama 3.1 and GPT-OSS, the team selected **Qwen3:30B** for its 100% coverage rate and high true-positive accuracy in vulnerability detection.
* To bridge the performance gap between open-source and proprietary models, the team utilized advanced prompt engineering, one-shot learning, and enforced structured JSON outputs to improve reliability.

To build a production-ready AI security tool, teams should focus on the synergy between specialized open-source models and traditional static analysis tools. This hybrid approach provides a cost-effective and sustainable way to achieve enterprise-grade accuracy while maintaining full control over the analysis infrastructure.
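The three-agent "sieve" and the enforced JSON outputs described above can be illustrated with a minimal sketch. It is not Toss's implementation: the `TaintPath` shape, the `verdict`/`reason` JSON fields, and the stub agents are all hypothetical, and each agent is modeled as a plain function standing in for an LLM call over one SAST-reported source-to-sink path.

```python
import json
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class TaintPath:
    """One source-to-sink candidate (hypothetical shape, not Semgrep's real output)."""
    source: str
    sink: str
    file: str


# An "agent" is modeled as a function returning a raw JSON string,
# standing in for an LLM call (assumption for illustration).
Agent = Callable[[TaintPath], str]

ALLOWED_VERDICTS = ("vulnerable", "safe")


def parse_verdict(raw: str) -> Optional[dict]:
    """Return the parsed reply only if it matches the expected JSON shape."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    if data.get("verdict") not in ALLOWED_VERDICTS or "reason" not in data:
        return None
    return data


def run_sieve(
    paths: Iterable[TaintPath], stages: List[Agent]
) -> Tuple[List[TaintPath], List[TaintPath]]:
    """Pass each path through the stages in order, stopping at the first "safe".

    Returns (confirmed, unparsed). Paths with malformed replies are set aside
    rather than silently dropped, so no candidate path is lost.
    """
    confirmed, unparsed = [], []
    for path in paths:
        for agent in stages:
            result = parse_verdict(agent(path))
            if result is None:
                unparsed.append(path)
                break
            if result["verdict"] == "safe":
                break  # an earlier, cheaper stage ruled it out
        else:
            confirmed.append(path)  # every stage answered "vulnerable"
    return confirmed, unparsed


# Stub agents (hypothetical heuristics in place of real model calls).
def discovery(path: TaintPath) -> str:
    verdict = "vulnerable" if path.sink in ("db.execute", "os.system") else "safe"
    return json.dumps({"verdict": verdict, "reason": "sink allow-list check"})


def analysis(path: TaintPath) -> str:
    verdict = "safe" if "sanitized" in path.source else "vulnerable"
    return json.dumps({"verdict": verdict, "reason": "sanitizer check"})


def verification(path: TaintPath) -> str:
    return json.dumps({"verdict": "vulnerable", "reason": "stub: always confirms"})


paths = [
    TaintPath("request.args['q']", "db.execute", "search.py"),
    TaintPath("request.args['name']", "logger.info", "views.py"),
    TaintPath("sanitized(request.form['cmd'])", "os.system", "admin.py"),
]
confirmed, unparsed = run_sieve(paths, [discovery, analysis, verification])
print([p.file for p in confirmed])
```

Ordering the stages from cheapest to most expensive is what keeps costs down: only paths that survive the earlier filters reach the later, heavier checks. A production system would likely retry or re-prompt on malformed replies instead of only setting them aside.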

line

Practical Security Knowledge Growing with

LINE CTF 2025 serves as a collaborative platform for global security experts to exchange technical knowledge and tackle real-world cybersecurity challenges through a competitive framework. Under the newly integrated LY Corporation, the event evolved to prioritize anti-AI problem design and enhanced privacy protections, reinforcing its position as a top-tier competition in the Asian security community. The event successfully demonstrated that high-quality problem engineering and community-focused operations can drive both individual growth and organizational security excellence.

## Strategic Shift and AI-Resilient Design

* **Multisite Collaboration:** While previous years were led primarily by the Japanese team, 2025 saw a shift where the Korean security team led preparations and the Vietnamese team contributed the highest volume of technical challenges.
* **Counter-AI Engineering:** To maintain fairness in an era of LLMs, problems were specifically designed to mislead automated AI analysis, requiring human logic and deep conceptual understanding to arrive at the correct "flag."
* **Systemic Integration:** This was the first year applying the unified LY Corporation administrative and approval processes, resulting in a more refined timeline for problem verification and quality control.

## Competition Format and Problem Engineering

* **Jeopardy-Style Challenges:** The event featured 13 independent challenges (6 Web, 4 Pwnable, and 3 Reverse Engineering) where teams earned points based on difficulty.
* **Three-Stage Validation:** Every problem underwent a rigorous cycle of idea conception, technical environment isolation/testing, and internal peer review to eliminate unintended "cheese" solutions or bugs.
* **Technical Philosophy:** Problems were modeled after real-world service vulnerabilities and latest security trends, targeting a difficulty level that requires several hours of dedicated analysis by a skilled researcher.

## Platform Evolution and Performance

* **Privacy-First Infrastructure:** The team customized the open-source CTFd framework to remove email-based registration, instead using a recovery-code system to ensure participant anonymity and data security.
* **Growing Technical Prestige:** The competition's rating on CTFtime (a global community platform) has climbed steadily over three years, reaching a weight of 66.5 in 2025, reflecting its high quality and difficulty.
* **Competitive Results:** The Korean team "The Duck" maintained dominance with a third consecutive win, while the battle for second place was decided by a dramatic last-minute solve by the Japanese team "GMO Ierae."

Participating in CTFs like LINE CTF offers an invaluable practical learning environment for security engineers to master vulnerability analysis and exploit development. Aspiring and professional researchers are encouraged to engage with these challenges to sharpen their analytical skills and contribute to a more robust, collaborative global security culture.
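The summary does not say how the recovery-code registration mentioned under Privacy-First Infrastructure was built. As a rough sketch of the general idea only, one common pattern is to show the participant a random code once and store nothing but its hash. The function names below are hypothetical and are not part of CTFd.

```python
import hashlib
import hmac
import secrets


def new_recovery_code() -> str:
    """Generate a random, URL-safe recovery code to show the user once."""
    return secrets.token_urlsafe(24)


def digest(code: str) -> str:
    """Hash the code so the server never stores it in plain text."""
    return hashlib.sha256(code.encode("utf-8")).hexdigest()


def verify(code: str, stored_digest: str) -> bool:
    """Compare in constant time to avoid leaking information through timing."""
    return hmac.compare_digest(digest(code), stored_digest)


# Registration: hand the code to the participant, keep only the digest.
code = new_recovery_code()
stored = digest(code)

# Account recovery: the participant presents the code again.
print(verify(code, stored))
```

Because the code is high-entropy random data and not a user-chosen password, a plain SHA-256 digest is a reasonable assumption here. A password-style secret would call for a slow, salted hash instead.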