네이버 / llm

6 posts

네이버 TV

Processing complex PDF documents remains a significant bottleneck for Large Language Models (LLMs) due to the intricate layouts, nested tables, and visual charts that standard text extractors often fail to capture. To address this, NAVER developed PaLADIN, an LLM-friendly PDF parser designed to transform visual document elements into structured data that models can accurately interpret. By combining specialized vision models with advanced OCR, the system enables high-fidelity document understanding for demanding tasks like analyzing financial reports.

### Challenges in Document Intelligence

* Standard PDF parsing often loses the semantic structure of the document, such as the relationship between headers and body text.
* Tables and charts pose the greatest difficulty, as numerical values and trends must be extracted without losing the spatial context that defines their meaning.
* A "one-size-fits-all" approach to text extraction results in "hallucinations" when LLMs attempt to reconstruct data from fragmented strings.

### The PaLADIN Architecture and Model Integration

* **Element Detection:** The system utilizes `Doclayout-Yolo` to identify and categorize document components like text blocks, titles, tables, and figures.
* **Table Extraction:** Visual table structures are processed through `nemoretriever-table-structure-v1`, ensuring that cell boundaries and headers are preserved.
* **Chart Interpretation:** To convert visual charts into descriptive text or data, the parser employs `google/gemma3-27b-it`, allowing the LLM to "read" visual trends.
* **Text Recognition:** For high-accuracy character recognition, particularly in multi-lingual contexts, the pipeline integrates NAVER’s `Papago OCR`.
* **Infrastructure:** The architecture leverages `nv-ingest` for optimized throughput and speed, making it suitable for large-scale document processing (see the orchestration sketch after this entry).

### Evaluation and Real-world Application

* **Performance Metrics:** NAVER established a dedicated parsing evaluation set to measure accuracy across diverse document types, focusing on speed and structural integrity.
* **AIB Securities Reports:** The parser is currently applied to summarize complex stock market reports, where precision in numerical data is critical.
* **LLM-as-a-Judge:** To ensure summary quality, the system uses an automated evaluation framework where a high-performing LLM judges the accuracy of the generated summaries against the parsed source data.

For organizations building RAG (Retrieval-Augmented Generation) systems, the transition from basic text extraction to a layout-aware parsing pipeline like PaLADIN is crucial. Future improvements focusing on table cell coordinate precision and more granular chart analysis will further reduce the error rates in automated document processing.
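The orchestration pattern described above (detect layout elements, then route each element to a specialized handler) can be sketched roughly as follows. This is a minimal illustration, not PaLADIN's actual code: `detect_layout`, `extract_table`, `describe_chart`, and `run_ocr` are hypothetical stand-ins for the DocLayout-YOLO, table-structure, chart-interpretation, and Papago OCR stages named in the post.

```python
# Minimal sketch of a layout-aware PDF parsing pipeline (hypothetical helpers,
# not PaLADIN's actual API). Each detected element is routed to a specialized
# handler so tables, charts, and text keep their structure for the LLM.
from dataclasses import dataclass

@dataclass
class Element:
    kind: str          # "text", "title", "table", "figure", "chart"
    bbox: tuple        # (x0, y0, x1, y1) on the page
    image: bytes       # cropped region handed to the vision models

def detect_layout(page_image: bytes) -> list[Element]:
    """Stand-in for a layout detector such as DocLayout-YOLO."""
    raise NotImplementedError

def extract_table(el: Element) -> dict:
    """Stand-in for a table-structure model; returns cells with row/col indices."""
    raise NotImplementedError

def describe_chart(el: Element) -> str:
    """Stand-in for a vision-language model that verbalizes chart trends."""
    raise NotImplementedError

def run_ocr(el: Element) -> str:
    """Stand-in for an OCR engine such as Papago OCR."""
    raise NotImplementedError

def parse_page(page_image: bytes) -> list[dict]:
    """Convert one page into LLM-friendly structured blocks, in reading order."""
    blocks = []
    for el in sorted(detect_layout(page_image), key=lambda e: (e.bbox[1], e.bbox[0])):
        if el.kind == "table":
            blocks.append({"type": "table", "cells": extract_table(el)})
        elif el.kind in ("figure", "chart"):
            blocks.append({"type": "chart", "description": describe_chart(el)})
        else:
            blocks.append({"type": el.kind, "text": run_ocr(el)})
    return blocks
```

Routing by element type is the key design point: tables and charts are never flattened into fragmented strings before the LLM sees them.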

네이버 TV

The development of NSona, an LLM-based multi-agent persona platform, addresses the persistent gap between user research and service implementation by transforming static data into real-time collaborative resources. By recreating user voices through a multi-party dialogue system, the project demonstrates how AI can serve as an active participant in the daily design and development process. Ultimately, the initiative highlights a fundamental shift in cross-functional collaboration, where traditional role boundaries dissolve in favor of a shared starting point centered on AI-driven user empathy.

## Bridging UX Research and Daily Collaboration

* The project was born from the realization that traditional UX research often remains isolated from the actual development cycle, leading to a loss of insight during implementation.
* NSona transforms static user research data into dynamic "persona bots" that can interact with project members in real time.
* The platform aims to turn the user voice into a "live" resource, allowing designers and developers to consult the persona during the decision-making process.

## Agent-Centric Engineering and Multi-Party UX

* The system architecture is built on an agent-centric structure designed to handle the complexities of specific user behaviors and motivations.
* It utilizes a multi-party dialogue framework, enabling a collaborative environment where multiple AI agents and human stakeholders can converse simultaneously.
* Technical implementation focused on bridging the gap between qualitative UX requirements and LLM orchestration, ensuring the persona's responses remained grounded in actual research data (see the sketch after this entry).

## Service-Specific Evaluation and Quality Metrics

* The team moved beyond generic LLM benchmarks to establish a service-specific evaluation process tailored to the project's unique UX goals.
* Model quality was measured by how vividly and accurately it recreated the intended persona, focusing on the degree of "immersion" it triggered in human users.
* Insights from these evaluations helped refine the prompt design and agent logic to ensure the AI's output provided genuine value to the product development lifecycle.

## Redefining Cross-Functional Collaboration

* The AI development process reshaped traditional roles and responsibilities (R&R): designers became prompt engineers, while researchers translated qualitative logic into agentic structures.
* Front-end developers evolved their roles to act as critical reviewers of the AI, treating the model as a subject of critique rather than a static asset.
* The workflow shifted from a linear "relay" model to a concentric one, where all team members influence the product's core from the same starting point.

To successfully integrate AI into the product lifecycle, organizations should move beyond using LLMs as simple tools and instead view them as a medium for interdisciplinary collaboration. By building multi-agent systems that reflect real user data, teams can ensure that the "user's voice" is not just a research summary, but a tangible participant in the development process.
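As a rough illustration of the grounding idea above, the sketch below shows a persona bot whose replies are constrained to the research snippets attached to it. The `Persona`, `Conversation`, and `call_llm` names are hypothetical; NSona's actual orchestration and prompts are not public.

```python
# Minimal sketch of a grounded persona bot in a multi-party conversation
# (hypothetical structures, not NSona's real implementation). The key idea:
# every persona reply is constrained to the research evidence attached to it.
from dataclasses import dataclass, field

@dataclass
class Persona:
    name: str
    profile: str                 # distilled from UX research (goals, pain points)
    evidence: list[str]          # verbatim research snippets the bot may draw on

@dataclass
class Conversation:
    turns: list[dict] = field(default_factory=list)   # {"speaker": ..., "text": ...}

def call_llm(prompt: str) -> str:
    """Stand-in for whatever chat model backs the persona agents."""
    raise NotImplementedError

def persona_reply(persona: Persona, convo: Conversation) -> str:
    history = "\n".join(f'{t["speaker"]}: {t["text"]}' for t in convo.turns)
    prompt = (
        f"You are {persona.name}, a user persona.\n"
        f"Profile: {persona.profile}\n"
        "Ground every claim in the research evidence below; if the evidence is "
        "silent on a question, say so instead of guessing.\n"
        "Evidence:\n- " + "\n- ".join(persona.evidence) + "\n\n"
        f"Conversation so far:\n{history}\n\n"
        f"Reply as {persona.name}:"
    )
    reply = call_llm(prompt)
    convo.turns.append({"speaker": persona.name, "text": reply})
    return reply
```

In a multi-party setup, several `Persona` instances (and human participants) would append turns to the same `Conversation`, which is what lets designers and developers question the "user" directly during a decision.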

FE News: bringing you the December 2025 news!

The December 2025 FE News highlights a significant shift in front-end development where the dominance of React is being cemented by LLM training cycles, even as the browser platform begins to absorb core framework functionalities. It explores the evolution of WebAssembly beyond its name and Vercel’s vision for managing distributed systems through language-level abstractions. Ultimately, the industry is moving toward a convergence of native web standards and AI-driven development paradigms that prioritize collective intelligence and simplified architectures.

### Clarifying the Identity of WebAssembly

* Wasm is frequently misunderstood as a web-only assembly language, but it functions more like a platform-agnostic bytecode similar to JVM or .NET.
* The name "WebAssembly" was originally a strategic choice for project funding rather than an accurate technical description of its capabilities or intended environment.

### The LLM Feedback Loop and React’s Dominance

* The "dead framework theory" suggests that because LLM tools like Replit and Bolt hardcode React into system prompts, the framework has reached a state of perpetual self-reinforcement.
* With over 13 million React sites deployed in the last year, new frameworks face a 12-18 month lag before they are included in LLM training data, making it nearly impossible for competitors to disrupt React's current platform status.

### Vercel and the Evolution of Programming Abstractions

* Vercel is integrating complex distributed system management directly into the development experience via directives like `Server Actions`, `use cache`, and `use workflow`.
* These features are built on serializable closures, algebraic effects, and incremental computation, moving complexity from external libraries into the native language structure.

### Native Browser APIs vs. Third-Party Frameworks

* Modern web standards, including Shadow DOM, ES Modules, and the Navigation and View Transitions APIs, are now capable of handling routing and state management natively.
* This transition allows for high-performance application development with reduced bundle sizes, as the browser platform takes over responsibilities previously exclusive to heavy frameworks.

### LLM Council: Collective AI Decision Making

* Andrej Karpathy’s LLM Council is a local web application that uses a three-stage process (independent suggestion, peer review, and final synthesis) to overcome the limitations of single AI models.
* The system uses the OpenRouter API to combine the strengths of various models, such as GPT-5.1 and Claude Sonnet 4.5, on a stack built with Python (FastAPI) and React with Vite (a minimal consensus sketch follows this entry).

Developers should focus on mastering native browser APIs as they become more capable while recognizing that React’s ecosystem remains the most robust choice for AI-integrated workflows. Additionally, exploring multi-model consensus systems like the LLM Council can provide more reliable results for complex technical decision-making than relying on a single AI provider.
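The three-stage council flow can be sketched against the OpenRouter API, which exposes an OpenAI-compatible endpoint. This is a simplified sketch rather than the actual llm-council implementation, and the model IDs below are illustrative placeholders.

```python
# Minimal sketch of an LLM-Council-style flow over OpenRouter's
# OpenAI-compatible endpoint. Model IDs are placeholders, not the exact
# ones used in the original project.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

COUNCIL = ["openai/gpt-5.1", "anthropic/claude-sonnet-4.5"]  # placeholder IDs
CHAIR = COUNCIL[0]                                           # model that writes the final answer

def ask(model: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

def council(question: str) -> str:
    # Stage 1: each model answers independently.
    drafts = [ask(m, question) for m in COUNCIL]
    bundle = "\n\n".join(f"Draft {i + 1}:\n{d}" for i, d in enumerate(drafts))
    # Stage 2: each model reviews the pooled drafts.
    reviews = [ask(m, f"Rank and critique these drafts:\n{bundle}") for m in COUNCIL]
    # Stage 3: a chair model synthesizes drafts and reviews into one answer.
    return ask(
        CHAIR,
        f"Question: {question}\n\nDrafts:\n{bundle}\n\n"
        f"Reviews:\n{chr(10).join(reviews)}\n\nWrite the final answer."
    )
```

The same pattern works with any OpenAI-compatible gateway; the consensus value comes from stages 2 and 3, where models critique answers they did not write.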

네이버 TV

Naver’s Integrated Search team is transitioning from manual fault response to an automated system using LLM Agents to manage the increasing complexity of search infrastructure. By integrating Large Language Models into the DevOps pipeline, the system evolves through accumulated experience, moving beyond simple alert monitoring to intelligent diagnostic analysis and action recommendation.

### Limitations of Traditional Fault Response

* **Complex Search Flows:** Naver’s search architecture involves multiple interdependent layers, which makes manual root cause analysis slow and prone to human error.
* **Fragmented Context:** Existing monitoring requires developers to manually synthesize logs and metrics from disparate telemetry sources, leading to high cognitive load during outages.
* **Delayed Intervention:** Human-led responses often suffer from a "detection-to-action" lag, especially during high-traffic periods or subtle service regressions.

### Architecture of DevOps Agent v1

* **Initial Design:** Focused on automating basic data gathering and providing preliminary textual reports to engineers.
* **Infrastructure Integration:** Built using a specialized software stack designed to bridge frontend (FE) and backend (BE) telemetry within the search infrastructure.
* **Standardized Logic:** The v1 agent operated on a fixed set of instructions to perform predefined diagnostic tasks when triggered by specific system alarms.

### Evolution to DevOps Agent v2

* **Overcoming V1 Limitations:** The first iteration struggled with maintaining deep context and providing diverse actionable insights, necessitating a more robust agentic structure.
* **Enhanced Memory and Learning:** V2 incorporates a more sophisticated architecture that allows the agent to reference historical failure data and learn from past incident resolutions.
* **Advanced Tool Interaction:** The system was upgraded to handle more complex tool-calling capabilities, allowing the agent to interact more deeply with internal infrastructure APIs.

### System Operations and Evaluation

* **Trigger Queue Management:** Implements a queuing system to efficiently process and prioritize multiple concurrent system alerts without overwhelming the diagnostic pipeline (a rough sketch of this queue-and-gate flow follows this entry).
* **Anomaly Detection:** Utilizes advanced detection methods to distinguish between routine traffic fluctuations and genuine service anomalies that require LLM intervention.
* **Rigorous Evaluation:** The agent’s performance is measured through a dedicated evaluation framework that assesses the accuracy of its diagnoses against known ground-truth incidents.

### Scaling and Future Challenges

* **Context Expansion:** Efforts are focused on integrating a wider range of metadata and environmental context to provide a holistic view of system health.
* **Action Recommendation:** The system is moving toward suggesting specific recovery actions, such as rollbacks or traffic rerouting, rather than just identifying the problem.
* **Sustainability:** Ensuring the DevOps Agent remains maintainable and cost-effective as the underlying search infrastructure and LLM models continue to evolve.

Organizations managing high-scale search traffic should consider LLM-based agents as integrated infrastructure components rather than standalone tools. Moving from reactive monitoring to a proactive, experience-based agent system is essential for reducing the mean time to recovery (MTTR) in complex distributed environments.
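A minimal sketch of the trigger-queue and anomaly-gate pattern described above, with hypothetical types: the real agent's tool-calling and memory layers are not shown, and `diagnose_with_llm` is a stand-in for the LLM-backed diagnosis step.

```python
# Minimal sketch of an alert-triggered diagnosis loop (hypothetical helpers;
# the DevOps Agent's internals are not public). Alerts are queued, filtered by
# a cheap anomaly check, then handed to an LLM-backed diagnoser whose reports
# accumulate as experience for future incidents.
import queue
from dataclasses import dataclass

@dataclass
class Alert:
    service: str
    metric: str
    value: float
    baseline: float

alerts: "queue.Queue[Alert]" = queue.Queue()

def is_anomalous(alert: Alert, threshold: float = 3.0) -> bool:
    """Crude gate: only escalate large relative deviations from the baseline."""
    return abs(alert.value - alert.baseline) > threshold * max(alert.baseline, 1e-9)

def diagnose_with_llm(alert: Alert, history: list[str]) -> str:
    """Stand-in for the agent: tool calls over telemetry plus past-incident memory."""
    raise NotImplementedError

def run_once(history: list[str]) -> None:
    alert = alerts.get()              # blocks until the next queued alert arrives
    if not is_anomalous(alert):
        return                        # routine fluctuation: no LLM call, no report
    report = diagnose_with_llm(alert, history)
    history.append(report)            # accumulated experience for later diagnoses
```

The queue decouples alert bursts from diagnosis throughput, and the pre-LLM gate keeps inference cost proportional to genuine anomalies rather than raw alert volume.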

[DAN25] All technical session videos have been released.

Naver recently released the full video archives from its DAN25 conference, highlighting the company’s strategic roadmap for AI agents, Sovereign AI, and digital transformation. The sessions showcase how Naver is moving beyond general AI applications to implement specialized, real-time systems that integrate large language models (LLMs) directly into core services like search, commerce, and content. By sharing these technical insights publicly, Naver demonstrates its progress in building a cohesive AI ecosystem capable of handling massive scale and complex user intent.

### Naver PersonA and LLM-Based User Memory

* The "PersonA" project focuses on building a "user memory" by treating fragmented logs across various Naver services as indirect conversations with the user.
* By leveraging LLM reasoning, the system transitions from simple data tracking to a sophisticated AI agent that offers context-aware, real-time suggestions.
* Technical hurdles addressed include the stable implementation of real-time log reflection for a massive user base and the selection of optimal LLM architectures for personalized inference.

### Trend Analysis and Search-Optimized Models

* The Place Trend Analysis system utilizes ranking algorithms to distinguish between temporary surges and sustained popularity, providing a balanced view of "hot places."
* LLMs and text mining are employed to move beyond raw data, extracting specific keywords that explain the underlying reasons for a location's trending status.
* To improve search quality, Naver developed search-specific LLMs that outperform general models by using specialized data "recipes" and integrating traditional information retrieval with features like "AI briefing" and "AuthGR" for higher reliability.

### Unified Recommendation and Real-Time CRM

* Naver Webtoon and Series replaced fragmented recommendation and CRM (Customer Relationship Management) models with a single, unified framework to ensure data consistency.
* The architecture shifted from batch-based processing to a real-time, API-based serving system to reduce management complexity and improve the immediacy of personalized user experiences.
* This transition focuses on maintaining a seamless UX by synchronizing different ML models under a unified serving logic.

### Scalable Log Pipelines and Infrastructure Stability

* The "Logiss" pipeline manages up to tens of billions of logs daily, utilizing a Storm and Kafka environment to ensure high availability and performance.
* Engineers implemented a multi-topology approach to allow for seamless, non-disruptive deployments even under heavy loads.
* Intelligent features such as "peak-shaving" (deferring peak traffic to off-peak hours), priority-based processing during failures, and efficient data sampling help balance cost, performance, and stability (a rough peak-shaving sketch follows this entry).

These sessions provide a practical blueprint for organizations aiming to scale LLM-driven services while maintaining infrastructure integrity. For developers and system architects, Naver’s transition toward unified ML frameworks and specialized, real-time data pipelines offers a proven model for moving AI from experimental phases into high-traffic production environments.
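As a rough illustration of the peak-shaving and priority-processing ideas mentioned for Logiss, the sketch below defers overflow beyond a per-tick capacity and replays it later, letting high-priority events through first. The classes are hypothetical; the real pipeline runs on Storm and Kafka and is not reproduced here.

```python
# Minimal sketch of peak-shaving with priority-based processing (hypothetical
# structures, not the Logiss implementation). Events beyond the per-tick
# capacity are deferred and replayed during quieter ticks.
from collections import deque
from dataclasses import dataclass

@dataclass
class LogEvent:
    payload: str
    priority: int            # lower number = more important

class PeakShaver:
    def __init__(self, capacity_per_tick: int):
        self.capacity = capacity_per_tick
        self.deferred: deque[LogEvent] = deque()

    def process_tick(self, incoming: list[LogEvent]) -> list[LogEvent]:
        """Return the events to process now; defer the overflow for a later tick."""
        batch = sorted(incoming + list(self.deferred), key=lambda e: e.priority)
        self.deferred.clear()
        now, overflow = batch[: self.capacity], batch[self.capacity:]
        self.deferred.extend(overflow)
        return now
```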

네이버 TV

This session from NAVER Engineering Day 2025 explores how developers can transition AI from a simple assistant into a functional project collaborator through local automation. By leveraging local Large Language Models (LLMs) and the Model Context Protocol (MCP), development teams can automate high-friction tasks such as build failure diagnostics and crash log analysis. The presentation demonstrates that integrating these tools directly into the development pipeline significantly reduces the manual overhead required for routine troubleshooting and reporting.

### Integrating LLMs with Local Environments

* Utilizing **Ollama** allows teams to run LLMs locally, ensuring data privacy and reducing latency compared to cloud-based alternatives.
* The **mcp-agent** framework, built on the Model Context Protocol (MCP), serves as the critical bridge, connecting the LLM to local file systems, tools, and project-specific data.
* This infrastructure enables the AI to act as an "agent" that can autonomously navigate the codebase rather than just processing static text prompts.

### Build Failure and Crash Monitoring Automation

* When a build fails, the AI agent automatically parses the logs to identify the root cause, providing a concise summary instead of requiring a developer to sift through thousands of lines of terminal output (see the sketch after this entry).
* For crash monitoring, the system goes beyond simple summarization by analyzing stack traces and identifying the specific developer or team responsible for the affected code segment.
* By automating the initial diagnostic phase, the time between an error occurring and a developer beginning the fix is dramatically shortened.

### Intelligent Reporting via Slack

* The system integrates with **Slack** to deliver automated, context-aware reports that categorize issues by severity and impact.
* These reports include actionable insights, such as suggested fixes or links to relevant documentation, directly within the communication channel used by the team.
* This ensures that project stakeholders remain informed of the system's health without requiring manual status updates from engineers.

### Considerations for LLM and MCP Implementation

* While powerful, the combination of LLMs and MCP agents is not a "silver bullet"; it requires careful prompt engineering and boundary setting to prevent hallucination in technical diagnostics.
* Effective automation depends on the quality of the local context provided to the agent; the more structured the logs and metadata, the more accurate the AI's conclusions.
* Organizations should evaluate the balance between the computational cost of running local models and the productivity gains achieved through automation.

To successfully implement AI-driven automation, developers should start by targeting specific, repetitive bottlenecks, such as triaging build errors, before expanding the agent's scope to more complex architectural tasks. Focusing on the integration between Ollama and mcp-agent provides a secure, extensible foundation for building a truly "smart" development workflow.
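A minimal sketch of the build-failure triage step using the `ollama` Python client and a Slack incoming webhook. The model name, log path, and webhook environment variable are placeholders, and the MCP wiring (mcp-agent connecting the model to local tools) is omitted here.

```python
# Minimal sketch: a local model summarizes a failing build log via Ollama and
# the result is posted to Slack through an incoming webhook. Model name, log
# path, and webhook URL are placeholders for whatever the team actually uses.
import os
import requests
from ollama import chat

SLACK_WEBHOOK = os.environ["SLACK_WEBHOOK_URL"]   # placeholder incoming-webhook URL
MODEL = "llama3.1"                                # any locally pulled Ollama model

def triage_build_failure(log_text: str) -> str:
    prompt = (
        "You are a build-failure triage assistant. From the log below, state the "
        "most likely root cause in 2-3 sentences and quote the key error line.\n\n"
        + log_text[-8000:]                        # keep only the tail of a long log
    )
    resp = chat(model=MODEL, messages=[{"role": "user", "content": prompt}])
    return resp["message"]["content"]

def report_to_slack(summary: str, job_name: str) -> None:
    requests.post(
        SLACK_WEBHOOK,
        json={"text": f":x: *{job_name}* failed\n{summary}"},
        timeout=10,
    )

if __name__ == "__main__":
    with open("build.log", encoding="utf-8") as f:  # placeholder log path
        summary = triage_build_failure(f.read())
    report_to_slack(summary, job_name="ci-build")
```

Keeping the model local via Ollama means the raw build logs never leave the machine, which is the privacy argument the session makes for on-device automation.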