Naver

23 posts

d2.naver.com

naver

Smart Store Center's

Smart Store Center successfully migrated its legacy platform from Oracle to MySQL to overcome performance instability caused by resource contention and to reduce high licensing costs. By implementing a "dual write" strategy, the team achieved a zero-downtime transition while retaining the ability to roll back immediately without data loss. This technical journey highlights the use of proxy data sources and transaction synchronization to ensure data integrity across disparate database environments.

## Zero-Downtime Migration via Dual Writing

* The migration strategy relied on "dual writing," where every Create, Update, and Delete (CUD) operation was performed on both the legacy Oracle database and the new MySQL database.
* In the pre-migration phase, Oracle served as the primary source for all traffic while MySQL recorded writes in the background to build a synchronized state.
* Once data was fully migrated and verified, primary traffic was shifted to MySQL, with background writes continuing to Oracle to allow an instantaneous rollback if performance issues occurred.
* This approach decoupled the database switch from application deployment, providing a safety net against critical failures that a simple redeploy could not fix.

## Technical Implementation for JPA

* To capture and replicate queries, the team used the `datasource-proxy` library, which allowed them to intercept Oracle queries and execute them against a separate MySQL DataSource.
* To prevent MySQL write failures from impacting the primary Oracle transactions, writes to the secondary database were managed with `TransactionSynchronizationManager`.
* By executing MySQL queries during the `afterCommit` phase, the team ensured that the primary service remained stable even if the secondary database encountered errors or performance bottlenecks (see the sketch at the end of this summary).
* The transition required modifying JPA entity configurations, such as changing primary key generation from Oracle sequences to MySQL's `IDENTITY` (auto-increment) and adjusting `columnDefinition` for types like `text`, `longtext`, and `decimal`.

## Centralized MyBatis Strategy

* To avoid modifying thousands of business logic points in a ten-year-old codebase, the team sought a way to implement dual writing for MyBatis at the architectural level.
* The implementation focused on the MyBatis `Configuration` and `MappedStatement` objects to capture SQL execution without requiring manual updates to individual repository interfaces.
* This centralized approach preserved the purity of the business logic and ensured that the dual-write logic could be easily removed once the migration was fully stabilized.

For organizations managing large-scale legacy migrations, the dual-write pattern combined with asynchronous transaction synchronization is a highly recommended safety mechanism. Prioritizing the isolation of secondary database failures ensures that the user experience remains unaffected while technical validation is performed in real time.
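The following is a minimal sketch of the `afterCommit` pattern described above, built on Spring's `TransactionSynchronizationManager` and `JdbcTemplate`. The class and bean names (`SecondaryWriteReplicator`, `mysqlJdbcTemplate`) and the plain-SQL replay are illustrative assumptions; the article's actual implementation captures queries through `datasource-proxy` listeners and also covers the JPA and MyBatis paths.

```java
// Hypothetical sketch: replay a captured CUD statement against the secondary (MySQL)
// DataSource only after the primary (Oracle) transaction has committed.
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Component;
import org.springframework.transaction.support.TransactionSynchronization;
import org.springframework.transaction.support.TransactionSynchronizationManager;

@Component
public class SecondaryWriteReplicator {

    private final JdbcTemplate mysqlJdbcTemplate; // wired to the secondary MySQL DataSource

    public SecondaryWriteReplicator(JdbcTemplate mysqlJdbcTemplate) {
        this.mysqlJdbcTemplate = mysqlJdbcTemplate;
    }

    /**
     * Called by a query interceptor (e.g. a datasource-proxy listener) with the CUD SQL
     * that just executed against the primary database.
     */
    public void replicateAfterCommit(String sql, Object... params) {
        if (!TransactionSynchronizationManager.isSynchronizationActive()) {
            return; // no surrounding transaction to hook into
        }
        TransactionSynchronizationManager.registerSynchronization(new TransactionSynchronization() {
            @Override
            public void afterCommit() {
                try {
                    // Runs only once the Oracle commit has succeeded, so a MySQL failure
                    // can never roll back or delay the user-facing write.
                    mysqlJdbcTemplate.update(sql, params);
                } catch (RuntimeException e) {
                    // Swallow and record for later reconciliation; secondary-write
                    // failures must stay invisible to the caller.
                }
            }
        });
    }
}
```

Because the secondary write is deferred to `afterCommit`, the worst case of a MySQL outage is a temporarily stale secondary copy, which matches the rollback-safety goal described in the article.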

naver

Analysis of Naver Integrated

The integration of AI Briefing (AIB) into Naver Search has led to a noticeable increase in Largest Contentful Paint (LCP) values, with p95 metrics rising to approximately 3.1 seconds. This shift is primarily driven by the architectural mismatch between traditional performance metrics and the dynamic, streaming nature of AI chat interfaces. The analysis concludes that while AIB appears to degrade performance on paper, the delay is largely a result of how browsers measure rendering in incremental UI patterns.

### Impact of AIB on Search Performance

* Since the introduction of AIB's chat-based UI in July 2025, LCP p95 has moved beyond the 2.5-second target, showing a direct correlation with AIB traffic volume.
* The performance degradation is characterized by a "tail" effect, where a higher percentage of users fall into slower LCP buckets despite stable server response times.
* Unlike Google's AI Overview, which renders in larger blocks, Naver's AIB uses word-by-word animations and frequent UI updates that place a heavier burden on the browser's rendering engine.

### Client-Side Rendering Bottlenecks

* Performance profiling indicates that the delay is localized to the client-side rendering phase rather than the network or server.
* Initial rendering includes a skeleton UI period of roughly 900 ms, followed by sequential text animations that push the final paint time back.
* Comparative data shows that when AIB is the LCP candidate, the p75 value reaches 4.5 seconds, significantly slower than other heavy components like map modules.

### Structural Misalignment with LCP Measurement

* **DOM Reconstruction:** After text animations finish, AIB rebuilds the DOM to enable citation highlighting and hover interactions, which triggers Chromium to update the LCP timestamp to this much later point.
* **Candidate Fragmentation:** Streaming text at the word level prevents the browser from identifying a single large text block; instead, small, insignificant fragments are often incorrectly selected as the LCP candidate.
* **Paint Invalidation:** Chromium's rendering pipeline treats every new word in a streaming response as a layer update, causing repeated paint invalidations that push the `renderTime` forward frame by frame until the entire message is complete.

### New Metrics for AI-Driven Interfaces

* To reflect user experience more accurately, Naver is shifting toward Time to First Token (TTFT) as the primary metric for AIB, focusing on how quickly the first meaningful response appears.
* Standard LCP remains a valid quality indicator for static search results, but it is no longer treated as a universal benchmark for interactive AI components.
* Future performance management will involve more granular distribution analysis and "predictive" performance modeling rather than simply optimizing for a single threshold like the 2.5-second LCP mark.

To effectively manage performance in the era of generative AI, organizations should move away from relying solely on LCP for streaming interfaces. Implementing TTFT as a complementary metric provides a better representation of perceived speed, while optimizing the timing of DOM reconstructions can prevent unnecessary measurement delays in Chromium-based browsers.

naver

FE News: January 2026

The January 2026 FE News highlights a significant shift toward client-side intelligence and deeper architectural transparency in modern web development. By exploring advanced visualization tools for React Server Components and the integration of AI within design systems and on-device environments, the industry is moving toward more automated and efficient frontend workflows. This collection underscores how foundational technologies like WebGPU and standardized design tokens are becoming essential for building the next generation of AI-driven user experiences.

### Visualizing React Server Components

* Dan Abramov's RSC Explorer allows developers to step through and decompose the RSC protocol stream directly within the browser.
* The tool features four specialized panels (Server, Client, Flight, and Preview) to visualize the complete data flow and protocol structure.
* It utilizes React's native reader/writer to ensure the output matches actual protocol behavior, making it an ideal resource for debugging streaming (Suspense), Client References, Server Actions, and Router refreshes.

### The Rise of Client-Side AI and Agents

* The Web AI Summit 2025 highlights a transition from server-dependent AI to local, browser-based execution using Transformers.js for 100% local ML model processing.
* New frameworks like webMCP allow developers to define site functions as tools that can be consumed by browser-based AI agents, fostering a more interactive agent-based UX.
* Technical advancements in Wasm, WebGPU, and WebNN are facilitating high-performance on-device inference, enabling developers to build complex AI features without heavy reliance on backend APIs.

### AI Research and Development Milestones

* Google's Jeff Dean provides insights into AI trends that influence not just individual features, but the underlying system architecture and data workflows of modern products.
* "The Thinking Game," a documentary covering five years of DeepMind's history, chronicles the team's pursuit of Artificial General Intelligence (AGI) and the development of AlphaFold.
* These resources suggest that frontend developers should view AI as a structural change to product design rather than a simple functional add-on.

### Automating Markup with Design Systems

* Naver Financial has shared practical results of using Figma Code Connect and specific AI instructions to automate component-based markup generation.
* The experiment proved that training AI on standardized design tokens and component structures allows for the generation of frontend code that is ready for immediate development.
* However, complex layouts and responsive design still require human intervention, reinforcing the idea that the efficiency of AI automation is directly tied to the quality of design system documentation and standardization.

Frontend developers should prioritize mastering client-side AI technologies and visualization tools to stay ahead of architectural shifts. As AI becomes more integrated into the development lifecycle, maintaining highly standardized design systems and understanding internal framework protocols like RSC will be the primary drivers of professional productivity.

naver

Implementing an Intelligent Log Pipeline Focused on Cost

Naver's Logiss platform, responsible for processing tens of billions of daily logs, evolved its architecture to overcome systemic inefficiencies in resource utilization and deployment stability. By transitioning from a rigid, single-topology structure to an intelligent, multi-topology pipeline, the team achieved zero-downtime deployments and optimized infrastructure costs. These enhancements ensure that critical business data is prioritized during traffic surges while minimizing redundant storage for search-optimized indices.

### Limitations of the Legacy Pipeline

* **Deployment Disruptions:** The previous single-topology setup in Apache Storm lacked a "swap" feature, requiring a total shutdown for updates and causing 3–8 minute processing lags during every deployment.
* **Resource Inefficiency:** Infrastructure was provisioned for daytime peak loads, which are five times higher than nighttime traffic, resulting in significant underutilization during off-peak hours.
* **Indiscriminate Processing:** During traffic spikes or hardware failures, the system treated all logs equally, causing critical service logs to be delayed alongside low-priority telemetry.
* **Storage Redundancy:** Data was stored at 100% volume in both real-time search (OpenSearch) and long-term storage (Landing Zones), even when sampled data would have sufficed for search purposes.

### Transitioning to Multi-Topology and Subscribe Mode

* **Custom Storm Client:** The team modified `storm-kafka-client` 2.3.0 to revert from the default `assign` mode back to `subscribe` mode for Kafka partition management.
* **Partition Rebalancing:** While `assign` mode is standard in Storm 2.x, it prevents multiple topologies from sharing a consumer group without duplication; the custom `subscribe` implementation lets Kafka manage rebalancing across multiple topologies (the Kafka-level difference between the two modes is sketched after this summary).
* **Zero-Downtime Deployments:** This architectural shift enables rolling updates and canary deployments by allowing new topologies to join the consumer group and take over partitions without stopping the entire pipeline.

### Intelligent Traffic Steering and Sampling

* **Dynamic Throughput Control:** The "Traffic-Controller" Storm topology monitors downstream load and diverts excess non-critical traffic to a secondary "retry" path, protecting the stability of the main pipeline.
* **Tiered Log Prioritization:** The system identifies critical business logs to ensure they bypass bottlenecks, while less urgent logs are queued for post-processing during traffic surges.
* **Storage Optimization via Sampling:** Logiss now supports per-destination sampling rates, allowing the system to send 100% of data to long-term Landing Zones while indexing only a representative sample in OpenSearch, significantly reducing indexing overhead and storage costs.

### Results and Recommendations

The implementation of an intelligent log pipeline demonstrates that modifying core open-source components, such as the Storm-Kafka client, can be a viable path to achieving specific architectural goals like zero-downtime deployment. For high-volume platforms, moving away from a "one-size-fits-all" processing model toward a priority-aware and sampling-capable pipeline is essential for balancing operational costs with system reliability. Organizations should evaluate whether their real-time search requirements truly necessitate 100% data ingestion or whether sampling can provide the necessary insights at a fraction of the cost.
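As a reference for the `assign` vs. `subscribe` distinction above, here is a plain `kafka-clients` sketch (not the actual `storm-kafka-client` patch) showing why group-managed `subscribe` mode lets multiple consumers share partitions without duplication, while `assign` pins partitions to a single instance. The broker address, group id, and topic name are placeholders.

```java
// Illustrative Kafka consumer using subscribe(): partition ownership is negotiated by the
// broker-side group coordinator, so another instance joining the same group triggers a
// rebalance instead of duplicating the data.
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class SubscribeModeConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "logiss-pipeline");          // shared consumer group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // subscribe(): Kafka assigns and rebalances partitions across all members of
            // the group, which is what allows a new topology version to take over work.
            consumer.subscribe(List.of("raw-logs"));

            // By contrast, assign() (the storm-kafka-client 2.x default) pins explicit
            // TopicPartitions to this instance, so two topologies reading the same topic
            // in parallel would each consume every partition and duplicate the stream:
            // consumer.assign(List.of(new TopicPartition("raw-logs", 0)));

            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d%n", record.partition(), record.offset());
                }
            }
        }
    }
}
```

When a second instance with the same `group.id` starts (a new topology in the canary scenario), the group coordinator redistributes partitions between them automatically, which is what makes a rolling hand-over possible.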

naver

[Internship] Introducing

The 2026 NAVER AI CHALLENGE offers undergraduate and graduate students a four-week intensive internship to tackle real-world AI challenges alongside Naver's senior engineers. This program focuses on the full development lifecycle, allowing participants to experience Naver's unique collaborative culture while moving from initial idea design to technical verification. By integrating interns directly into professional workflows, the challenge aims to foster the next generation of AI talent through hands-on industrial problem-solving.

### Application Timeline and Eligibility

* The internship is open to all students currently enrolled in bachelor's or master's programs, regardless of their major or year of study.
* Applications are accepted from December 10 until December 16 at 11:00 AM.
* The selection process consists of document screening in late December, followed by technical interviews in early January, with final announcements scheduled for mid-January.
* The official internship period runs for four weeks, from January 19 to February 13, 2026.

### AI Project Specializations

* **Data Pipeline and Lineage:** One track focuses on AI-driven data pipeline log analysis to automate Data Asset mapping and construct comprehensive End-to-End Data Lineage systems.
* **VLM-based Evaluation Systems:** The second track involves developing automated systems using Vision Language Models (VLM) to evaluate search and recommendation quality from a user-experience perspective.
* **Technical Mentorship:** Participants work directly with Naver engineers to determine technical directions and validate their AI models within a professional production context.

### Support and Work Environment

* Naver provides a dedicated workspace and the latest OA equipment to ensure a seamless development experience for all interns.
* Participants receive project activity funds to support their research and development efforts during the four-week duration.
* The program emphasizes networking and professional growth by providing direct access to the infrastructure and expertise used by one of Korea's leading tech companies.

For students looking to transition academic AI knowledge into industrial-scale applications, this internship provides a high-impact environment to build professional-grade systems. Interested candidates should finalize their project selection between data pipeline automation and VLM evaluation before the December 16 deadline to secure a spot in the selection process.

naver

When Design Systems Meet AI: Changes in

The integration of AI into the frontend development workflow is transforming how markup is generated, shifting the developer's role from manual coding to system orchestration. By leveraging Naver Financial's robust design system, composed of standardized design tokens and components, developers can use AI to automate the translation of Figma designs into functional code. This evolution suggests a future where the efficiency of UI implementation is dictated by the maturity of the underlying design system and the precision of AI instructions.

### Foundations of the Naver Financial Design System

* The system is built on "design tokens," which serve as the smallest units of design, such as colors, typography, and spacing, ensuring consistency across all platforms.
* Pre-defined components act as the primary building blocks for the UI, allowing the AI to reference established patterns rather than generating arbitrary styles.
* The philosophy of "knowing your system" is emphasized as a prerequisite; AI effectiveness is directly proportional to how well-structured the design assets and code libraries are.

### Automating Markup with Code Connect and AI

* Figma's "Code Connect" is used to bridge the gap between design files and the actual codebase, providing a source of truth for how components should be implemented.
* Specific "Instructions" (prompts) are developed to guide the AI in mapping Figma properties to specific React component props and design system logic.
* This approach enables the transition from "drawing" UI to "declaring" it, where the AI interprets the design intent and outputs code that adheres to the organization's technical standards.

### Challenges and Limitations in Real-World Development

* While AI-generated markup provides a strong starting point, it often requires manual intervention for complex business logic, state management, and edge-case handling.
* Maintaining the "Instruction" set requires ongoing effort to keep the AI in sync with the latest changes in the component library.
* Developers must transition into a "reviewer" role, as the AI can still struggle with the specific context of a feature or integration with legacy code structures.

The path to fully automated frontend development requires a highly mature design system as its backbone. For teams looking to adopt this paradigm, the priority should be standardizing design tokens and component interfaces; only then can AI effectively reduce the "last mile" of markup work and allow developers to focus on higher-level architectural challenges.

naver

I'm an LL

Processing complex PDF documents remains a significant bottleneck for Large Language Models (LLMs) due to the intricate layouts, nested tables, and visual charts that standard text extractors often fail to capture. To address this, NAVER developed PaLADIN, an LLM-friendly PDF parser designed to transform visual document elements into structured data that models can accurately interpret. By combining specialized vision models with advanced OCR, the system enables high-fidelity document understanding for demanding tasks like analyzing financial reports.

### Challenges in Document Intelligence

* Standard PDF parsing often loses the semantic structure of the document, such as the relationship between headers and body text.
* Tables and charts pose the greatest difficulty, as numerical values and trends must be extracted without losing the spatial context that defines their meaning.
* A "one-size-fits-all" approach to text extraction results in hallucinations when LLMs attempt to reconstruct data from fragmented strings.

### The PaLADIN Architecture and Model Integration

* **Element Detection:** The system utilizes `Doclayout-Yolo` to identify and categorize document components like text blocks, titles, tables, and figures.
* **Table Extraction:** Visual table structures are processed through `nemoretriever-table-structure-v1`, ensuring that cell boundaries and headers are preserved.
* **Chart Interpretation:** To convert visual charts into descriptive text or data, the parser employs `google/gemma3-27b-it`, allowing the LLM to "read" visual trends.
* **Text Recognition:** For high-accuracy character recognition, particularly in multilingual contexts, the pipeline integrates NAVER's `Papago OCR`.
* **Infrastructure:** The architecture leverages `nv-ingest` for optimized throughput and speed, making it suitable for large-scale document processing.

### Evaluation and Real-World Application

* **Performance Metrics:** NAVER established a dedicated parsing evaluation set to measure accuracy across diverse document types, focusing on speed and structural integrity.
* **AIB Securities Reports:** The parser is currently applied to summarize complex stock market reports, where precision in numerical data is critical.
* **LLM-as-a-Judge:** To ensure summary quality, the system uses an automated evaluation framework in which a high-performing LLM judges the accuracy of the generated summaries against the parsed source data.

For organizations building RAG (Retrieval-Augmented Generation) systems, the transition from basic text extraction to a layout-aware parsing pipeline like PaLADIN is crucial. Future improvements focusing on table cell coordinate precision and more granular chart analysis will further reduce error rates in automated document processing.

naver

VLOps: Event-driven MLOps & Omni-Evaluator

Naver's VLOps framework introduces an event-driven approach to MLOps, designed to overcome the rigidity of traditional pipeline-based systems like Kubeflow. By shifting from a monolithic pipeline structure to a system governed by autonomous sensors and typed messages, Naver has achieved a highly decoupled and scalable environment for multimodal AI development. This architecture allows for seamless functional expansion and cross-cloud compatibility, ultimately simplifying the transition from model training to large-scale evaluation and deployment.

### Event-Driven MLOps Architecture

* Operations such as training, evaluation, and deployment are defined as "typed messages," which serve as the primary units of communication within the system.
* An "event sensor" acts as the core logic hub, autonomously detecting these messages and triggering the corresponding tasks without requiring a predefined, end-to-end pipeline (a toy illustration of this pattern follows after this summary).
* The system eliminates the need for complex version management of entire pipelines, as new features can be integrated simply by adding new message types.
* This approach ensures loose coupling between evaluation and deployment systems, facilitating easier maintenance and infrastructure flexibility.

### Omni-Evaluator and Unified Benchmarking

* The Omni-Evaluator serves as a centralized platform that integrates various evaluation engines and benchmarks into a single workflow.
* It supports real-time monitoring of model performance, allowing researchers to track progress during the training and validation phases.
* The system is designed specifically to handle the complexities of multimodal LLMs, providing a standardized environment for diverse testing scenarios.
* User-driven triggers are supported, enabling developers to initiate specific evaluation cycles manually when necessary.

### VLOps Dashboard and User Experience

* The VLOps Dashboard acts as a central hub where users can manage the entire ML lifecycle without needing deep knowledge of the underlying orchestration logic.
* Users can trigger complex pipelines simply by issuing a message, abstracting away the technical difficulties of cloud infrastructure.
* The dashboard provides a visual interface for monitoring events, message flows, and evaluation results, improving overall transparency for data scientists and researchers.

For organizations managing large-scale multimodal models, moving toward an event-driven architecture is highly recommended. This model reduces the overhead of maintaining rigid pipelines and allows engineering teams to focus on model quality rather than infrastructure orchestration.
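The sketch below is a toy, purely hypothetical illustration of the typed-message/event-sensor idea described above (Java 21 syntax); none of the type names come from VLOps itself. The point is that adding a new operation means adding a new message type and one handler arm, not re-versioning an end-to-end pipeline.

```java
// Hypothetical sketch of "typed messages + event sensor": each operation is a
// self-describing message, and the sensor dispatches on its type.
public class EventSensorSketch {

    sealed interface Message permits TrainRequested, EvaluationRequested, DeployRequested {}
    record TrainRequested(String modelId, String datasetUri) implements Message {}
    record EvaluationRequested(String modelId, String benchmark) implements Message {}
    record DeployRequested(String modelId, String endpoint) implements Message {}

    // The "event sensor": reacts to whichever message arrives and triggers the matching task.
    static void onMessage(Message message) {
        switch (message) {
            case TrainRequested m      -> System.out.println("start training " + m.modelId() + " on " + m.datasetUri());
            case EvaluationRequested m -> System.out.println("run benchmark " + m.benchmark() + " for " + m.modelId());
            case DeployRequested m     -> System.out.println("deploy " + m.modelId() + " to " + m.endpoint());
        }
    }

    public static void main(String[] args) {
        // Adding a new capability means adding a new record type and one switch arm,
        // not maintaining versions of a monolithic pipeline definition.
        onMessage(new TrainRequested("example-mm-001", "s3://datasets/multimodal-v3"));
        onMessage(new EvaluationRequested("example-mm-001", "vqa-benchmark"));
    }
}
```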

naver

Recreating the User's

The development of NSona, an LLM-based multi-agent persona platform, addresses the persistent gap between user research and service implementation by transforming static data into real-time collaborative resources. By recreating user voices through a multi-party dialogue system, the project demonstrates how AI can serve as an active participant in the daily design and development process. Ultimately, the initiative highlights a fundamental shift in cross-functional collaboration, where traditional role boundaries dissolve in favor of a shared starting point centered on AI-driven user empathy.

## Bridging UX Research and Daily Collaboration

* The project was born from the realization that traditional UX research often remains isolated from the actual development cycle, leading to a loss of insight during implementation.
* NSona transforms static user research data into dynamic "persona bots" that can interact with project members in real time.
* The platform aims to turn the user voice into a "live" resource, allowing designers and developers to consult the persona during the decision-making process.

## Agent-Centric Engineering and Multi-Party UX

* The system architecture is built on an agent-centric structure designed to handle the complexities of specific user behaviors and motivations.
* It utilizes a multi-party dialogue framework, enabling a collaborative environment where multiple AI agents and human stakeholders can converse simultaneously.
* Technical implementation focused on bridging the gap between qualitative UX requirements and LLM orchestration, ensuring the persona's responses remained grounded in actual research data.

## Service-Specific Evaluation and Quality Metrics

* The team moved beyond generic LLM benchmarks to establish a service-specific evaluation process tailored to the project's unique UX goals.
* Model quality was measured by how vividly and accurately it recreated the intended persona, focusing on the degree of "immersion" it triggered in human users.
* Insights from these evaluations helped refine the prompt design and agent logic to ensure the AI's output provided genuine value to the product development lifecycle.

## Redefining Cross-Functional Collaboration

* The AI development process reshaped traditional roles and responsibilities (R&R): designers became prompt engineers, while researchers translated qualitative logic into agentic structures.
* Front-end developers evolved into critical reviewers of the AI, treating the model as a subject of critique rather than a static asset.
* The workflow shifted from a linear "relay" model to a concentric one, where all team members influence the product's core from the same starting point.

To successfully integrate AI into the product lifecycle, organizations should move beyond using LLMs as simple tools and instead view them as a medium for interdisciplinary collaboration. By building multi-agent systems that reflect real user data, teams can ensure that the "user's voice" is not just a research summary but a tangible participant in the development process.

naver

FE News - December 2025

The December 2025 FE News highlights a significant shift in front-end development where the dominance of React is being cemented by LLM training cycles, even as the browser platform begins to absorb core framework functionality. It explores the evolution of WebAssembly beyond its name and Vercel's vision for managing distributed systems through language-level abstractions. Ultimately, the industry is moving toward a convergence of native web standards and AI-driven development paradigms that prioritize collective intelligence and simplified architectures.

### Clarifying the Identity of WebAssembly

* Wasm is frequently misunderstood as a web-only assembly language, but it functions more like a platform-agnostic bytecode similar to the JVM or .NET.
* The name "WebAssembly" was originally a strategic choice for project funding rather than an accurate technical description of its capabilities or intended environment.

### The LLM Feedback Loop and React's Dominance

* The "dead framework theory" suggests that because LLM tools like Replit and Bolt hardcode React into system prompts, the framework has reached a state of perpetual self-reinforcement.
* With over 13 million React sites deployed in the last year, new frameworks face a 12–18 month lag before being included in LLM training data, making it nearly impossible for competitors to disrupt React's current platform status.

### Vercel and the Evolution of Programming Abstractions

* Vercel is integrating complex distributed system management directly into the development experience via directives like `Server Actions`, `use cache`, and `use workflow`.
* These features are built on serializable closures, algebraic effects, and incremental computation, moving complexity from external libraries into the native language structure.

### Native Browser APIs vs. Third-Party Frameworks

* Modern web standards, including Shadow DOM, ES Modules, and the Navigation and View Transitions APIs, are now capable of handling routing and state management natively.
* This transition allows for high-performance application development with reduced bundle sizes, as the browser platform takes over responsibilities previously exclusive to heavy frameworks.

### LLM Council: Collective AI Decision Making

* Andrej Karpathy's LLM Council is a local web application that uses a three-stage process (independent suggestion, peer review, and final synthesis) to overcome the limitations of single AI models.
* The system uses the OpenRouter API to combine the strengths of various models, such as GPT-5.1 and Claude Sonnet 4.5, on a stack built with Python (FastAPI) and React with Vite.

Developers should focus on mastering native browser APIs as they become more capable, while recognizing that React's ecosystem remains the most robust choice for AI-integrated workflows. Additionally, exploring multi-model consensus systems like the LLM Council can provide more reliable results for complex technical decision-making than relying on a single AI provider.

naver

Research for the Protection of the Web

Naver Webtoon is proactively developing technical solutions to safeguard its digital creation ecosystem against evolving threats such as illegal distribution and unauthorized generative AI training. By integrating advanced AI-based watermarking and protective perturbation technologies, the platform successfully tracks content leaks and disrupts unauthorized model fine-tuning. These efforts ensure a sustainable environment in which creators can maintain the integrity and economic value of their intellectual property.

## Challenges in the Digital Creation Ecosystem

- **Illegal Content Leakage**: Unauthorized reproduction and distribution of digital content infringe on creator earnings and damage the platform's business model.
- **Unauthorized Generative AI Training**: The rise of fine-tuning techniques (e.g., LoRA, Dreambooth) allows for unauthorized mimicry of an artist's unique style, distorting the value of original works.
- **Harmful UGC Uploads**: The presence of violent or suggestive user-generated content increases operational costs and degrades the service experience for readers.

## AI-Based Watermarking for Post-Tracking

- To facilitate tracking in DRM-free environments, Naver Webtoon developed an AI-based watermarking system that embeds invisible signals into the pixels of digital images.
- The system is designed around three conflicting requirements: **Invisibility** (the signal remains hidden), **Robustness** (the signal survives attacks like cropping or compression), and **Capacity** (sufficient data for tracking).
- The technical pipeline involves three neural modules: an **Embedder** to insert the signal, a differentiable **Attack Layer** to simulate real-world distortions, and an **Extractor** to recover the signal.
- Performance metrics show a high Peak Signal-to-Noise Ratio (PSNR) of over 46 dB, and the system maintains a signal error rate of less than 1% even when subjected to intense signal processing or geometric editing.

## IMPASTO: Disrupting Unauthorized AI Training

- This technology uses **protective perturbation**, which adds microscopic changes to images that are invisible to humans but confuse generative AI models during the training phase.
- It targets the way diffusion models (like Stable Diffusion) learn, either by manipulating latent representations or by disrupting the denoising process, preventing the AI from accurately mimicking an artist's style.
- The research prioritizes overcoming the visual artifacts and slow processing speeds found in existing academic tools like Glaze and PhotoGuard.
- With these perturbations in place, any attempt to fine-tune a model on protected work results in distorted or unintended outputs, effectively shielding the artist's original style.

## Integrated Protection Frameworks

- **TOONRADAR**: A comprehensive system deployed since 2017 that uses watermarking for both proactive blocking and retrospective tracking of illegal distributors.
- **XPIDER**: An automated detection tool tailored to the comic domain that identifies and blocks harmful UGC, reducing manual inspection overhead.
- These solutions are being expanded not just for copyright protection, but to establish long-term trust and reliability in the era of AI-generated content.

The deployment of these AI-driven defense mechanisms is essential for maintaining a fair creative economy. By balancing visual quality with robust protection, platforms can empower creators to share their work globally without the constant fear of digital theft or stylistic mimicry.

naver

Iceberg Low-Latency Queries with Materialized Views

This technical session from NAVER ENGINEERING DAY 2025 explores the architectural journey of building a low-latency query system for real-time transaction reports. The project focuses on resolving the tension between high data freshness, massive scalability, and rapid response times for complex, multi-dimensional filtering. By leveraging Apache Iceberg in conjunction with StarRocks' materialized views, the team established a performant data pipeline that meets the demands of modern business intelligence.

### Challenges in Real-Time Transaction Reporting

* **Query Latency vs. Data Freshness:** Traditional architectures often struggle to provide immediate visibility into transaction data while maintaining sub-second query speeds across diverse filter conditions.
* **High-Dimensional Filtering:** Users require the ability to query reports based on numerous variables, necessitating an engine that can handle complex aggregations without pre-defining every possible index.
* **Scalability Requirements:** The system must handle increasing transaction volumes without degrading performance or requiring significant manual intervention in the underlying storage layer.

### Optimized Architecture with Iceberg and StarRocks

* **Apache Iceberg Integration:** Iceberg serves as the open table format, providing a reliable foundation for managing large-scale data snapshots and ensuring consistency during concurrent reads and writes.
* **StarRocks for Query Acceleration:** The team selected StarRocks as the primary OLAP engine to take advantage of its high-speed vectorized execution and native support for Iceberg tables.
* **Spark-Based Processing:** Apache Spark is used for the initial data ingestion and transformation phases, preparing the transaction data for efficient storage and downstream consumption.

### Enhancing Performance via Materialized Views

* **Pre-computed Aggregations:** By implementing materialized views, the system pre-calculates intensive transaction summaries, significantly reducing the computational load during active user queries (a rough, version-dependent DDL sketch follows after this summary).
* **Automatic Query Rewrite:** The architecture relies on StarRocks' ability to automatically route queries to the most efficient materialized view, ensuring that even ad-hoc reports benefit from pre-computed results.
* **Balanced Refresh Strategies:** The work focused on optimizing the refresh intervals of these views to maintain high freshness while minimizing overhead on cluster resources.

The adoption of a modern lakehouse architecture combining Apache Iceberg with a high-performance OLAP engine like StarRocks is a recommended strategy for organizations dealing with high-volume, real-time reporting. This approach effectively decouples storage and compute while providing the low-latency response times necessary for interactive data analysis.
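Below is a rough sketch of what the pre-computed-aggregation idea can look like against a StarRocks frontend, issued over its MySQL-compatible protocol via JDBC. All names (host, catalog, database, table, and view) are placeholders, and the exact asynchronous materialized view DDL varies by StarRocks version, so treat the SQL as an approximation rather than the project's actual schema.

```java
// Hypothetical sketch: create an asynchronously refreshed materialized view over an
// Iceberg-backed table so that dashboard queries can be served from pre-aggregated data.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class MaterializedViewSetup {
    public static void main(String[] args) throws Exception {
        // StarRocks frontends speak the MySQL wire protocol, so a standard MySQL JDBC URL works.
        String url = "jdbc:mysql://starrocks-fe.example.internal:9030/analytics"; // placeholder host
        try (Connection conn = DriverManager.getConnection(url, "report_user", "change-me");
             Statement stmt = conn.createStatement()) {

            // Approximate async-MV DDL: aggregate per-merchant daily totals from the
            // Iceberg table and refresh the view every 10 minutes.
            stmt.execute("""
                CREATE MATERIALIZED VIEW IF NOT EXISTS daily_txn_summary
                REFRESH ASYNC EVERY (INTERVAL 10 MINUTE)
                AS
                SELECT txn_date, merchant_id,
                       SUM(amount) AS total_amount,
                       COUNT(*)    AS txn_count
                FROM iceberg_catalog.payments.transactions
                GROUP BY txn_date, merchant_id
                """);

            // Report queries can keep targeting the base table; StarRocks' query rewrite
            // can transparently route eligible aggregations to the materialized view.
        }
    }
}
```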

naver

Naver Integrated Search LLM DevOps

Naver's Integrated Search team is transitioning from manual fault response to an automated system that uses LLM agents to manage the increasing complexity of search infrastructure. By integrating large language models into the DevOps pipeline, the system evolves through accumulated experience, moving beyond simple alert monitoring to intelligent diagnostic analysis and action recommendation.

### Limitations of Traditional Fault Response

* **Complex Search Flows:** Naver's search architecture involves multiple interdependent layers, which makes manual root cause analysis slow and prone to human error.
* **Fragmented Context:** Existing monitoring requires developers to manually synthesize logs and metrics from disparate telemetry sources, leading to high cognitive load during outages.
* **Delayed Intervention:** Human-led responses often suffer from a "detection-to-action" lag, especially during high-traffic periods or subtle service regressions.

### Architecture of DevOps Agent v1

* **Initial Design:** Focused on automating basic data gathering and providing preliminary textual reports to engineers.
* **Infrastructure Integration:** Built on a specialized software stack designed to bridge frontend (FE) and backend (BE) telemetry within the search infrastructure.
* **Standardized Logic:** The v1 agent operated on a fixed set of instructions, performing predefined diagnostic tasks when triggered by specific system alarms.

### Evolution to DevOps Agent v2

* **Overcoming v1 Limitations:** The first iteration struggled to maintain deep context and provide diverse actionable insights, necessitating a more robust agentic structure.
* **Enhanced Memory and Learning:** v2 incorporates a more sophisticated architecture that allows the agent to reference historical failure data and learn from past incident resolutions.
* **Advanced Tool Interaction:** The system was upgraded with more complex tool-calling capabilities, allowing the agent to interact more deeply with internal infrastructure APIs.

### System Operations and Evaluation

* **Trigger Queue Management:** Implements a queuing system to efficiently process and prioritize multiple concurrent system alerts without overwhelming the diagnostic pipeline (a toy prioritization sketch follows after this summary).
* **Anomaly Detection:** Uses advanced detection methods to distinguish between routine traffic fluctuations and genuine service anomalies that require LLM intervention.
* **Rigorous Evaluation:** The agent's performance is measured with a dedicated evaluation framework that assesses the accuracy of its diagnoses against known ground-truth incidents.

### Scaling and Future Challenges

* **Context Expansion:** Efforts are focused on integrating a wider range of metadata and environmental context to provide a holistic view of system health.
* **Action Recommendation:** The system is moving toward suggesting specific recovery actions, such as rollbacks or traffic rerouting, rather than just identifying the problem.
* **Sustainability:** Ensuring the DevOps agent remains maintainable and cost-effective as the underlying search infrastructure and LLM models continue to evolve.

Organizations managing high-scale search traffic should consider LLM-based agents as integrated infrastructure components rather than standalone tools. Moving from reactive monitoring to a proactive, experience-based agent system is essential for reducing mean time to recovery (MTTR) in complex distributed environments.
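As a purely illustrative companion to the trigger-queue point above (none of this reflects the agent's real code), the sketch below shows the basic shape of buffering concurrent alerts and always diagnosing the most urgent one first.

```java
// Hypothetical sketch of a trigger queue: concurrent alerts are buffered and prioritized
// before being handed to the diagnostic stage, so a burst of routine alarms cannot
// starve a critical one. Service names and priorities are made up for illustration.
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

public class TriggerQueueSketch {

    record Alert(String service, String message, int priority) {}

    public static void main(String[] args) throws InterruptedException {
        // Lower number = more urgent; the queue always yields the most urgent alert first.
        PriorityBlockingQueue<Alert> triggerQueue =
                new PriorityBlockingQueue<>(16, Comparator.comparingInt(Alert::priority));

        triggerQueue.put(new Alert("search-frontend", "latency p99 above threshold", 2));
        triggerQueue.put(new Alert("ranking-backend", "error rate spike on /search", 1));
        triggerQueue.put(new Alert("telemetry", "disk usage at 70%", 3));

        while (!triggerQueue.isEmpty()) {
            Alert next = triggerQueue.take();
            // In the real system, this is where the agent would gather logs and metrics
            // and run its LLM-based diagnosis for the selected alert.
            System.out.println("diagnosing: " + next.service() + " - " + next.message());
        }
    }
}
```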

naver

[DAN25]

Naver recently released the full video archives from its DAN25 conference, highlighting the company's strategic roadmap for AI agents, Sovereign AI, and digital transformation. The sessions showcase how Naver is moving beyond general AI applications to implement specialized, real-time systems that integrate large language models (LLMs) directly into core services like search, commerce, and content. By sharing these technical insights openly, Naver demonstrates its progress in building a cohesive AI ecosystem capable of handling massive scale and complex user intent.

### Naver PersonA and LLM-Based User Memory

* The "PersonA" project focuses on building a "user memory" by treating fragmented logs across various Naver services as indirect conversations with the user.
* By leveraging LLM reasoning, the system transitions from simple data tracking to a sophisticated AI agent that offers context-aware, real-time suggestions.
* Technical hurdles addressed include stable real-time log reflection for a massive user base and the selection of optimal LLM architectures for personalized inference.

### Trend Analysis and Search-Optimized Models

* The Place Trend Analysis system utilizes ranking algorithms to distinguish between temporary surges and sustained popularity, providing a balanced view of "hot places."
* LLMs and text mining are employed to move beyond raw data, extracting specific keywords that explain the underlying reasons for a location's trending status.
* To improve search quality, Naver developed search-specific LLMs that outperform general models by using specialized data "recipes" and integrating traditional information retrieval with features like "AI briefing" and "AuthGR" for higher reliability.

### Unified Recommendation and Real-Time CRM

* Naver Webtoon and Series replaced fragmented recommendation and CRM (customer relationship management) models with a single, unified framework to ensure data consistency.
* The architecture shifted from batch-based processing to a real-time, API-based serving system to reduce management complexity and improve the immediacy of personalized user experiences.
* This transition focuses on maintaining a seamless UX by synchronizing different ML models under a unified serving logic.

### Scalable Log Pipelines and Infrastructure Stability

* The "Logiss" pipeline manages up to tens of billions of logs daily, relying on a Storm and Kafka environment to ensure high availability and performance.
* Engineers implemented a multi-topology approach to allow seamless, non-disruptive deployments even under heavy loads.
* Intelligent features such as "peak shaving" (distributing peak traffic to off-peak hours), priority-based processing during failures, and efficient data sampling help balance cost, performance, and stability.

These sessions provide a practical blueprint for organizations aiming to scale LLM-driven services while maintaining infrastructure integrity. For developers and system architects, Naver's transition toward unified ML frameworks and specialized, real-time data pipelines offers a proven model for moving AI from experimental phases into high-traffic production environments.

naver

@RequestCache: Developing a Custom

The development of `@RequestCache` addresses the performance degradation and network overhead caused by redundant external API calls or repeated computations within a single HTTP request. By implementing a custom Spring-based annotation, developers can ensure that specific data is fetched only once per request and shared across different service layers. This approach provides a more elegant and maintainable solution than manual parameter passing or struggling with the limitations of global caching strategies.

### Addressing Redundant Operations in Web Services

* Modern web architectures often involve multiple internal services (e.g., Order, Payment, and Notification) that independently request the same data, such as a user profile.
* These redundant calls increase response times, put unnecessary load on external servers, and waste system resources.
* `@RequestCache` provides a declarative way to cache method results within the scope of a single HTTP request, ensuring the actual logic or API call is executed only once.

### Limitations of Manual Data Passing

* The common alternative of passing response objects as method parameters leads to "parameter drilling," where intermediate service layers must accept data they do not use just to pass it to a deeper layer.
* In the Strategy pattern, adding a new data dependency to an interface forces every implementation to change, even those that have no use for the new parameter, which violates clean architecture principles.
* Manual passing makes method signatures brittle and increases the complexity of refactoring as the call stack grows.

### The TTL Dilemma in Traditional Caching

* Using Redis or a local cache with Time-To-Live (TTL) settings is often insufficient for request-level isolation.
* If the TTL is set too short, the cache might expire before a long-running request finishes, reintroducing the very redundant calls the system was trying to avoid.
* If the TTL is too long, the cache persists across different HTTP requests, which is logically incorrect for data that should be fresh for every new user interaction.

### Leveraging Spring's Request Scope and Proxy Mechanism

* The implementation uses Spring's `@RequestScope` to manage the cache lifecycle, ensuring that data is automatically cleared when the request ends (a minimal sketch of this mechanism follows after this summary).
* Under the hood, `@RequestScope` uses a singleton proxy that delegates calls to a specific instance stored in the `RequestContextHolder` for the current thread.
* The cache relies on `RequestAttributes`, which use `ThreadLocal` storage to guarantee isolation between concurrent requests.
* Lifecycle management is handled by Spring's `FrameworkServlet`, which prevents memory leaks by automatically cleaning up request attributes after the response is sent.

For applications dealing with deep call stacks or complex service interactions, a request-scoped caching annotation provides a robust way to optimize performance without sacrificing code readability. This mechanism is particularly recommended when the same data is needed across unrelated service boundaries within a single transaction, ensuring consistency and efficiency throughout the request lifecycle.
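A minimal sketch of how such an annotation can be wired together from the Spring facilities named above: an AOP around-advice plus `RequestContextHolder` request attributes. The annotation, key scheme, and aspect here are illustrative assumptions, not the article's actual `@RequestCache` implementation.

```java
// Hypothetical sketch: cache the result of an annotated method for the lifetime of the
// current HTTP request by storing it as a request-scoped attribute.
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.springframework.stereotype.Component;
import org.springframework.web.context.request.RequestAttributes;
import org.springframework.web.context.request.RequestContextHolder;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface RequestCache {}

@Aspect
@Component
class RequestCacheAspect {

    @Around("@annotation(requestCache)")
    public Object cachePerRequest(ProceedingJoinPoint joinPoint, RequestCache requestCache) throws Throwable {
        RequestAttributes attributes = RequestContextHolder.getRequestAttributes();
        if (attributes == null) {
            return joinPoint.proceed(); // not inside an HTTP request: just run the method
        }

        // Key the cached value by method signature plus arguments, within this request only.
        String key = "requestCache:" + joinPoint.getSignature().toLongString()
                + Arrays.deepToString(joinPoint.getArgs());

        Object cached = attributes.getAttribute(key, RequestAttributes.SCOPE_REQUEST);
        if (cached != null) {
            return cached; // later calls in the same request reuse the first result
        }

        Object result = joinPoint.proceed();
        if (result != null) {
            // Stored as a request attribute, so it is discarded when the servlet framework
            // finishes the request -- there is no TTL to tune.
            attributes.setAttribute(key, result, RequestAttributes.SCOPE_REQUEST);
        }
        return result;
    }
}
```

Because the cached value lives exactly as long as the HTTP request and is cleaned up with the request attributes, this sidesteps the TTL dilemma described above entirely.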