naver Dec 3, 2025

VLOps: Event-driven MLOps & Omni-Evaluator

Naver’s VLOps framework takes an event-driven approach to MLOps, designed to overcome the rigidity of traditional pipeline-based systems such as Kubeflow. Instead of a monolithic pipeline, the system is governed by autonomous sensors and typed messages, which gives Naver a decoupled and scalable environment for multimodal AI development. New functionality can be added without reworking an existing pipeline, and the architecture is compatible across clouds, which simplifies the transition from model training to large-scale evaluation and deployment.

Event-Driven MLOps Architecture

Operations such as training, evaluation, and deployment are defined as "Typed Messages," which serve as the primary units of communication within the system.
An "Event Sensor" acts as the core logic hub, autonomously detecting these messages and triggering the corresponding tasks without requiring a predefined, end-to-end pipeline.
The system eliminates the need for complex version management of entire pipelines, as new features can be integrated simply by adding new message types.
This approach ensures loose coupling between evaluation and deployment systems, facilitating easier maintenance and infrastructure flexibility.
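
The interplay between typed messages and an event sensor can be sketched in a few lines of Python. This is a minimal in-process illustration, not Naver's implementation: the class names (`Message`, `TrainingCompleted`, `EvaluationRequested`, `EventSensor`), the handler functions, and the benchmark name are all hypothetical, and a real system would sit on top of a message broker rather than an in-memory queue.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List, Type


@dataclass(frozen=True)
class Message:
    """Base class for all typed messages (hypothetical)."""
    model_id: str


@dataclass(frozen=True)
class TrainingCompleted(Message):
    checkpoint: str


@dataclass(frozen=True)
class EvaluationRequested(Message):
    checkpoint: str
    benchmark: str


Handler = Callable[[Message], List[Message]]


class EventSensor:
    """Dispatches each message to the handlers registered for its exact type."""

    def __init__(self) -> None:
        self._handlers: Dict[Type[Message], List[Handler]] = {}
        self.log: List[str] = []  # names of message types, in processing order

    def on(self, message_type: Type[Message], handler: Handler) -> None:
        self._handlers.setdefault(message_type, []).append(handler)

    def publish(self, message: Message) -> None:
        # Handlers may return follow-up messages, which are processed in turn.
        queue: Deque[Message] = deque([message])
        while queue:
            current = queue.popleft()
            self.log.append(type(current).__name__)
            for handler in self._handlers.get(type(current), []):
                queue.extend(handler(current))


evaluated: List[tuple] = []


def request_evaluation(msg: TrainingCompleted) -> List[Message]:
    # A finished training run emits an evaluation request; no pipeline is declared.
    return [EvaluationRequested(msg.model_id, msg.checkpoint, "hypothetical-benchmark")]


def run_evaluation(msg: EvaluationRequested) -> List[Message]:
    # Placeholder for launching a real evaluation job.
    evaluated.append((msg.model_id, msg.benchmark))
    return []


sensor = EventSensor()
sensor.on(TrainingCompleted, request_evaluation)
sensor.on(EvaluationRequested, run_evaluation)
sensor.publish(TrainingCompleted(model_id="vlm-demo", checkpoint="step-1000"))
```

In this sketch, supporting a new operation means defining one more message class and registering a handler for it; the existing handlers are left untouched, which is the property the list above attributes to adding new message types.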

Omni-Evaluator and Unified Benchmarking

The Omni-Evaluator serves as a centralized platform that integrates various evaluation engines and benchmarks into a single workflow.
It supports real-time monitoring of model performance, allowing researchers to track progress during the training and validation phases.
The system is designed specifically to handle the complexities of Multimodal LLMs, providing a standardized environment for diverse testing scenarios.
User-driven triggers are supported, enabling developers to initiate specific evaluation cycles manually when necessary.
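
One way to picture several evaluation engines behind a single workflow is a small registry that returns results in one shared shape. The sketch below is an assumption-laden illustration, not the Omni-Evaluator's actual interface: `OmniEvaluatorSketch`, `EvalResult`, the engine and benchmark names, and the fixed scores are all invented placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class EvalResult:
    """One score in a common format, regardless of which engine produced it."""
    engine: str
    benchmark: str
    score: float


# Hypothetical engine signature: takes a checkpoint identifier, returns a score.
EngineFn = Callable[[str], float]


class OmniEvaluatorSketch:
    """Registry that runs every registered (engine, benchmark) pair."""

    def __init__(self) -> None:
        self._engines: Dict[str, Dict[str, EngineFn]] = {}

    def register(self, engine: str, benchmark: str, fn: EngineFn) -> None:
        self._engines.setdefault(engine, {})[benchmark] = fn

    def evaluate(self, checkpoint: str) -> List[EvalResult]:
        results: List[EvalResult] = []
        # Sorted iteration keeps the output order deterministic.
        for engine in sorted(self._engines):
            for benchmark in sorted(self._engines[engine]):
                score = self._engines[engine][benchmark](checkpoint)
                results.append(EvalResult(engine, benchmark, score))
        return results


evaluator = OmniEvaluatorSketch()
# Stub engines with fixed placeholder scores; these are not real results.
evaluator.register("engine-a", "image-qa", lambda checkpoint: 0.5)
evaluator.register("engine-b", "captioning", lambda checkpoint: 0.25)

# A manual, user-driven trigger is then a direct call for a chosen checkpoint.
results = evaluator.evaluate("step-1000")
```

Because every engine reports through the same `EvalResult` shape, a caller can compare or display scores without knowing which engine ran which benchmark.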

VLOps Dashboard and User Experience

The VLOps Dashboard acts as a central hub where users can manage the entire ML lifecycle without needing deep knowledge of the underlying orchestration logic.
Users can trigger complex workflows simply by issuing a message, which abstracts away the technical difficulties of the underlying cloud infrastructure.
The dashboard provides a visual interface for monitoring events, message flows, and evaluation results, improving overall transparency for data scientists and researchers.

Organizations managing large-scale multimodal models should consider moving toward an event-driven architecture. This approach reduces the overhead of maintaining rigid pipelines and lets engineering teams focus on model quality rather than infrastructure orchestration.