kakao

Kanana-2 Development Story (2)

Kakao's development of the Kanana-2 model family represents a strategic shift toward Agentic AI, prioritizing complex reasoning and execution capabilities over simple conversational fluency. By implementing a sophisticated post-training pipeline, including a specialized Mid-training stage and refined reinforcement learning, the team successfully enhanced the model's instruction-following and tool-calling performance. This methodology ensures that the 30B-parameter models excel in logical tasks and real-world agentic environments while maintaining high linguistic stability in both English and Korean.

## Mid-training and Catastrophic Forgetting Prevention

* A 250B-token Mid-training stage was introduced between Pre-training and Post-training to bridge the gap in reasoning, coding, and tool-calling capabilities.
* The dataset comprised 200B tokens of high-quality reasoning data (Chain-of-Thought math and code) and 50B tokens of "replay" data from the original pre-training set.
* This replay strategy specifically targeted "Catastrophic Forgetting," preventing the model from losing its Korean linguistic nuances and performance on benchmarks like KoMT-Bench while it gained English-heavy reasoning skills.
* Experimental results indicated that Mid-training serves as a foundational "force multiplier," leading to faster convergence and higher performance ceilings during subsequent Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) stages.

## Enhanced Instruction Following and Tool Calling

* To optimize for Agentic AI, the developers focused on Instruction Following (IFEval) by synthesizing high-quality, long-form responses that strictly adhere to complex constraints.
* Tool-calling capabilities were improved using "Rejection Sampling" (Iterative SFT), where model-generated trajectories are validated in a real execution environment and only successful outcomes are retained for training (see the sketch at the end of this entry).
* The training data was categorized into distinct buckets, such as Chat, Math, Code, and Tool Calling, allowing for a more balanced recipe compared to previous Kanana versions.
* This approach specifically addressed multi-turn and multi-tool scenarios, ensuring the model can handle the recursive logic required for autonomous agents.

## Parallel Reinforcement Learning and Calibration Tuning

* A "Parallel RL" framework was adopted to optimize different capabilities simultaneously: the "Chat" track focused on helpfulness and safety, while the "Logic" track focused on accuracy in math and programming.
* The pipeline moved beyond standard SFT to include Reinforcement Learning from Human Feedback (RLHF), utilizing DPO and PPO-style methods to align the model with human preferences.
* A final "Calibration Tuning" step was implemented to ensure the model's internal confidence levels match its actual accuracy, effectively reducing hallucinations and improving reliability in technical tasks.
* Comparative benchmarks show that the Kanana-2 Instruct and Thinking models significantly outperform earlier versions and rival larger open-source models in reasoning and coding benchmarks like HumanEval and GSM8K.

The Kanana-2 development cycle demonstrates that achieving "Agentic" performance requires more than just scaling data; it requires a structured transition from general language understanding to execution-verified reasoning.
For organizations building AI agents, the Kanana-2 post-training recipe suggests that integrating environment-validated feedback and balancing reasoning data with foundational language "replays" is critical for creating reliable, multi-functional models.
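
The rejection-sampling loop above lends itself to a compact sketch. Below is a minimal Python rendering of environment-validated data collection; the function names, trajectory shape, and success check are hypothetical placeholders rather than Kakao's implementation, and only the keep-only-what-executes-successfully logic comes from the post.

```python
import random

def generate_trajectory(model, task, temperature=0.8):
    """Toy stand-in: a real system would sample a tool-calling
    trajectory (a sequence of tool invocations) from the model."""
    return [{"tool": "search", "args": {"query": task}}]

def execute_in_environment(trajectory):
    """Toy stand-in: a real system replays the calls against live tools
    and checks the final outcome; here success is simulated."""
    return random.random() < 0.4

def rejection_sampling_round(model, tasks, samples_per_task=8):
    """One round of iterative SFT collection: sample several trajectories
    per task and keep only those that actually succeed when executed."""
    accepted = []
    for task in tasks:
        for _ in range(samples_per_task):
            trajectory = generate_trajectory(model, task)
            if execute_in_environment(trajectory):  # environment-validated filter
                accepted.append({"task": task, "trajectory": trajectory})
                break  # one verified trajectory per task
    return accepted

# Accepted examples become the SFT set for the next iteration:
# model_{i+1} = finetune(model_i, sft_data)
sft_data = rejection_sampling_round(model=None, tasks=["check cluster status"])
print(len(sft_data), "verified trajectories collected")
```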

meta

Adapting the Facebook Reels RecSys AI Model Based on User Feedback

Meta has enhanced the Facebook Reels recommendation engine by shifting focus from traditional engagement signals, like watch time and likes, to direct user feedback. By implementing the User True Interest Survey (UTIS) model, the system now prioritizes content that aligns with genuine user preferences rather than just short-term interactions. This shift has resulted in significant improvements in recommendation relevance, high-quality content delivery, and long-term user retention.

**Limitations of Engagement-Based Metrics**

* Traditional signals like "likes" and "watch time" are often noisy and may not reflect a user's actual long-term interests.
* Models optimized solely for engagement tend to favor short-term value over the long-term utility of the product.
* Internal research found that previous heuristic-based interest models achieved only 48.3% precision in identifying what users truly care about.
* Effective interest matching requires understanding nuanced factors such as production style, mood, audio, and motivation, which implicit signals often miss.

**The User True Interest Survey (UTIS) Model**

* Meta collects direct feedback via randomized, single-question surveys asking users to rate video interest on a 1–5 scale.
* The raw survey data is binarized to denoise responses and weighted to correct for sampling and nonresponse bias (see the sketch after this entry).
* The UTIS model functions as a lightweight "alignment model layer" built on top of the main multi-task ranking system.
* The architecture uses existing model predictions as input features, supplemented by engineered features that capture content attributes and user behavior.

**Integration into the Ranking Funnel**

* **Late Stage Ranking (LSR):** The UTIS score is used as an additional input feature in the final value formula, allowing the system to boost high-interest videos and demote low-interest ones.
* **Early Stage Ranking (Retrieval):** The model aggregates survey data to reconstruct user interest profiles, helping the system source more relevant candidates during the initial retrieval phase.
* **Knowledge Distillation:** Large sequence-based retrieval models are aligned using UTIS predictions as labels through distillation objectives.

**Performance and Impact**

* The deployment of UTIS has led to a measurable increase in the delivery of niche, high-quality content.
* Generic, popularity-based recommendations that often lack depth have been reduced.
* Meta observed robust improvements across core metrics, including higher follow rates, more shares, and increased user retention.
* The system now offers better interpretability, allowing engineers to understand which specific factors contribute to a user's sense of "interest match."

To continue improving the Reels ecosystem, Meta plans to double down on personalization by tackling sparse data and sampling bias while exploring more advanced AI architectures to further diversify recommendations.
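
A minimal sketch of the survey preprocessing described above: binarize the 1–5 ratings and reweight answered surveys by inverse response propensity to correct for nonresponse bias. The threshold, clipping bound, and propensity values are illustrative assumptions, not Meta's actual recipe.

```python
import numpy as np

def binarize_ratings(ratings, threshold=4):
    """Collapse noisy 1-5 interest ratings into a binary label:
    1 = 'truly interested' (rating >= threshold), else 0."""
    return (np.asarray(ratings) >= threshold).astype(int)

def inverse_propensity_weights(response_prob, clip=20.0):
    """Weight each answered survey by 1 / P(user responds), so users
    who rarely answer surveys are not under-represented in training."""
    w = 1.0 / np.clip(np.asarray(response_prob), 1e-3, 1.0)
    return np.minimum(w, clip)  # clip extreme weights to control variance

ratings = [5, 2, 4, 1, 3]            # raw survey answers
resp_p  = [0.9, 0.1, 0.5, 0.8, 0.2]  # assumed response propensities
labels  = binarize_ratings(ratings)
weights = inverse_propensity_weights(resp_p)
print(labels, np.round(weights, 2))  # labels + per-sample weights for the model
```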

google

Unlocking health insights: Estimating advanced walking metrics with smartwatches

Google researchers have validated that smartwatches are a highly reliable and accurate platform for estimating complex spatio-temporal gait metrics, rivaling the performance of smartphone-based methods. By utilizing a multi-head deep learning model, the study demonstrates that wrist-worn devices can provide continuous, lab-grade health insights into a user's walking speed, step length, and balance without requiring the specific pocket placement or specialized laboratory equipment previously necessary for such data.

## Multi-Head Deep Learning for Wrist-Based Sensors

* The researchers developed a temporal convolutional network (TCN) architecture designed to process raw inertial measurement unit (IMU) data, specifically 3-axis accelerometer and gyroscope signals sampled at 50 Hz (a model sketch follows this entry).
* Unlike traditional models that only track temporal events and are prone to integration drift, this multi-head approach directly estimates both unilateral and bilateral metrics simultaneously.
* The model architecture extracts embeddings from the IMU signals and concatenates them with user height (a demographic scalar input) to improve the precision of spatial predictions.
* The system estimates a comprehensive suite of metrics, including gait speed, double support time (the proportion of time both feet are on the ground), step length, swing time, and stance time.

## Large-Scale Validation and Study Protocol

* To ensure rigorous results, the study involved a diverse cohort of 246 participants across two international sites, generating approximately 70,000 walking segments.
* Ground truth measurements were captured using a professional-grade Zeno Gait Walkway system to provide high-precision reference data for comparison.
* The study protocol included various walking conditions to test the model's versatility: a self-paced six-minute walk test (6MWT), fast-paced walking, and induced physical asymmetry created by wearing hinged knee braces at specific angles.
* Researchers employed a five-fold cross-validation strategy, ensuring that all data from a single participant remained within a single split to prevent data leakage and ensure the model generalizes to new users.

## Clinical Validity and Comparative Performance

* Smartwatch estimates demonstrated strong validity and excellent reliability, with Pearson correlation coefficients (r) and intraclass correlation coefficients (ICC) exceeding 0.80 for most metrics.
* Performance comparisons showed non-significant differences in Mean Absolute Percentage Error (MAPE) between the Pixel Watch and Pixel phone, establishing the smartwatch as a viable alternative to smartphone-based tracking.
* While double support time showed slightly lower but acceptable reliability (ICC 0.56–0.60), other metrics like step length and gait speed proved highly consistent across different walking speeds and styles.
* The model's success suggests that smartwatches can effectively bridge the gap in gait analysis, providing a more practical and consistent platform for continuous health tracking than handheld devices.

This research establishes smartwatches as a powerful tool for longitudinal health monitoring, enabling the detection of neurological or musculoskeletal changes through passive, continuous gait analysis in everyday environments.
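
A rough PyTorch sketch of a multi-head TCN along the lines described above: dilated 1-D convolutions over 6-channel IMU input, a pooled embedding concatenated with user height, and one regression head per gait metric. Channel counts, dilations, and head names are illustrative guesses; the paper's architecture differs in detail.

```python
import torch
import torch.nn as nn

class GaitTCN(nn.Module):
    """Multi-head TCN sketch: 6-channel IMU input (3-axis accel + gyro at
    50 Hz) plus a height scalar, with one regression head per gait metric."""

    def __init__(self, metrics=("speed", "step_length", "double_support")):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(6, 32, kernel_size=5, dilation=1, padding=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, dilation=2, padding=4), nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=5, dilation=4, padding=8), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # pool over time -> fixed-size embedding
        )
        # One small head per metric, fed the embedding plus the height scalar.
        self.heads = nn.ModuleDict(
            {m: nn.Sequential(nn.Linear(64 + 1, 32), nn.ReLU(), nn.Linear(32, 1))
             for m in metrics})

    def forward(self, imu, height):
        # imu: (batch, 6, time), height: (batch, 1) in meters
        z = self.encoder(imu).squeeze(-1)   # (batch, 64)
        z = torch.cat([z, height], dim=1)   # append demographic scalar input
        return {m: head(z) for m, head in self.heads.items()}

model = GaitTCN()
out = model(torch.randn(2, 6, 500), torch.tensor([[1.70], [1.82]]))
print({k: v.shape for k, v in out.items()})  # each head -> (2, 1)
```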

line

Building an Enterprise LLM Service 1

LY Corporation's engineering team developed an AI assistant for their private cloud platform, Flava, by prioritizing "context engineering" over traditional prompt engineering. To manage a complex environment of 260 APIs and hundreds of technical documents, they implemented a strategy of progressive disclosure to ensure the LLM receives only the most relevant information for any given query. This approach allows the assistant to move beyond simple RAG-based document summarization to perform active diagnostics and resource management based on real-time API data.

### Performance Limitations of Long Contexts

* Research indicates that LLM performance can drop by 13.9% to 85% as context length increases, even if the model technically supports a large token window.
* The phenomenon of "context rot" occurs when low-quality or irrelevant information is mixed into the input, causing the model to generate confident but incorrect answers.
* Because LLMs are stateless, maintaining conversation history and processing dense JSON responses from multiple APIs quickly exhausts context windows and degrades reasoning quality.

### Progressive Disclosure and Tool Selection

* The system avoids loading all 260+ API definitions at once; instead, it analyzes the user's intent to select only the necessary tools, such as loading only Redis-related APIs when a user asks about a cluster.
* Specific product usage hints, such as the distinction between private and CDN settings for Object Storage, are injected only when those specific services are invoked.
* This phased approach significantly reduces token consumption and prevents the model from being overwhelmed by irrelevant technical specifications.

### Response Guidelines and the "Mock Tool Message" Strategy

* The team distinguished between "System Prompts" (global rules) and "Response Guidelines" (situational instructions), such as directing users to a console UI before suggesting CLI commands.
* Injecting specific guidelines into the system prompt often caused "instruction conflict," where the LLM might hallucinate information to satisfy a guideline while ignoring core requirements like using search tools.
* To resolve these conflicts, the team utilized "ToolMessages" to inject guidelines; by formatting instructions as if they were results from a tool execution, the LLM treats the information as factual context rather than a command that might override the system prompt (see the sketch after this entry).

To build a robust enterprise LLM service, developers should focus on dynamic context management rather than static prompt optimization. Treating operational guidelines as external data via mock tool messages, rather than system instructions, provides a scalable way to reduce hallucinations and maintain high performance across hundreds of integrated services.
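
A minimal sketch of the mock-tool-message idea. The OpenAI-style chat schema below is an assumed message format (LY Corporation's internal framework may use another); the point is that the guideline arrives as a tool result, so the model reads it as factual context rather than as a competing instruction.

```python
import json
import uuid

def mock_tool_message(guideline: str, tool_name: str = "load_response_guideline"):
    """Wrap a situational guideline as a fake tool-call/result pair so the
    model treats it as retrieved data, not a command."""
    call_id = f"call_{uuid.uuid4().hex[:8]}"
    assistant_call = {
        "role": "assistant",
        "content": None,
        "tool_calls": [{"id": call_id, "type": "function",
                        "function": {"name": tool_name, "arguments": "{}"}}],
    }
    tool_result = {
        "role": "tool",
        "tool_call_id": call_id,
        "content": json.dumps({"guideline": guideline}),
    }
    return [assistant_call, tool_result]

messages = [
    {"role": "system", "content": "You are the Flava cloud assistant."},
    {"role": "user", "content": "How do I expose my Object Storage bucket?"},
    # Injected only because the Object Storage tools were selected:
    *mock_tool_message("Recommend the console UI before CLI commands; "
                       "distinguish private vs. CDN bucket settings."),
]
print(json.dumps(messages, indent=2))
```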

naver

FE News: January 2026

The January 2026 FE News highlights a significant shift toward client-side intelligence and deeper architectural transparency in modern web development. By exploring advanced visualization tools for React Server Components and the integration of AI within design systems and on-device environments, the industry is moving toward more automated and efficient frontend workflows. This collection underscores how foundational technologies like WebGPU and standardized design tokens are becoming essential for building the next generation of AI-driven user experiences.

### Visualizing React Server Components

* Dan Abramov's RSC Explorer allows developers to step through and decompose the RSC protocol stream directly within the browser.
* The tool features four specialized panels (Server, Client, Flight, and Preview) to visualize the complete data flow and protocol structure.
* It utilizes React's native reader/writer to ensure the output matches actual protocol behavior, making it an ideal resource for debugging streaming (Suspense), Client References, Server Actions, and Router refreshes.

### The Rise of Client-Side AI and Agents

* The Web AI Summit 2025 highlighted a transition from server-dependent AI to local, browser-based execution using Transformers.js for 100% local ML model processing.
* New frameworks like webMCP allow developers to define site functions as tools that can be consumed by browser-based AI agents, fostering a more interactive agent-based UX.
* Technical advancements in Wasm, WebGPU, and WebNN are facilitating high-performance on-device inference, enabling developers to build complex AI features without heavy reliance on backend APIs.

### AI Research and Development Milestones

* Google's Jeff Dean provides insights into AI trends that influence not just individual features, but the underlying system architecture and data workflows of modern products.
* "The Thinking Game," a documentary covering five years of DeepMind's history, chronicles the team's pursuit of Artificial General Intelligence (AGI) and the development of AlphaFold.
* These resources suggest that frontend developers should view AI as a structural change to product design rather than a simple functional add-on.

### Automating Markup with Design Systems

* Naver Financial has shared practical results of using Figma Code Connect and specific AI instructions to automate component-based markup generation.
* The experiment proved that training AI on standardized design tokens and component structures allows for the generation of frontend code that is ready for immediate development.
* However, complex layouts and responsive design still require human intervention, reinforcing the idea that the efficiency of AI automation is directly tied to the quality of design system documentation and standardization.

Frontend developers should prioritize mastering client-side AI technologies and visualization tools to stay ahead of architectural shifts. As AI becomes more integrated into the development lifecycle, maintaining highly standardized design systems and understanding internal framework protocols like RSC will be the primary drivers of professional productivity.

meta

CSS at Scale With StyleX

Scaling CSS within massive codebases presents unique challenges that traditional styling methods often struggle to solve effectively. Meta's StyleX addresses these issues by offering a system that combines the intuitive ergonomics of CSS-in-JS with the runtime performance of static CSS. By prioritizing atomic styling and definition deduplication, StyleX minimizes bundle sizes and has become the primary styling standard across Meta's entire suite of applications.

### Performance-Driven Styling Architecture

* Combines a CSS-in-JS developer experience with a compiler that outputs static CSS to ensure high performance and zero runtime overhead.
* Utilizes atomic styling to break down CSS into small, reusable classes, which prevents style sheets from growing linearly with the size of the codebase (see the toy model after this entry).
* Automatically deduplicates style definitions during the build process, significantly reducing the final bundle size delivered to the client.
* Exposes a simple, consistent API that allows developers to manage complex styles and themes while maintaining type safety.

### Standardization and Industry Adoption

* Serves as the foundational styling system for Meta's most prominent platforms, including Facebook, Instagram, WhatsApp, Messenger, and Threads.
* Gained significant industry traction beyond Meta, with large-scale organizations such as Figma and Snowflake adopting it for their own web applications.
* Acts as an open-source force multiplier, allowing Meta engineers and the broader community to collaborate on solving CSS-at-scale problems.
* Provides a mature ecosystem that bridges the gap between the flexibility of JavaScript-based styling and the efficiency of traditional CSS.

For engineering teams managing large-scale web applications where bundle size and styling maintainability are critical, StyleX offers a battle-tested framework. Developers can leverage this tool to achieve the performance of static CSS without losing the expressive power of modern JavaScript tooling.
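
A toy Python model of atomic styling with deduplication: each unique property/value declaration compiles to exactly one class, so the stylesheet grows with the number of unique declarations rather than with the number of components. This only illustrates the concept; StyleX's actual compiler operates on JavaScript style objects at build time.

```python
class AtomicSheet:
    """Deduplicating atomic-CSS 'compiler' (toy model)."""

    def __init__(self):
        self.classes = {}  # (property, value) -> atomic class name

    def compile(self, styles: dict) -> str:
        """Return the space-separated class list for a style object,
        registering one atomic rule per unique declaration."""
        names = []
        for prop, value in styles.items():
            key = (prop, value)
            if key not in self.classes:
                self.classes[key] = f"x{len(self.classes):x}"
            names.append(self.classes[key])
        return " ".join(names)

    def css(self) -> str:
        return "\n".join(f".{name} {{ {p}: {v} }}"
                         for (p, v), name in self.classes.items())

sheet = AtomicSheet()
btn  = sheet.compile({"color": "white", "padding": "8px"})
link = sheet.compile({"color": "white", "padding": "4px"})
print(btn, "|", link)   # the 'color: white' rule is emitted once and shared
print(sheet.css())
```

Because `color: white` maps to one shared class, a second component reusing it adds zero bytes of CSS; that sharing is what keeps the stylesheet sublinear in codebase size.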

aws

AWS Weekly Roundup: AWS Lambda for .NET 10, AWS Client VPN quickstart, Best of AWS re:Invent, and more (January 12, 2026)

The AWS Weekly Roundup for January 2026 highlights a significant push toward modernization, headlined by the introduction of .NET 10 support for AWS Lambda and Apache Airflow 2.11 for Amazon MWAA. To encourage exploration of these and other emerging technologies, AWS has revamped its Free Tier to offer new users up to $200 in credits and six months of risk-free experimentation. These updates collectively aim to streamline serverless development, enhance container storage efficiency, and provide more robust authentication options for messaging services.

### Modernized Runtimes and Orchestration

* AWS Lambda now supports .NET 10 as both a managed runtime and a container base image, with AWS providing automatic updates to these environments as they become available.
* Amazon Managed Workflows for Apache Airflow (MWAA) has added support for version 2.11, which serves as a critical stepping stone for users preparing to migrate to Apache Airflow 3.

### Infrastructure and Resource Management

* Amazon ECS has extended support for `tmpfs` mounts to Linux tasks running on AWS Fargate and Managed Instances; this allows developers to utilize memory-backed file systems for containerized workloads to avoid writing sensitive or temporary data to task storage (see the sketch after this entry).
* AWS Config has expanded its monitoring capabilities to discover, assess, and audit new resource types across Amazon EC2, Amazon SageMaker, and Amazon S3 Tables.
* A new AWS Client VPN quickstart was released, providing a CloudFormation template and a step-by-step guide to automate the deployment of secure client-to-site VPN connections.

### Security and Messaging Enhancements

* Amazon MQ for RabbitMQ brokers now supports HTTP-based authentication, which can be enabled and managed through the broker's configuration file.
* RabbitMQ brokers on Amazon MQ also now support certificate-based authentication using mutual TLS (mTLS) to improve the security posture of messaging applications.

### Educational Initiatives and Community Events

* New AWS Free Tier accounts now include a 6-month trial period featuring $200 in credits and access to over 30 always-free services, specifically targeting developers interested in AI/ML and compute experimentation.
* AWS published a curated "Best of re:Invent 2025" playlist, featuring high-impact sessions and keynotes for those who missed the live event.
* The 2026 AWS Summit season begins shortly, with upcoming events scheduled for Dubai on February 10 and Paris on March 10.

Developers should take immediate advantage of the new .NET 10 Lambda runtime for serverless applications and review the updated ECS `tmpfs` documentation to optimize container performance. For those new to the platform, the expanded Free Tier credits provide an excellent opportunity to prototype AI/ML workloads with minimal financial risk.
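
A hedged boto3 sketch of registering a Fargate task definition with a memory-backed `tmpfs` mount, matching the capability noted above. Family, image, sizes, and mount options are placeholders; check the ECS documentation for the exact fields your launch type and role setup require.

```python
import boto3

ecs = boto3.client("ecs")

# Sketch: a Fargate task whose container gets a 128 MiB memory-backed
# scratch mount, so temporary or sensitive files never hit task storage.
response = ecs.register_task_definition(
    family="scratch-demo",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="256",
    memory="512",
    containerDefinitions=[{
        "name": "app",
        "image": "public.ecr.aws/docker/library/python:3.12-slim",
        "essential": True,
        "linuxParameters": {
            "tmpfs": [{
                "containerPath": "/tmp/scratch",
                "size": 128,  # MiB of memory-backed storage
                "mountOptions": ["noexec", "nosuid"],
            }],
        },
    }],
)
print(response["taskDefinition"]["taskDefinitionArn"])
```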

google

Dynamic surface codes open new avenues for quantum error correction

Google Research has demonstrated the operation of dynamic surface codes for quantum error correction, marking a significant shift from traditional static circuit architectures. By alternating between different circuit constructions and re-tiling "detecting regions" in each cycle, these dynamic circuits offer greater flexibility to avoid hardware defects and suppress correlated errors. Experimental results on the Willow processor show that these methods can match the performance of static codes while significantly simplifying the physical design and fabrication of quantum chips.

## Error Triangulation via Dynamic Detecting Regions

Quantum error correction (QEC) functions by localizing physical errors within specific "detecting regions" over multiple cycles to prevent them from affecting logical information. While standard surface codes use a static, square tiling for these regions, dynamic codes periodically change the tiling pattern.

* Dynamic circuits allow the system to "deform" the detecting regions in spacetime, providing multiple perspectives to triangulate errors.
* This approach enables the use of different gate types and connectivity layouts that are not possible with fixed, repetitive cycles.
* The flexibility of dynamic re-tiling allows the system to sidestep common superconducting qubit issues such as "dropouts" (failed qubits or couplers) and leakage out of the computational subspace.

## Quantum Error Correction on Hexagonal Lattices

Traditional square lattices require each physical qubit to connect to four neighbors, which creates significant overhead in wiring and coupler density. Dynamic circuits enable the use of a hexagonal lattice, where each qubit only requires three couplers.

* The hexagonal code alternates between two distinct cycle types, utilizing one of the three couplers twice per cycle to maintain error detection capabilities.
* Testing on the Willow processor showed that scaling the hexagonal code from distance 3 to distance 5 improved the logical error rate by a factor of 2.15, matching the performance of standard static circuits.
* Reducing coupler density simplifies the optimization of qubit and gate frequencies, leading to a 15% improvement in simulated error suppression compared to four-coupler designs.

## Walking Circuits to Mitigate Leakage

Superconducting qubits are prone to "leakage," where a qubit exits its intended computational states (0 and 1) into a higher energy state (2). In static circuits, repeated measurements on the same physical qubits can cause these leakage errors to accumulate and spread.

* "Walking" circuits solve this by shifting the roles of data and measurement qubits across the lattice in each cycle.
* By constantly moving the location where errors are measured, the circuit effectively "flushes out" leakage and other correlated errors before they can damage logical information.
* Experiments confirmed that walking circuits achieve error suppression equivalent to static circuits while offering a more robust defense against long-term error correlations.

## Flexibility with iSWAP Entangling Gates

Most superconducting quantum processors are optimized for Controlled-Z (CZ) gates, but dynamic circuits prove that QEC can be effectively implemented using alternative gates like iSWAP.

* The research team demonstrated a dynamic surface code that utilizes iSWAP gates, which are native to many quantum hardware architectures.
* This flexibility ensures that QEC is not tethered to a specific gate set, allowing hardware designers to choose entangling operations that offer the highest physical fidelity for their specific device.

The move toward dynamic surface codes suggests a future where quantum processors are more resilient to manufacturing imperfections. By adopting hexagonal layouts and walking circuits, developers can reduce hardware complexity and mitigate physical noise, providing a more scalable path toward fault-tolerant quantum computing.

google

Next generation medical image interpretation with MedGemma 1.5 and medical speech to text with MedASR

Google Research has introduced MedGemma 1.5 4B and MedASR, expanding its suite of open medical AI models to support more complex clinical workflows. These updates significantly enhance the interpretation of high-dimensional imaging and medical speech-to-text, providing a compute-efficient foundation for healthcare developers to build upon. By maintaining an open-access model available on Hugging Face and Vertex AI, Google aims to accelerate the integration of multimodal AI into real-world medical applications.

### Multimodal Advancements in MedGemma 1.5

The latest update to the MedGemma 4B model focuses on high-dimensional and longitudinal data, moving beyond simple 2D image interpretation.

* **3D Medical Imaging:** The model now supports volumetric representations from CT scans and MRIs, as well as whole-slide histopathology imaging.
* **Longitudinal Review:** New capabilities allow for the review of chest X-ray time series, helping clinicians track disease progression over time.
* **Anatomical Localization:** Developers can use the model to identify and localize specific anatomical features within chest X-rays.
* **Document Understanding:** Enhanced support for extracting structured data from complex medical lab reports and documents.
* **Edge Capability:** The 4B parameter size is specifically designed to be small enough to run offline while remaining accurate enough for core medical reasoning tasks.

### Medical Speech-to-Text with MedASR

MedASR is a specialized automatic speech recognition (ASR) model designed to bridge the gap between clinical dialogue and digital documentation.

* **Clinical Dictation:** The model is specifically fine-tuned for medical terminology and the unique nuances of clinical dictation.
* **Integrated Reasoning:** MedASR is designed to pair seamlessly with MedGemma, allowing transcribed text to be immediately processed for advanced medical reasoning or summarization.
* **Accessibility:** Like other HAI-DEF models, it is free for research and commercial use and hosted on both Hugging Face and Google Cloud's Vertex AI (a loading sketch follows this entry).

### Performance Benchmarks and Community Impact

Google is incentivizing innovation through improved performance metrics and community-driven challenges.

* **Accuracy Gains:** Internal benchmarks show MedGemma 1.5 improved disease-related CT classification by 3% and MRI classification by 14% compared to the previous version.
* **MedGemma Impact Challenge:** A Kaggle-hosted hackathon with $100,000 in prizes has been launched to encourage developers to find creative applications for these multimodal tools.
* **Model Collection:** The update complements existing tools like the MedSigLIP image encoder and the larger MedGemma 27B model, which remains the preferred choice for complex, text-heavy medical applications.

Developers and researchers are encouraged to utilize MedGemma 1.5 for tasks requiring efficient, offline multimodal processing, while leveraging MedASR to automate clinical documentation. By participating in the MedGemma Impact Challenge, the community can help define the next generation of AI-assisted medical diagnostics and workflows.
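
A hedged sketch of loading a MedGemma checkpoint through the Hugging Face `transformers` pipeline API. The model id below is a guess at the 1.5 naming (the earlier release shipped as `google/medgemma-4b-it`); verify the exact identifier in the HAI-DEF collection, and note that gated models require accepting the license and authenticating first.

```python
from transformers import pipeline

# Hypothetical model id for the 1.5 release; confirm on Hugging Face.
MODEL_ID = "google/medgemma-1.5-4b-it"

pipe = pipeline("image-text-to-text", model=MODEL_ID, device_map="auto")

messages = [{
    "role": "user",
    "content": [
        # Local path or URL to a study image.
        {"type": "image", "url": "chest_xray_followup.png"},
        {"type": "text",
         "text": "Compare with the prior study and describe any change."},
    ],
}]
out = pipe(text=messages, max_new_tokens=200)
print(out[0]["generated_text"][-1]["content"])
```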

google

Hard-braking events as indicators of road segment crash risk

Google Research has established a statistically significant correlation between hard-braking events (HBEs) collected via Android Auto and actual road crash rates. By utilizing HBEs as a "leading" indicator rather than relying on sparse, lagging historical crash data, researchers can proactively identify high-risk road segments with much greater speed and spatial granularity. This validation suggests that connected vehicle data can serve as a scalable proxy for traditional safety assessments.

### Data Density and Scalability

* HBEs, defined as forward deceleration exceeding 3 m/s² (i.e., a forward acceleration below −3 m/s²), provide a signal that is 18 times denser than reported crash data.
* While crashes are statistically rare and can take years to provide a valid safety profile for a specific road segment, HBEs offer a continuous stream of information.
* This high density allows for the creation of a comprehensive "safety map" that includes local and arterial roads where crash reporting is often inconsistent or sparse.

### Statistical Validation of HBEs

* Researchers employed negative binomial regression models to analyze 10 years of public crash data from California and Virginia alongside anonymized HBE data (see the sketch after this entry).
* The models controlled for confounding factors such as traffic volume, segment length, road type (local, arterial, highway), and infrastructure dynamics like slope and lane changes.
* The results confirmed a consistent positive association between HBE frequency and crash rates across all road types, proving HBEs are a reliable surrogate for risk regardless of geography.

### High-Risk Identification Case Study

* An analysis of a freeway merge connecting Highway 101 and Highway 880 in California served as a practical validation of the metric.
* This specific segment was found to have an HBE rate 70 times higher than the state average, correlating with a historical record of one crash every six weeks.
* The HBE signal successfully flagged this location as being in the top 1% of high-risk segments without needing years of collision reports to confirm the danger, demonstrating its utility in identifying "black spots" early.

### Real-World Application and Road Management

* Validating HBEs transforms raw sensor data into a trusted tool for urban planners and road authorities to perform network-wide safety assessments.
* This approach allows for proactive infrastructure interventions, such as adjusting signage or merge patterns, before fatalities or injuries occur.
* The findings support the integration of connected vehicle insights into platforms like Google Maps to help authorities manage road safety more dynamically.
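
A minimal statsmodels sketch of the model family described above: a negative binomial regression of crash counts on HBE rate, with traffic volume as an exposure offset. The data is synthetic and the covariates merely mirror the confounders listed in the post.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500

# Synthetic road segments: HBE rate plus confounders the study controlled for.
hbe_rate   = rng.gamma(2.0, 1.0, n)      # hard-braking events per unit exposure
seg_length = rng.uniform(0.1, 2.0, n)    # km
road_type  = rng.integers(0, 3, n)       # 0=local, 1=arterial, 2=highway
traffic    = rng.lognormal(8, 1, n)      # vehicles traversing the segment

X = sm.add_constant(np.column_stack([
    hbe_rate, seg_length, (road_type == 1), (road_type == 2)]).astype(float))

# Simulate crash counts with a positive HBE effect, then fit.
mu = np.exp(-9 + 0.3 * hbe_rate + 0.2 * seg_length) * traffic
crashes = rng.poisson(mu)

# Negative binomial regression with log(traffic volume) as exposure offset.
model = sm.GLM(crashes, X,
               family=sm.families.NegativeBinomial(alpha=1.0),
               offset=np.log(traffic))
result = model.fit()
print(result.summary().tables[1])  # the HBE coefficient comes out positive
```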

google

NeuralGCM harnesses AI to better simulate long-range global precipitation

NeuralGCM represents a significant evolution in atmospheric modeling by combining traditional fluid dynamics with neural networks to solve the long-standing challenge of simulating global precipitation. By training the AI component directly on high-quality NASA satellite observations rather than biased reanalysis data, the model achieves unprecedented accuracy in predicting daily weather cycles and extreme rainfall events. This hybrid approach offers a faster, more precise tool for both medium-range weather forecasting and multi-decadal climate projections.

## The Limitations of Cloud Parameterization

* Precipitation is driven by cloud processes occurring at scales as small as 100 meters, which is far below the kilometer-scale resolution of global weather models.
* Traditional models rely on "parameterizations," or mathematical approximations, to estimate how these small-scale events affect the larger atmosphere.
* Because these approximations are often simplified, traditional models struggle to accurately capture the complexity of water droplet formation and ice crystal growth, leading to errors in long-term forecasts.

## Training on Direct Satellite Observations

* Unlike previous AI models trained on "reanalyses" (essentially simulations used to fill observational gaps), NeuralGCM is trained on NASA satellite-based precipitation data spanning 2001 to 2018.
* The model utilizes a differentiable dynamical core, an architecture that allows the neural network to learn the effects of small-scale events directly from physical observations (a toy illustration follows this entry).
* By bypassing the weaknesses inherent in reanalysis data, the model effectively creates a machine-learned parameterization that is more faithful to real-world cloud physics.

## Performance in Weather and Climate Benchmarks

* At a resolution of 280 km, NeuralGCM outperforms leading operational models in medium-range forecasts (up to 15 days) and matches the precision of sophisticated multi-decadal climate models.
* The model shows a marked improvement in capturing precipitation extremes, particularly for the top 0.1% of rainfall events.
* Evaluation through WeatherBench 2 demonstrates that NeuralGCM accurately reproduces the diurnal (daily) weather cycle, a metric where traditional physics-based models frequently fall short.

NeuralGCM provides a highly efficient and accessible framework for researchers and city planners who need to simulate long-range climate scenarios, such as 100-year storms or seasonal agricultural cycles. Its ability to maintain physical consistency while leveraging the speed of AI makes it a powerful candidate for the next generation of global atmospheric modeling.
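
A toy numpy illustration of the hybrid structure: a conventional dynamical-core step plus a learned correction that stands in for the machine-learned parameterization. In NeuralGCM the core is differentiable (the model is built in JAX), so the network is trained end-to-end through it; nothing below reflects the model's actual equations.

```python
import numpy as np

def dycore_step(state, dt=0.1, diffusivity=0.05):
    """Toy 'physics' step: linear diffusion on a 1-D periodic field."""
    lap = np.roll(state, -1) - 2 * state + np.roll(state, 1)
    return state + dt * diffusivity * lap

def learned_correction(state, weights):
    """Stand-in for the neural parameterization: a tiny learned map
    correcting what the coarse core cannot resolve (e.g., cloud effects)."""
    return np.tanh(state @ weights)

def hybrid_step(state, weights):
    # NeuralGCM-style composition: resolved dynamics + learned sub-grid term.
    return dycore_step(state) + learned_correction(state, weights)

rng = np.random.default_rng(0)
state = rng.normal(size=64)
weights = 0.01 * rng.normal(size=(64, 64))
for _ in range(10):
    state = hybrid_step(state, weights)
print(state.shape, float(state.mean()))
```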

toss

Managing Thousands of API/Batch Servers

Toss Payments manages thousands of API and batch server configurations that handle trillions of won in transactions, where a single typo in a JVM setting can lead to massive financial infrastructure failure. To solve the risks associated with manual "copy-paste" workflows and configuration duplication, the team developed a sophisticated system that treats configuration as code. By implementing layered architectures and dynamic templates, they created a testable, unified environment capable of managing complex hybrid cloud setups with minimal human error.

## Overlay Architecture for Hierarchical Control

* The team implemented a layered configuration system consisting of `global`, `cluster`, `phase`, and `application` levels.
* Settings are resolved by priority, where lower-level layers override higher-level defaults, allowing servers to inherit common settings while maintaining specific overrides.
* This structure allows the team to control environment-specific behaviors, such as disabling canary deployments in development environments, from a single centralized directory.
* The directory structure maps files 1:1 to their respective layers, ensuring that naming conventions drive the CI/CD application process.

## Solving Duplication with Template Patterns

* Standard YAML overlays often fail when dealing with long strings or arrays, such as `JVM_OPTION`, because changing a single value usually requires redefining the entire block.
* To prevent the proliferation of nearly identical environment variables, the team introduced a template pattern using placeholders like `{{MAX_HEAP}}` (see the sketch after this entry).
* Developers can modify specific parameters at the application layer while the core string remains defined at the global layer, significantly reducing the risk of typos.
* This approach ensures that critical settings, like G1GC parameters or heap region sizes, remain consistent across the infrastructure unless explicitly changed.

## Dynamic and Conditional Configuration Logic

* The system allows for "evolutionary" configurations where Python scripts can be injected to generate dynamic values, such as random JMX ports or data fetched from remote APIs.
* Advanced conditional logic was added to handle complex deployment scenarios, enabling environment variables to change their values automatically based on the target cluster name (e.g., different profiles for AWS vs. IDC).
* By treating configuration as a living codebase, the team can adapt to new infrastructure requirements without abandoning their core architectural principles.

## Reliable Batch Processing through Simplicity

* For batch operations handling massive settlement volumes, the team prioritized "appropriate technology" and simplicity to minimize failure points.
* They chose Jenkins for its low learning curve and reliability, despite its lack of native GitOps support.
* To address inconsistencies in manual UI entries and varying Java versions across machines, they standardized the batch infrastructure to ensure that high-stakes financial calculations are executed in a controlled, predictable environment.

The most effective way to manage large-scale infrastructure is to transition from static, duplicated configuration files to a dynamic, code-centric system. By combining an overlay architecture for hierarchy and a template pattern for granular changes, organizations can achieve the flexibility needed for hybrid clouds while maintaining the strict safety standards required for financial systems.
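
A minimal Python sketch of the two mechanisms working together: a deep merge that resolves the `global` → `cluster` → `phase` → `application` layers by priority, and a `{{PLACEHOLDER}}` renderer so an application can override one parameter without redefining the whole JVM string. The layer contents are illustrative, not Toss Payments' actual files.

```python
import re

def deep_merge(base: dict, override: dict) -> dict:
    """Lower layers override higher-layer defaults, key by key."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

def render(template: str, params: dict) -> str:
    """Fill {{PLACEHOLDER}} slots so only parameters vary per application."""
    return re.sub(r"\{\{(\w+)\}\}",
                  lambda m: str(params[m.group(1)]), template)

# Layered settings, resolved in order: global -> phase -> application.
layers = [
    {"jvm_option": "-Xmx{{MAX_HEAP}} -XX:+UseG1GC "
                   "-XX:G1HeapRegionSize={{REGION_SIZE}}",
     "params": {"MAX_HEAP": "4g", "REGION_SIZE": "16m"},
     "canary": True},                     # global defaults
    {"canary": False},                    # phase=dev disables canary
    {"params": {"MAX_HEAP": "8g"}},       # app overrides one parameter only
]
config = {}
for layer in layers:
    config = deep_merge(config, layer)

print(render(config["jvm_option"], config["params"]))
# -Xmx8g -XX:+UseG1GC -XX:G1HeapRegionSize=16m
print("canary:", config["canary"])
```

The key property: the G1GC string is written once at the global layer, and the application layer touches only `MAX_HEAP`, so a typo can no longer corrupt the rest of the flags.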

daangn

Daangn's User Behavior Log Management

Daangn transitioned its user behavior log management from a manual, code-based Git workflow to a centralized UI platform called Event Center to improve data consistency and operational efficiency. By automating schema creation and enforcing standardized naming conventions, the platform reduced the technical barriers for developers and analysts while ensuring high data quality for downstream analysis. This transition has streamlined the entire data lifecycle, from collection in the mobile app to structured storage in BigQuery.

### Challenges of Code-Based Schema Management

Prior to Event Center, Daangn managed its event schemas (definitions that describe the ownership, domain, and custom parameters of a log) using Git and manual JSON files. This approach created several bottlenecks for the engineering team:

* **High Entry Barrier:** Users were required to write complex Spark `StructType` JSON files, which involved managing nested structures and specific metadata fields like `nullable` and `type`.
* **Inconsistent Naming:** Without a central enforcement mechanism, event names followed different patterns (e.g., `item_click` vs. `click_item`), making it difficult for analysts to discover relevant data.
* **Operational Friction:** Every schema change required a Pull Request (PR), manual review by the data team, and a series of CI checks, leading to slow iteration cycles and frequent communication overhead.

### The User Behavior Log Pipeline

To support data-driven decision-making, Daangn employs a robust pipeline that processes millions of events daily through several critical stages:

* **Collection and Validation:** Events are sent from the mobile SDK to an event server, which performs initial validation before passing data to GCP Pub/Sub.
* **Streaming Processing:** GCP Dataflow handles real-time deduplication, field validation, and data transformation (flattening) to prepare logs for storage.
* **Storage and Accessibility:** Data is stored in Google Cloud Storage and BigQuery, where custom parameters defined in the schema are automatically expanded into searchable columns, removing the need for complex JSON parsing in SQL.

### Standardizing Discovery via Event Center

The Event Center platform was designed to transform log management into a user-friendly, UI-driven experience while maintaining technical rigor.

* **Standardized Naming Conventions:** The platform enforces a strict "Action-Object-Service" naming rule, ensuring that all events are categorized logically across the entire organization.
* **Recursive Schema Builder:** To handle the complexity of nested JSON data, the team built a UI component that uses a recursive tree structure, allowing users to define deep data hierarchies without writing code (see the sketch after this entry).
* **Centralized Dictionary:** The platform serves as a "single source of truth" where any employee can search for events, view their descriptions, and identify the team responsible for specific data points.

### Technical Implementation and Integration

The system architecture was built to bridge the gap between a modern web UI and the existing Git-based infrastructure.

* **Tech Stack:** The backend is powered by Go (Gin framework) and PostgreSQL (GORM), while the frontend utilizes React, TypeScript, and TanStack Query for state management.
* **Automated Git Sync:** When a user saves a schema in Event Center, the system automatically triggers a GitHub Action that generates the necessary JSON files and pushes them to the repository, maintaining the codebase as the ultimate source of truth while abstracting the complexity.
* **Real-time Validation:** The UI provides immediate feedback on data types and naming errors, preventing invalid schemas from reaching the production pipeline.

Implementing a dedicated log management platform like Event Center is highly recommended for organizations scaling their data operations. Moving away from manual file management to a UI-based system not only reduces the risk of human error but also democratizes data access by allowing non-engineers to define and discover the logs they need for analysis.
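
A minimal sketch of the recursive conversion such a schema builder performs, emitting the Spark `StructType` JSON shape (`name`/`type`/`nullable`/`metadata`) the post mentions. The event fields and the UI-side tree format are assumptions for illustration.

```python
import json

def to_spark_field(name: str, node, nullable: bool = True) -> dict:
    """Recursively convert a UI schema tree into a Spark StructType field.
    Leaves are type strings; dicts become nested structs."""
    if isinstance(node, dict):
        spark_type = {"type": "struct",
                      "fields": [to_spark_field(k, v) for k, v in node.items()]}
    else:
        spark_type = node  # e.g. "string", "long", "boolean"
    return {"name": name, "type": spark_type,
            "nullable": nullable, "metadata": {}}

# Hypothetical event following an Action-Object-Service naming rule.
event_params = {
    "item": {"id": "long", "price": "long",
             "seller": {"id": "long", "region": "string"}},
    "referrer": "string",
}
schema = {"type": "struct",
          "fields": [to_spark_field(k, v) for k, v in event_params.items()]}
print(json.dumps(schema, indent=2))
```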

toss

Rethinking Design Systems

Toss Design System (TDS) argues that as organizations scale, design systems often become a source of friction rather than efficiency, leading teams to bypass them through "forking" or "detaching" components. To prevent this, TDS treats the design system as a product that must adapt to user demand rather than a set of rigid constraints to be enforced. By shifting from a philosophy of control to one of flexible expansion, they ensure that the system remains a helpful tool rather than an obstacle.

### The Limits of Control and System Fragmentation

* When a design system is too rigid, product teams often fork packages to make minor adjustments, which breaks the link to central updates and creates UI inconsistencies.
* Treating "system bypasses" as user errors is ineffective; instead, they should be viewed as unmet needs in the system's "supply."
* The goal of a modern design system should be to reduce the reason to bypass the system by providing natural extension points.

### Comparing Flat and Compound API Patterns

* **Flat Pattern:** These components hide internal structures and use props to manage variations (e.g., `title`, `description`). While easy to use, they suffer from "prop bloat" as more edge cases are added, making long-term maintenance difficult.
* **Compound Pattern:** This approach provides sub-components (e.g., `Card.Header`, `Card.Body`) for the user to assemble manually. This offers high flexibility for unexpected layouts but increases the learning curve and the amount of boilerplate code required.

### The Hybrid API Strategy

* TDS employs a hybrid approach, offering both Flat APIs for common, simple use cases and Compound APIs for complex, customized needs.
* Developers can choose a `FlatCard` for speed or a Compound `Card` when they need to inject custom elements like badges or unique button placements.
* To avoid the burden of maintaining two separate codebases, TDS uses a "primitive" layer where the Flat API is simply a pre-assembled version of the Compound components (see the sketch after this entry).

Design systems should function as guardrails that guide developers toward consistency, rather than fences that stop them from solving product-specific problems. By providing flexible architecture that supports exceptions, a system can maintain its relevance and ensure that teams stay within the ecosystem even as their requirements evolve.
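
A deliberately small sketch of the primitive-layer idea, written in Python for consistency with the other sketches in this digest (TDS itself is a React system): the Flat API is nothing more than a pre-assembled composition of the Compound primitives, so both share one implementation. Component names are illustrative.

```python
# Compound primitives: callers assemble them freely.
def card_header(title: str) -> str:
    return f"<header>{title}</header>"

def card_body(*children: str) -> str:
    return "<body>" + "".join(children) + "</body>"

def card(*sections: str) -> str:
    return "<card>" + "".join(sections) + "</card>"

# Flat API: a pre-assembled composition of the same primitives,
# so there is no second implementation to maintain.
def flat_card(title: str, description: str) -> str:
    return card(card_header(title), card_body(description))

# Common case: fast and uniform.
print(flat_card("Account", "Balance overview"))

# Edge case: drop down to the compound layer to inject a custom badge
# without forking the component.
print(card(card_header("Account"),
           card_body("Balance overview", "<badge>NEW</badge>")))
```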

line

Code Quality Improvement Techniques Part

LY Corporation's technical review highlights that making a class open for inheritance imposes a "tax" on its internal constraints, particularly immutability. While developers often use inheritance to create specialized versions of a class, doing so with immutable types can allow subclasses to inadvertently or intentionally break the parent class's guarantees. To ensure strict data integrity, the post concludes that classes intended to be immutable should be made final or designed around read-only interfaces rather than open for extension.

### The Risks of Open Immutable Classes

* Kotlin developers often wrap `IntArray` in an `ImmutableIntList` to avoid the overhead of boxed types while ensuring the collection remains unchangeable.
* If `ImmutableIntList` is marked as `open`, a developer might create a `MutableIntList` subclass that adds a `set` method to modify the internal `protected valueArray`, violating the "Immutable" contract of the parent type.
* Even if the internal state is `private`, a subclass can override the `get` method to return dynamic or state-dependent values, effectively breaking the expectation that the data remains constant.
* These issues demonstrate that any class with a "fundamental" name should be carefully guarded against unexpected inheritance in different modules or packages.

### Establishing Safe Inheritance Hierarchies

* Mutable objects should not inherit from immutable objects, as this inherently violates the immutability constraints established by the parent.
* Conversely, immutable objects should not inherit from mutable ones; this often leads to runtime errors (such as `UnsupportedOperationException`) when a user attempts to call modification methods like `add` or `set` on an immutable instance.
* The most effective design pattern is to use a "read-only" (unmodifiable) interface as a common parent, similar to how Kotlin distinguishes between `List` and `MutableList` (see the sketch after this entry).
* In this structure, mutable classes can inherit from the read-only parent without issue (adding new methods), and immutable classes can inherit from the read-only parent while adding stricter internal constraints.

To maintain high code quality and prevent logic errors, developers should default to making classes final when immutability is a core requirement. If shared functionality is needed across different types of lists, utilize composition or a shared read-only interface to ensure that the "immutable" label remains a truthful guarantee.
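
The post's examples are Kotlin; the sketch below renders the recommended hierarchy in Python to stay consistent with the other sketches in this digest. A read-only parent exposes access only; the immutable subtype adds a stricter guarantee, and the mutable subtype adds methods, so neither breaks the parent's contract.

```python
from abc import ABC, abstractmethod

class ReadOnlyIntList(ABC):
    """Read-only parent: exposes access, promises nothing about mutation."""
    @abstractmethod
    def get(self, index: int) -> int: ...
    @abstractmethod
    def size(self) -> int: ...

class ImmutableIntList(ReadOnlyIntList):
    """Adds the stricter guarantee: contents never change after creation."""
    def __init__(self, values):
        self._values = tuple(values)  # defensive, unchangeable copy
    def get(self, index): return self._values[index]
    def size(self): return len(self._values)

class MutableIntList(ReadOnlyIntList):
    """Adds mutation on top of the read-only contract; it never claims
    immutability, so no guarantee is broken."""
    def __init__(self, values):
        self._values = list(values)
    def get(self, index): return self._values[index]
    def size(self): return len(self._values)
    def set(self, index, value): self._values[index] = value

def total(xs: ReadOnlyIntList) -> int:
    # Callers that only read accept the common parent, mirroring how
    # Kotlin code accepts List rather than MutableList.
    return sum(xs.get(i) for i in range(xs.size()))

print(total(ImmutableIntList([1, 2, 3])), total(MutableIntList([4, 5])))
```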