daangn

Redux for Servers: Developing a

Traditional CRUD-based architectures often struggle to meet complex backend requirements such as audit logging, version history, and state rollbacks. To address these challenges, Daangn’s Frontend Core team developed **Ventyd**, an open-source TypeScript library that implements event sourcing on the server using patterns familiar to Redux users. By shifting the focus from storing "current state" to storing a "history of events," developers can build more traceable and resilient systems.

### Limitations of Traditional CRUD

* Standard CRUD (Create, Read, Update, Delete) patterns only record the final state of data, losing the context of "why" or "how" a change occurred.
* Implementing complex features like approval workflows or history tracking usually requires manual table management, such as adding `status` columns or creating separate history tables.
* Rollback logic in CRUD is often fragile and requires complex custom code to revert data to a specific previous state.

### The Event Sourcing Philosophy

* Instead of overwriting rows in a database, event sourcing records every discrete action (e.g., "Post Created," "Post Approved," "Profile Updated") as an immutable sequence.
* The system provides a built-in audit log, ensuring every change is attributed to a specific user, time, and reason.
* State can be reconstructed for any point in time by "replaying" events, enabling seamless "time travel" and easier debugging.
* It allows for deeper business insights by providing a full narrative of data changes rather than just a snapshot.

### Redux as a Server-Side Blueprint

* The library leverages the familiarity of Redux to bridge the gap between frontend and backend engineering.
* Just as Redux uses **Actions** and **Reducers** to manage state in the browser, event sourcing uses **Events** and **Reducers** to manage state in the database.
* The primary difference is persistence: Redux manages state in memory, while Ventyd persists the event stream to a database for permanent storage.

### Technical Implementation with Ventyd

* **Type-Safe Schemas**: Developers use `defineSchema` to define the shape of both the events and the resulting state, ensuring strict TypeScript validation.
* **Validation Library Support**: Ventyd is flexible, supporting various validation libraries including Valibot, Zod, TypeBox, and ArkType.
* **Reducer Logic**: The `defineReducer` function centralizes how the state evolves based on incoming events, making state transitions predictable and easy to test.
* **Database Agnostic**: The library is designed to be flexible regarding the underlying storage, allowing it to integrate with different database systems.

Ventyd offers a robust path for teams needing more than what basic CRUD can provide, particularly for internal tools requiring high accountability. By adopting this event-driven approach, developers can simplify the implementation of complex business logic while maintaining a clear, type-safe history of every action within their system.
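The Events-and-Reducers model described above can be illustrated in plain TypeScript. This is a minimal sketch of event sourcing in general and deliberately does not use Ventyd's `defineSchema` or `defineReducer`, whose exact signatures are not shown in this summary. The event names, the `PostState` shape, and the in-memory `log` array (standing in for a database table) are all assumptions made for illustration.

```typescript
// Hypothetical event and state types for a "post" entity; not Ventyd's API.
type PostEvent =
  | { type: "post:created"; title: string; at: number }
  | { type: "post:retitled"; title: string; at: number }
  | { type: "post:approved"; by: string; at: number };

interface PostState {
  title: string;
  status: "none" | "pending" | "approved";
  approvedBy: string | null;
}

const initialState: PostState = { title: "", status: "none", approvedBy: null };

// Pure reducer: (previous state, event) -> next state, as in Redux.
function reducer(state: PostState, event: PostEvent): PostState {
  switch (event.type) {
    case "post:created":
      return { ...state, title: event.title, status: "pending" };
    case "post:retitled":
      return { ...state, title: event.title };
    case "post:approved":
      return { ...state, status: "approved", approvedBy: event.by };
  }
}

// Rebuild state by replaying every event recorded at or before `until`.
function replay(events: readonly PostEvent[], until: number = Infinity): PostState {
  return events.filter((e) => e.at <= until).reduce(reducer, initialState);
}

// Append-only log standing in for the persisted event stream.
const log: PostEvent[] = [
  { type: "post:created", title: "Used bike", at: 1 },
  { type: "post:retitled", title: "Used road bike", at: 2 },
  { type: "post:approved", by: "moderator-7", at: 3 },
];

const current = replay(log);           // state after all events
const beforeApproval = replay(log, 2); // "time travel" to just before approval
```

Because the log is never overwritten, the audit trail (who approved, and when) and the rollback view (`beforeApproval`) both fall out of the same data without a separate history table.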

line

Code Quality Improvement Techniques Part 30

Code quality often suffers when functions share implicit dependencies, where the correct behavior of one relies on the state or validation provided by another. This "invisible" connection creates fragile code that is prone to runtime errors and logic mismatches during refactoring or feature expansion. To solve this, developers should consolidate related logic or make dependencies explicit to ensure consistency and safety.

## Problems with Implicit Function Dependencies

When logic is split across separate functions, such as one for validation (`isContentValid`) and another for processing (`getMessageText`), developers often rely on undocumented preconditions.

* **Fragile Runtime Safety:** In the provided example, `getMessageText` throws a runtime error if called on invalid data, assuming the caller has already checked `isContentValid`.
* **Maintenance Burden:** When new data types (e.g., a new message type) are added, developers must remember to update both functions to keep them in sync, increasing the risk of "forgotten" updates.
* **Hidden Logic Flow:** Callers might not realize the two functions are linked, leading to improper usage where the transformation function is called without the necessary prior validation.

## Consolidating Logic for Single-Source Truth

The most effective way to eliminate implicit dependencies is to merge filtering and transformation into a single function. This ensures that the code cannot reach a processing state without passing through the necessary logic.

* **Nullable Returns:** By changing the transformation function to return a nullable type (`String?`), the function can signal that a piece of data is "invalid" or "empty" directly through its return value.
* **Simplified Caller Logic:** The UI layer no longer needs to call two separate functions; it simply checks if the result of the transformation is null to determine visibility.
* **Elimination of Redundant Branches:** This approach reduces the number of `when` or `if-else` blocks that need to be maintained across the codebase.

## Establishing Explicit Consistency

In scenarios where separate functions for validation and transformation are required for clarity or architectural reasons, the validation logic should be defined in terms of the transformation.

* **Dependent Validation:** Instead of writing a separate `when` block for `isContentValid`, the function should simply check if `getMessageText` returns a non-null value.
* **Guaranteed Synchronization:** This structure makes the relationship between the two functions explicit and guarantees that if a message is deemed "valid," it will always produce a valid text output.
* **Improved Documentation:** Defining functions this way serves as self-documenting code, showing future developers exactly how the two operations are linked.

When functions share a "red thread" of logic, they should either be merged or structured so that one acts as the source of truth for the other. By removing the need for callers to remember implicit preconditions, you reduce the surface area for bugs and make the codebase significantly easier to extend.
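The two remedies above can be sketched together. The original article's examples use Kotlin (`String?`, `when`); the version below restates the idea in TypeScript with `string | null` and a `switch`. Only the function names `getMessageText` and `isContentValid` come from the summary; the `MessageContent` union and the `visibleTexts` helper are hypothetical.

```typescript
// Hypothetical message model; the article's real types are not shown in this summary.
type MessageContent =
  | { kind: "text"; body: string }
  | { kind: "image"; caption: string | null }
  | { kind: "deleted" };

// Single source of truth: returns null when there is nothing to display,
// instead of throwing on input that a separate validator was expected to reject.
function getMessageText(content: MessageContent): string | null {
  switch (content.kind) {
    case "text":
      return content.body.trim() === "" ? null : content.body;
    case "image":
      return content.caption;
    case "deleted":
      return null;
  }
}

// Validation is defined in terms of the transformation, so the two cannot drift apart.
function isContentValid(content: MessageContent): boolean {
  return getMessageText(content) !== null;
}

// Caller side: filtering and transformation happen in one pass.
function visibleTexts(contents: readonly MessageContent[]): string[] {
  return contents
    .map((content) => getMessageText(content))
    .filter((text): text is string => text !== null);
}
```

Adding a new `kind` to the union now produces a compile-time error in `getMessageText` alone, which is the only place that needs updating.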

naver

Smart Store Center's Zero-

Smart Store Center successfully migrated its legacy platform from Oracle to MySQL to overcome performance instability caused by resource contention and to reduce high licensing costs. By implementing a "dual write" strategy, the team achieved a zero-downtime transition while maintaining the ability to roll back immediately without data loss. This technical journey highlights the use of proxy data sources and transaction synchronization to ensure data integrity across disparate database environments.

## Zero-Downtime Migration via Dual Writing

* The migration strategy relied on "dual writing," where all Create, Update, and Delete (CUD) operations are performed on both the legacy Oracle and the new MySQL databases.
* In the pre-migration phase, Oracle served as the primary source for all traffic while MySQL recorded writes in the background to build a synchronized state.
* Once data was fully migrated and verified, the primary traffic was shifted to MySQL, with background writes continuing to Oracle to allow for an instantaneous rollback if performance issues occurred.
* This approach decoupled the database switch from application deployment, providing a safety net against critical failures that a simple redeploy could not fix.

## Technical Implementation for JPA

* To capture and replicate queries, the team utilized the `datasource-proxy` library, which allowed them to intercept Oracle queries and execute them against a separate MySQL DataSource.
* To prevent MySQL write failures from impacting the primary Oracle transactions, writes to the secondary database were managed using `TransactionSynchronizationManager`.
* By executing MySQL queries during the `afterCommit` phase, the team ensured that the primary service remained stable even if the secondary database encountered errors or performance bottlenecks.
* The transition required modifying JPA Entity configurations, such as changing primary key generation from Oracle Sequences to MySQL’s `IDENTITY` (auto-increment) and adjusting `columnDefinition` for types like `text`, `longtext`, and `decimal`.

## Centralized MyBatis Strategy

* To avoid modifying thousands of business logic points in a 10-year-old codebase, the team sought a way to implement dual writing for MyBatis at the architectural level.
* The implementation focused on the MyBatis `Configuration` and `MappedStatement` objects to capture SQL execution without requiring manual updates to individual repository interfaces.
* This centralized approach maintained the purity of the business logic and ensured that the dual-write logic could be easily removed once the migration was fully stabilized.

For organizations managing large-scale legacy migrations, the dual-write pattern combined with asynchronous transaction synchronization is a highly recommended safety mechanism. Prioritizing the isolation of secondary database failures ensures that the user experience remains unaffected while technical validation is performed in real-time.
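The failure-isolation idea behind the `afterCommit` replay can be modeled in a few lines. The team's actual implementation is Java (Spring's `TransactionSynchronizationManager` plus `datasource-proxy`); the TypeScript below is only a language-neutral sketch of the control flow. The `Db` interface, `MemoryDb` fake, and `DualWriter` class are hypothetical names, and the sketch ignores real concerns such as SQL dialect differences and bound parameters.

```typescript
// Hypothetical minimal database interface; a stand-in for a real DataSource.
interface Db {
  execute(sql: string): void;
}

// In-memory fake used only for this illustration.
class MemoryDb implements Db {
  readonly statements: string[] = [];
  private readonly failWhen: (sql: string) => boolean;

  constructor(failWhen: (sql: string) => boolean = () => false) {
    this.failWhen = failWhen;
  }

  execute(sql: string): void {
    if (this.failWhen(sql)) throw new Error(`write failed: ${sql}`);
    this.statements.push(sql);
  }
}

class DualWriter {
  private readonly primary: Db;
  private readonly secondary: Db;
  private pending: string[] = [];
  readonly secondaryFailures: string[] = [];

  constructor(primary: Db, secondary: Db) {
    this.primary = primary;
    this.secondary = secondary;
  }

  // Writes hit the primary first; a failure here propagates to the caller.
  write(sql: string): void {
    this.primary.execute(sql);
    this.pending.push(sql);
  }

  // Invoked after the primary transaction commits: replay on the secondary.
  afterCommit(): void {
    const statements = this.pending;
    this.pending = [];
    for (const sql of statements) {
      try {
        this.secondary.execute(sql);
      } catch {
        // Secondary errors are recorded, never rethrown, so the primary path is unaffected.
        this.secondaryFailures.push(sql);
      }
    }
  }

  // Invoked if the primary transaction rolls back: nothing is replicated.
  afterRollback(): void {
    this.pending = [];
  }
}

// Usage: the secondary rejects one statement, yet the primary keeps both writes.
const oracle = new MemoryDb();
const mysql = new MemoryDb((sql) => sql.includes("orders"));
const writer = new DualWriter(oracle, mysql);
writer.write("INSERT INTO products VALUES (1)");
writer.write("INSERT INTO orders VALUES (1)");
writer.afterCommit();
```

Recording failed statements (here in `secondaryFailures`) is one possible way to surface drift between the two databases for later verification; the summary does not say how the team tracked such failures.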

toss

Will developers be replaced by AI?

The current AI hype cycle is a significant economic bubble where massive infrastructure investments of $560 billion far outweigh the modest $35 billion in generated revenue. However, drawing parallels to the 1995 dot-com era, the author argues that while short-term expectations are overblown, the long-term transformation of the developer role is inevitable. The conclusion is that developers won't be replaced but will instead evolve into "Code Creative Directors" who manage AI through the lens of technical abstraction and delegation.

### The Economic Bubble and Amara’s Law

* The industry is experiencing a 16:1 imbalance between AI investment and revenue, with 95% of generative AI implementations reportedly failing to deliver clear efficiency improvements.
* Amara’s Law suggests that we are overestimating AI's short-term impact while potentially underestimating its long-term necessity.
* Much of the current "AI-driven" job market contraction is actually a result of companies cutting personnel costs to fund expensive GPU infrastructure and AI research.

### Jevons Paradox and the Evolution of Roles

* Jevons Paradox indicates that as the "cost" of producing code drops due to AI efficiency, the total demand for software and the complexity of systems will paradoxically increase.
* The developer’s identity is shifting from "code producer" to "system architect," focusing on agent orchestration, result verification, and high-level design.
* AI functions as a "power tool" similar to game engines, allowing small teams to achieve professional-grade output while amplifying the capabilities of senior engineers.

### Delegation as a Form of Abstraction

* Delegating a task to AI is an act of "work abstraction," which involves choosing which low-level details a developer can afford to ignore.
* The technical boundary of what is "hard to delegate" is constantly shifting; for example, a complex RAG (Retrieval-Augmented Generation) pipeline built for GPT-4 might become obsolete with the release of a more capable model like GPT-5.
* The focus for developers must shift from "what is easy to delegate" to "what *should* be delegated," distinguishing between routine boilerplate and critical human judgment.

### The Risks of Premature Abstraction

* Abstraction does not eliminate complexity; it simply moves it into the future. If the underlying assumptions of an AI-generated system change, the abstraction "leaks" or breaks.
* Sudden shifts in scaling (traffic surges), regulation (GDPR updates), or security (zero-day vulnerabilities) expose the limitations of AI-delegated work, requiring senior intervention.
* Poorly managed AI delegation can lead to "abstraction debt," where the cost of fixing a broken AI-generated system exceeds the cost of having written it manually from the start.

To thrive in this environment, developers should embrace AI not as a replacement, but as a layer of abstraction. Success requires mastering the ability to define clear boundaries for AI: delegating routine CRUD operations and boilerplate while retaining human control over architecture, security, and complex business logic.

aws

Announcing Amazon EC2 G7e instances accelerated by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs

Amazon has announced the general availability of EC2 G7e instances, a new hardware tier powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs designed for generative AI and high-end graphics. These instances deliver up to 2.3 times the inference performance of their G6e predecessors while providing significant upgrades to memory and bandwidth. This launch aims to provide a cost-effective solution for running medium-sized AI models and complex spatial computing workloads at scale.

**Blackwell GPU and Memory Advancements**

* The G7e instances feature NVIDIA RTX PRO 6000 Blackwell GPUs, which provide twice the memory and 1.85 times the memory bandwidth of the G6e generation.
* Each GPU provides 96 GB of memory, allowing users to run medium-sized models, such as those with up to 70 billion parameters, on a single GPU using FP8 precision.
* The architecture is optimized for both spatial computing and scientific workloads, offering the highest graphics performance currently available in the EC2 portfolio.

**High-Speed Connectivity and Multi-GPU Scaling**

* To support large-scale models, G7e instances utilize NVIDIA GPUDirect P2P, enabling direct communication between GPUs over PCIe interconnects with minimal latency.
* These instances offer four times the inter-GPU bandwidth compared to the L40S GPUs found in G6e instances, facilitating more efficient data transfer in multi-GPU configurations.
* Total GPU memory can scale up to 768 GB within a single node, supporting massive inference tasks across eight interconnected GPUs.

**Networking and Storage Performance**

* G7e instances provide up to 1,600 Gbps of network bandwidth, a four-fold increase over previous generations, making them suitable for small-scale multi-node clusters.
* Support for NVIDIA GPUDirect Remote Direct Memory Access (RDMA) via Elastic Fabric Adapter (EFA) reduces latency for remote GPU-to-GPU communication.
* The instances support GPUDirect Storage with Amazon FSx for Lustre, achieving throughput speeds up to 1.2 Tbps to ensure rapid model loading and data processing.

**System Specifications and Configurations**

* Under the hood, G7e instances are powered by Intel Emerald Rapids processors and support up to 192 vCPUs and 2,048 GiB of system memory.
* Local storage options include up to 15.2 TB of NVMe SSD capacity to handle high-speed data caching and local processing.
* The instance family ranges from the g7e.2xlarge (1 GPU, 8 vCPUs) to the g7e.48xlarge (8 GPUs, 192 vCPUs).

For developers ready to transition to Blackwell-based architecture, these instances are accessible through AWS Deep Learning AMIs (DLAMI). They represent a major step forward for organizations needing to balance the high memory requirements of modern LLMs with the cost efficiencies of the G-series instance family.
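The memory figures above follow from simple arithmetic, shown here as a rough sketch. It counts model weights only, at one byte per parameter for FP8, and ignores KV cache, activations, and runtime overhead, so real headroom is smaller than these numbers suggest. The function names are illustrative and the FP16/FP32 rows are included only for comparison.

```typescript
// Bytes needed per parameter at common precisions.
const BYTES_PER_PARAM = { fp32: 4, fp16: 2, fp8: 1 } as const;
type Precision = keyof typeof BYTES_PER_PARAM;

const GPU_MEMORY_GB = 96; // per-GPU memory stated in the announcement summary
const MAX_GPUS_PER_NODE = 8;

// Weight-only footprint in GB (10^9 bytes): billions of params x bytes per param.
function weightFootprintGB(paramsBillions: number, precision: Precision): number {
  return paramsBillions * BYTES_PER_PARAM[precision];
}

// True when the weights alone fit in a single GPU's memory.
function fitsOnSingleGpu(paramsBillions: number, precision: Precision): boolean {
  return weightFootprintGB(paramsBillions, precision) <= GPU_MEMORY_GB;
}

const totalNodeMemoryGB = GPU_MEMORY_GB * MAX_GPUS_PER_NODE; // 96 x 8 = 768
const seventyBAtFp8 = weightFootprintGB(70, "fp8");          // 70 GB, under 96 GB
const seventyBAtFp16 = weightFootprintGB(70, "fp16");        // 140 GB, over 96 GB
```

Under these assumptions a 70-billion-parameter model leaves about 26 GB on the GPU at FP8, while the same model at FP16 would need to be split across at least two GPUs.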

aws

AWS Weekly Roundup: Kiro CLI latest features, AWS European Sovereign Cloud, EC2 X8i instances, and more (January 19, 2026)

The January 19, 2026, AWS Weekly Roundup highlights significant advancements in sovereign cloud infrastructure and the general availability of high-performance, memory-optimized compute instances. The update also emphasizes the maturing ecosystem of AI agents, focusing on enhanced developer tooling and streamlined deployment workflows for agentic applications. These releases collectively aim to satisfy stringent regulatory requirements in Europe while pushing the boundaries of enterprise performance and automated productivity.

## Developer Tooling and Kiro CLI Enhancements

* New granular controls for web fetch URLs allow developers to use allowlists and blocklists to strictly govern which external resources an agent can access.
* The update introduces custom keyboard shortcuts to facilitate seamless switching between multiple specialized agents within a single session.
* Enhanced diff views provide clearer visibility into changes, improving the debugging and auditing process for automated workflows.

## AWS European Sovereign Cloud General Availability

* Following its initial 2023 announcement, this independent cloud infrastructure is now generally available to all customers.
* The environment is purpose-built to meet the most rigorous sovereignty and data residency requirements for European organizations.
* It offers a comprehensive set of AWS services within a framework that ensures operational independence and localized data handling.

## High-Performance Computing with EC2 X8i Instances

* The memory-optimized X8i instances, powered by custom Intel Xeon 6 processors, have moved from preview to general availability.
* These instances feature a sustained all-core turbo frequency of 3.9 GHz, which is currently exclusive to the AWS platform.
* The hardware is SAP certified and engineered to provide the highest memory bandwidth and performance for memory-intensive enterprise workloads compared to other Intel-based cloud offerings.

## Agentic AI and Productivity Updates

* Amazon Quick Suite continues to expand as a workplace "agentic teammate," designed to synthesize research and execute actions based on organizational insights.
* New technical guidance has been released regarding the deployment of AI agents on Amazon Bedrock AgentCore.
* The integration of GitHub Actions is now supported to automate the deployment and lifecycle management of these AI agents, bridging the gap between traditional DevOps and agentic AI development.

These updates signal a strategic shift toward highly specialized infrastructure, both in terms of regulatory compliance with the Sovereign Cloud and raw performance with the X8i instances. Organizations looking to scale their AI operations should prioritize the new deployment patterns for Bedrock AgentCore to ensure a robust CI/CD pipeline for their autonomous agents.

toss

The story of how I destroyed

Toss Payments modernized its inherited legacy infrastructure by building an OpenStack-based private cloud to operate alongside public cloud providers in an Active-Active hybrid configuration. By overcoming extreme technical debt, including servers burdened with nearly 2,000 manual routing entries, the team achieved a cloud-agnostic deployment environment that ensures high availability and cost efficiency. The transformation demonstrates how a small team can successfully implement complex open-source infrastructure through automation and the rigorous technical internalization of Cluster API and OpenStack.

### The Challenge of Legacy Networking

- The inherited infrastructure relied on server-side routing rather than network equipment, meaning every server carried its own routing table.
- Some legacy servers contained 1,997 individual routing entries, making manual management nearly impossible and preventing efficient scaling.
- Initial attempts to solve this via public cloud (AWS) faced limitations, including rising costs due to exchange rates, lack of deep visibility for troubleshooting, and difficulties in disaster recovery (DR) configuration between public and on-premise environments.

### Scaling OpenStack with a Two-Person Team

- Despite having only two engineers with no prior OpenStack experience, the team chose the open-source platform to maintain 100% control over the infrastructure.
- The team internalized the technology by installing three different versions of OpenStack dozens of times and simulating various failure scenarios.
- Automation was prioritized using Ansible and Terraform to manage the lifecycle of VMs and load balancers, enabling new instance creation in under 10 seconds.
- Deep technical tuning was applied, such as modifying the source code of the Octavia load balancer to output custom log formats required for their specific monitoring needs.

### High Availability and Monitoring Strategy

- To ensure reliability, the team built three independent OpenStack clusters operating in an Active-Active configuration.
- This architecture allows for immediate traffic redirection if a specific cluster fails, minimizing the impact on service availability.
- A comprehensive monitoring stack was implemented using Zabbix, Prometheus, Mimir, and Grafana to collect and visualize every essential metric across the private cloud.

### Managing Kubernetes with Cluster API

- To replicate the convenience of Public Cloud PaaS (like EKS), the team implemented Cluster API to manage the Kubernetes lifecycle.
- Cluster API treats Kubernetes clusters themselves as resources within a management cluster, allowing for standardized and rapid deployment across the private environment.
- This approach ensures that developers can deploy applications without needing to distinguish between the underlying cloud providers, fulfilling the goal of "cloud-agnostic" infrastructure.

### Practical Recommendation

For organizations dealing with massive technical debt or high public cloud costs, the Toss Payments model suggests that a "Private-First" hybrid approach is viable even with limited headcount. The key is to avoid proprietary black-box solutions and instead invest in the technical internalization of open-source tools like OpenStack and Cluster API, backed by a "code-as-infrastructure" philosophy to ensure scalability and reliability.

toss

Toss Income QA Platform: The Beginning

Toss's QA team developed an internal "QA Platform" to solve the high barrier to entry associated with using Swagger for manual testing and data setup. By transforming complex, multi-step API calls into a simple, button-based GUI, the team successfully empowered non-QA members to perform self-verification. This shift effectively moved quality assurance from a final-stage bottleneck to a continuous, integrated part of the development process, significantly increasing product delivery speed.

### Lowering the Barrier to Test APIs

* Existing Swagger documentation was functionally complete but difficult for developers or planners to use due to the need for manual JSON editing and sequential API execution.
* The QA Platform does not create new APIs; instead, it provides a GUI layer over existing Swagger Test APIs to make them accessible without technical documentation.
* The system offers two distinct interfaces: "Normal Mode" for simplified, one-click testing and "Swagger Mode" for granular control over request bodies and parameters.

### From Manual Clicks to Automation and Management

* Phase 1 focused on visual accessibility, allowing users to trigger complex data states via buttons rather than manual API orchestration.
* Phase 2 integrates existing automation scripts into the platform, removing the need for local environment setups and allowing anyone to execute automated test suites.
* The final phase aims to transition into a comprehensive Test Management System (TMS) tailored to the team's specific workflow, reducing reliance on third-party external tools.

### Redefining Quality as a Design Choice

* By reducing the time and mental effort required to run a test, verification became a frequent, daily habit for the entire product team rather than a chore for the QA department.
* Lowering the "cost" of testing replaced guesswork with data-driven confidence, allowing the team to move faster during development.
* This initiative reflects a philosophical shift where quality is no longer viewed as a final checklist item but as a core structural element designed into the development lifecycle.

The primary takeaway for engineering teams is that the speed of a product is often limited by the friction of its testing process. By building internal tools that democratize testing capabilities, making them available to anyone regardless of their technical role, organizations can eliminate verification delays and foster a culture where quality is a shared responsibility.

gitlab

GitLab Bug Bounty Program policy updates

GitLab has updated its HackerOne Bug Bounty program policies to improve transparency and streamline the reporting process for security researchers. These changes emphasize a shift toward local testing environments and provide much-needed clarity on the scope of emerging threats like AI prompt injection and denial-of-service attacks. By refining these guidelines, GitLab aims to protect its production infrastructure while ensuring researchers have clear, objective criteria for submitting high-impact vulnerabilities.

### Enhanced Testing Guidance

* GitLab now strongly recommends using the GitLab Development Kit (GDK) for local testing, allowing researchers to experiment with cutting-edge features without risking production stability.
* Researchers investigating potential Denial-of-Service (DoS) impacts are advised to use self-managed GitLab instances that meet or exceed standard installation requirements.
* Any testing performed on GitLab.com production architecture must utilize test accounts created specifically with the `@wearehackerone.com` email alias.

### Refined Vulnerability Scope

* Denial-of-Service (DoS) is generally classified as out of scope, though exceptions exist for application-layer vulnerabilities (such as ReDoS or logic bombs) that cause persistent service disruption via unauthenticated endpoints.
* Standalone prompt injection is no longer eligible for bounties unless it serves as a primary vector to achieve security breaches beyond the initial AI boundary.
* The policy clarifies the distinction between metadata enumeration and privacy breaches, noting that general information gathering remains out of scope while exposure of confidential data is strictly in scope.

### Transition and Grace Period

* To support researchers with ongoing investigations, GitLab is honoring a seven-day grace period for DoS reports submitted before January 22, 2026 (9:00 p.m. PT).
* Reports submitted during this window will be evaluated under the previous policy to ensure fairness and maintain trust within the researcher community.

Security researchers should immediately update their testing workflows by downloading the GitLab Development Kit and reviewing the updated CVSS calculator on the HackerOne program page to ensure their findings align with the new severity standards.

naver

Analysis of Naver Integrated Search AIB

The integration of AI Briefing (AIB) into Naver Search has led to a noticeable increase in Largest Contentful Paint (LCP) values, with p95 metrics rising to approximately 3.1 seconds. This shift is primarily driven by the architectural mismatch between traditional performance metrics and the dynamic, streaming nature of AI chat interfaces. The analysis concludes that while AIB appears to degrade performance on paper, the delay is largely a result of how browsers measure rendering in incremental UI patterns.

### Impact of AIB on Search Performance

* Since the introduction of AIB’s chat-based UI in July 2025, LCP p95 has moved beyond the 2.5-second target, showing a direct correlation with AIB traffic volume.
* The performance degradation is characterized by a "tail" effect, where a higher percentage of users fall into slower LCP buckets despite stable server response times.
* Unlike Google’s AI Overview, which renders in larger blocks, Naver’s AIB uses word-by-word animations and frequent UI updates that place a heavier burden on the browser's rendering engine.

### Client-Side Rendering Bottlenecks

* Performance profiling indicates that the delay is localized to the client-side rendering phase rather than the network or server.
* Initial rendering includes a skeleton UI period of roughly 900ms, followed by sequential text animations that push the final paint time back.
* Comparative data shows that when AIB is the LCP candidate, the p75 value reaches 4.5 seconds, significantly slower than other heavy components like map modules.

### Structural Misalignment with LCP Measurement

* **DOM Reconstruction:** After text animations finish, AIB rebuilds the DOM to enable citation highlighting and hover interactions, which triggers Chromium to update the LCP timestamp to this much later point.
* **Candidate Fragmentation:** Streaming text at the word level prevents the browser from identifying a single large text block; instead, small, insignificant fragments are often incorrectly selected as the LCP candidate.
* **Paint Invalidation:** Chromium’s rendering pipeline treats every new word in a streaming response as a layer update, causing repeated paint invalidations that push the `renderTime` forward frame-by-frame until the entire message is complete.

### New Metrics for AI-Driven Interfaces

* To more accurately reflect user experience, Naver is shifting toward Time to First Token (TTFT) as a primary metric for AIB, focusing on how quickly the first meaningful response appears.
* Standard LCP remains a valid quality indicator for static search results, but it is no longer treated as a universal benchmark for interactive AI components.
* Future performance management will involve more granular distribution analysis and "predictive" performance modeling rather than simply optimizing for a single threshold like the 2.5-second LCP mark.

To effectively manage performance in the era of generative AI, organizations should move away from relying solely on LCP for streaming interfaces. Implementing TTFT as a complementary metric provides a better representation of perceived speed, while optimizing the timing of DOM reconstructions can prevent unnecessary measurement delays in Chromium-based browsers.
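The gap between LCP and TTFT for a streaming answer can be shown with a toy timeline. The model below is a deliberate simplification and not Chromium's actual algorithm: it treats LCP as the first paint of the largest element seen so far and TTFT as the first paint that shows answer text. The `Paint` shape, the element names, and every number in `timeline` are made up for illustration and are not Naver's measurements.

```typescript
// One paint of a DOM element; a simplified stand-in for a browser paint record.
interface Paint {
  timeMs: number;
  elementId: string;
  areaPx: number;
  isAnswerToken: boolean; // true when the paint shows streamed answer text
}

// Simplified LCP: the first paint of the largest element, assuming time-ordered input.
function simulateLcp(paints: readonly Paint[]): Paint | null {
  let best: Paint | null = null;
  for (const paint of paints) {
    if (best === null || paint.areaPx > best.areaPx) {
      best = paint;
    }
  }
  return best;
}

// TTFT: time of the first paint that contains answer text.
function timeToFirstToken(paints: readonly Paint[]): number | null {
  const first = paints.find((paint) => paint.isAnswerToken);
  return first === undefined ? null : first.timeMs;
}

// Illustrative timeline: small word fragments stream in, then the DOM is rebuilt
// as one large block once the animation ends.
const timeline: Paint[] = [
  { timeMs: 200, elementId: "search-header", areaPx: 8000, isAnswerToken: false },
  { timeMs: 1100, elementId: "word-1", areaPx: 400, isAnswerToken: true },
  { timeMs: 1150, elementId: "word-2", areaPx: 450, isAnswerToken: true },
  { timeMs: 3000, elementId: "answer-rebuilt", areaPx: 60000, isAnswerToken: false },
];

const lcp = simulateLcp(timeline);       // the rebuilt block at 3000 ms
const ttft = timeToFirstToken(timeline); // 1100 ms
```

In this toy run the user starts reading at 1100 ms, yet the LCP-style metric reports 3000 ms because the rebuilt block is the largest element. That is the kind of mismatch the summary attributes to DOM reconstruction.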