Naver / performance-optimization

2 posts

naver

@RequestCache: Building a Custom Annotation for HTTP Request-Scoped Caching

The development of `@RequestCache` addresses the performance degradation and network overhead caused by redundant external API calls or repetitive computations within a single HTTP request. By implementing a custom Spring-based annotation, developers can ensure that specific data is fetched only once per request and shared across different service layers. This approach provides a more elegant and maintainable solution than manual parameter passing or struggling with the limitations of global caching strategies.

### Addressing Redundant Operations in Web Services

* Modern web architectures often involve multiple internal services (e.g., Order, Payment, and Notification) that independently request the same data, such as a user profile.
* These redundant calls increase response times, put unnecessary load on external servers, and waste system resources.
* `@RequestCache` provides a declarative way to cache method results within the scope of a single HTTP request, ensuring the actual logic or API call is executed only once (see the usage sketch after this summary).

### Limitations of Manual Data Passing

* The common alternative of passing response objects as method parameters leads to "parameter drilling," where intermediate service layers must accept data they do not use just to pass it to a deeper layer.
* With the Strategy pattern, adding a new data dependency to an interface forces every implementation to change, even those that have no use for the new parameter, which violates clean architecture principles.
* Manual passing makes method signatures brittle and increases the complexity of refactoring as the call stack grows.

### The TTL Dilemma in Traditional Caching

* Using Redis or a local cache with Time-To-Live (TTL) settings is often insufficient for request-level isolation.
* If the TTL is set too short, the cache might expire before a long-running request finishes, leading to the very redundant calls the system was trying to avoid.
* If the TTL is too long, the cache persists across different HTTP requests, which is logically incorrect for data that should be fresh for every new user interaction.

### Leveraging Spring's Request Scope and Proxy Mechanism

* The implementation utilizes Spring's `@RequestScope` to manage the cache lifecycle, ensuring that data is automatically cleared when the request ends (see the mechanism sketch below).
* Under the hood, `@RequestScope` uses a singleton proxy that delegates calls to the specific instance stored for the current request in `RequestContextHolder`.
* The cache relies on request attributes, which `RequestContextHolder` exposes through `ThreadLocal` storage, guaranteeing isolation between different concurrent requests.
* Lifecycle management is handled by Spring's `FrameworkServlet`, which prevents memory leaks by automatically cleaning up request attributes after the response is sent.

For applications dealing with deep call stacks or complex service interactions, a request-scoped caching annotation provides a robust way to optimize performance without sacrificing code readability. This mechanism is particularly recommended when the same data is needed across unrelated service boundaries within a single transaction, ensuring consistency and efficiency throughout the request lifecycle.
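To make the declarative usage concrete, here is a minimal sketch. It is illustrative rather than the post's actual implementation: the annotation name `@RequestCache` follows the post, but `UserApiClient`, `UserProfile`, and `UserProfileService` are hypothetical names introduced for the example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

import org.springframework.stereotype.Service;

// Marker annotation: a method carrying it should execute at most once per HTTP request.
@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
@interface RequestCache {
}

// Hypothetical payload type and external API client used only for this sketch.
record UserProfile(long id, String name) {}

interface UserApiClient {
    UserProfile fetchProfile(long userId);
}

@Service
class UserProfileService {

    private final UserApiClient userApiClient;

    UserProfileService(UserApiClient userApiClient) {
        this.userApiClient = userApiClient;
    }

    // The first call in a request hits the external API; later calls from the
    // Order, Payment, or Notification services in the same request reuse the result.
    @RequestCache
    public UserProfile getUserProfile(long userId) {
        return userApiClient.fetchProfile(userId);
    }
}
```

Call sites simply inject `UserProfileService` as usual; nothing in the method signatures changes, which is the readability benefit the post contrasts with parameter drilling.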
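The request-scope mechanism can be sketched roughly as follows. This is not the post's code: it assumes Spring AOP is on the classpath, reuses the `@RequestCache` marker from the previous sketch, and simplifies the cache key derivation and exception handling for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.springframework.stereotype.Component;
import org.springframework.web.context.annotation.RequestScope;

// One instance per HTTP request; @RequestScope defaults to a class-based scoped proxy,
// so the singleton aspect below always reaches the store of the current request.
@Component
@RequestScope
class RequestCacheStore {

    // A request is handled on a single thread, so a plain HashMap suffices here.
    private final Map<String, Object> cache = new HashMap<>();

    Object computeIfAbsent(String key, Supplier<Object> loader) {
        // Note: null results are not cached in this simplified sketch.
        return cache.computeIfAbsent(key, k -> loader.get());
    }
}

// Intercepts @RequestCache methods and short-circuits repeated calls within one request.
@Aspect
@Component
class RequestCacheAspect {

    private final RequestCacheStore store;

    RequestCacheAspect(RequestCacheStore store) {
        this.store = store;
    }

    @Around("@annotation(requestCache)")
    public Object cacheWithinRequest(ProceedingJoinPoint joinPoint, RequestCache requestCache) {
        // Simplified key: method signature plus argument values.
        String key = joinPoint.getSignature().toLongString() + Arrays.toString(joinPoint.getArgs());
        return store.computeIfAbsent(key, () -> {
            try {
                return joinPoint.proceed();
            } catch (Throwable t) {
                // Simplified: a production aspect would rethrow the original exception.
                throw new IllegalStateException("Cached method invocation failed", t);
            }
        });
    }
}
```

Because the store is request-scoped, it is created lazily on first use and discarded by Spring's request lifecycle handling when the response completes, which is what removes the TTL tuning problem described above.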

naver

Naver TV

JVM applications often suffer from initial latency spikes because the Just-In-Time (JIT) compiler requires a "warm-up" period to optimize frequently executed code into machine code. While traditional strategies rely on simulated API calls to trigger this optimization, these methods often introduce side effects such as data pollution, log noise, and increased maintenance overhead. The approach presented here advocates a library-centric warm-up that targets core execution paths and dependencies directly, ensuring high performance from the first real request without the risks of full-scale API simulation.

### Limitations of Traditional API-Based Warm-up

* **Data and State Pollution:** Simulated API calls can inadvertently trigger database writes, send notifications, or pollute analytics data, requiring complex logic to bypass these side effects.
* **Maintenance Burden:** As business logic and API signatures change, developers must constantly update the warm-up scripts or "dummy" requests to match the current application state.
* **Operational Risk:** Relying on external dependencies or complex internal services during the warm-up phase can lead to deployment failures if the mock environment is not perfectly aligned with production.

### The Library-Centric Warm-up Strategy

* **Targeted Optimization:** Instead of hitting the entry-point controllers, the focus shifts to warming up heavy third-party libraries and internal utility classes (e.g., JSON parsers, encryption modules, and DB drivers).
* **Internal Execution Path:** By directly invoking methods within the application's service or infrastructure layer during the startup phase, the JIT compiler can reach Tier 4 (C2) optimization for critical code blocks (see the warm-up sketch after this summary).
* **Decoupled Logic:** Because the warm-up targets underlying libraries rather than specific business endpoints, the logic remains stable even when the high-level API changes.

### Implementation and Performance Verification

* **Reflection and Hooks:** The implementation uses application startup hooks to execute intensive code paths, ensuring the JVM is "hot" before the load balancer begins directing traffic to the instance.
* **JIT Compilation Monitoring:** Success is measured by tracking the number of JIT-compiled methods and the time taken to reach a stable state, specifically targeting the reduction of "cold" execution time (a measurement sketch follows below).
* **Latency Improvements:** Empirical data shows a significant reduction in P99 latency during the first few minutes after deployment, as the most CPU-intensive library functions are already pre-optimized.

### Advantages and Practical Constraints

* **Safer Deployments:** Removing the need for simulated network requests makes the deployment process more robust and prevents accidental side effects in downstream systems.
* **Granular Control:** Developers can selectively warm up only the most performance-sensitive parts of the application, saving startup time compared to a full-system simulation.
* **Incomplete Path Coverage:** A primary limitation is that library-only warming may miss branch-specific optimizations that occur only during full end-to-end request processing.

To achieve the best balance between safety and performance, engineering teams should prioritize warming up shared infrastructure libraries and high-overhead utilities. While it may not cover 100% of the application's execution paths, a library-based approach provides a more maintainable and lower-risk foundation for JVM performance tuning than traditional request-based methods.
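As a concrete illustration of the startup-hook idea, here is a minimal sketch. It assumes a Spring Boot application whose hot path involves Jackson JSON parsing; the payload, iteration count, and the `LibraryWarmUpRunner` class name are illustrative and not taken from the post.

```java
import java.util.Map;

import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;

import com.fasterxml.jackson.databind.ObjectMapper;

// Exercises a heavy library (Jackson) at startup so the JIT compiler has already
// compiled and optimized the parsing path before the first real request arrives.
@Component
class LibraryWarmUpRunner {

    private final ObjectMapper objectMapper = new ObjectMapper();

    @EventListener(ApplicationReadyEvent.class)
    public void warmUp() throws Exception {
        String json = "{\"id\":1,\"name\":\"warm-up\",\"tags\":[\"a\",\"b\",\"c\"]}";
        for (int i = 0; i < 10_000; i++) {
            // Repeated (de)serialization drives the hot methods past the JIT
            // invocation thresholds without touching any business endpoint.
            Map<?, ?> parsed = objectMapper.readValue(json, Map.class);
            objectMapper.writeValueAsString(parsed);
        }
    }
}
```

In a real deployment, readiness (or load-balancer registration) would be gated on this hook completing, which is what keeps cold traffic away from an un-warmed instance.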
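The "stable state" measurement can be approximated with the JVM's standard management API. The polling interval and stop condition below are illustrative assumptions; per-method compilation counts would typically come from flags such as `-XX:+PrintCompilation` or from JDK Flight Recorder rather than from this probe.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Polls cumulative JIT compilation time; once it stops growing between samples,
// the warm-up is treated as having reached a stable, fully compiled state.
public final class JitWarmUpProbe {

    public static void main(String[] args) throws InterruptedException {
        CompilationMXBean compilation = ManagementFactory.getCompilationMXBean();
        if (!compilation.isCompilationTimeMonitoringSupported()) {
            System.out.println("JIT compilation time monitoring is not supported on this JVM.");
            return;
        }

        long previous = -1;
        for (int sample = 0; sample < 30; sample++) {
            long totalMillis = compilation.getTotalCompilationTime();
            System.out.printf("sample %d: cumulative JIT compilation time = %d ms%n", sample, totalMillis);
            if (totalMillis == previous) {
                System.out.println("JIT activity has levelled off; warm-up appears complete.");
                break;
            }
            previous = totalMillis;
            Thread.sleep(1_000);
        }
    }
}
```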