scalability

3 posts

toss

Rethinking Design Systems

Toss Design System (TDS) argues that as organizations scale, design systems often become a source of friction rather than efficiency, leading teams to bypass them by "forking" or "detaching" components. To prevent this, TDS treats the design system as a product that must adapt to user demand rather than as a set of rigid constraints to be enforced. By shifting from a philosophy of control to one of flexible expansion, the team ensures that the system remains a helpful tool rather than an obstacle.

### The Limits of Control and System Fragmentation

* When a design system is too rigid, product teams often fork packages to make minor adjustments, which breaks the link to central updates and creates UI inconsistencies.
* Treating "system bypasses" as user errors is ineffective; they should instead be viewed as unmet needs in the system's "supply."
* The goal of a modern design system should be to reduce the reasons to bypass the system by providing natural extension points.

### Comparing Flat and Compound API Patterns

* **Flat Pattern:** These components hide internal structure and use props to manage variations (e.g., `title`, `description`). While easy to use, they suffer from "prop bloat" as more edge cases are added, making long-term maintenance difficult.
* **Compound Pattern:** This approach provides sub-components (e.g., `Card.Header`, `Card.Body`) for the user to assemble manually. It offers high flexibility for unexpected layouts but increases the learning curve and the amount of boilerplate code required.

### The Hybrid API Strategy

* TDS employs a hybrid approach, offering both Flat APIs for common, simple use cases and Compound APIs for complex, customized needs.
* Developers can choose a `FlatCard` for speed or the compound `Card` sub-components when they need to inject custom elements such as badges or unique button placements.
* To avoid the burden of maintaining two separate codebases, TDS uses a "primitive" layer in which the Flat API is simply a pre-assembled version of the Compound components (sketched in the example below).

Design systems should function as guardrails that guide developers toward consistency, rather than fences that stop them from solving product-specific problems. By providing a flexible architecture that supports exceptions, a system can maintain its relevance and ensure that teams stay within the ecosystem even as their requirements evolve.
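As a rough illustration of that primitive layering, here is a minimal TypeScript/React sketch in which the flat component is just a pre-assembled arrangement of the compound `Card` primitives. The component and prop names follow the summary's examples, but this is an assumed sketch, not the actual TDS implementation.

```tsx
import React from "react";

// Compound primitives: callers assemble these explicitly when they need
// non-standard layouts (badges, extra buttons, custom ordering).
type SlotProps = { children: React.ReactNode };

const CardRoot = ({ children }: SlotProps) => (
  <section className="card">{children}</section>
);
const CardHeader = ({ children }: SlotProps) => (
  <header className="card-header">{children}</header>
);
const CardBody = ({ children }: SlotProps) => (
  <div className="card-body">{children}</div>
);

// Exposed as a compound API: Card, Card.Header, Card.Body.
export const Card = Object.assign(CardRoot, {
  Header: CardHeader,
  Body: CardBody,
});

// Flat API for the common case: just props, no assembly required.
// Built from the same primitives, so there is only one codebase to maintain.
export function FlatCard({ title, description }: { title: string; description: string }) {
  return (
    <Card>
      <Card.Header>{title}</Card.Header>
      <Card.Body>{description}</Card.Body>
    </Card>
  );
}

// Common case: fast and consistent.
export const Simple = () => <FlatCard title="Savings" description="3.5% p.a." />;

// Edge case: drop down to the compound API to inject a badge
// without forking the design-system package.
export const WithBadge = () => (
  <Card>
    <Card.Header>
      Savings <span className="badge">New</span>
    </Card.Header>
    <Card.Body>3.5% p.a.</Card.Body>
  </Card>
);
```

The point of the layering is that the flat component never diverges from the compound one, so fixes to the primitives flow to both entry points.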

netflix

Behind the Streams: Real-Time Recommendations for Live Events Part 3 | Netflix TechBlog

Netflix manages the massive surge of concurrent users during live events by utilizing a hybrid strategy of prefetching and real-time broadcasting to deliver synchronized recommendations. By decoupling data delivery from the live trigger, the system avoids the "thundering herd" effect that would otherwise overwhelm cloud infrastructure during record-breaking broadcasts. This architecture ensures that millions of global devices receive timely updates and visual cues without requiring linear, inefficient scaling of compute resources.

### The Constraint Optimization Problem

To maintain a seamless experience, Netflix engineers balance three primary technical constraints: time to update, request throughput, and compute cardinality.

* **Time:** The specific duration required to coordinate and push a recommendation update to the entire global fleet.
* **Throughput:** The maximum capacity of cloud services to handle incoming requests without service degradation.
* **Cardinality:** The variety and complexity of unique requests necessary to serve personalized updates to different user segments.

### Two-Phase Recommendation Delivery

The system splits the delivery process into two distinct stages to smooth out traffic spikes and ensure high availability (a client-side sketch of the flow follows below).

* **Prefetching Phase:** While members browse the app normally before an event, the system downloads materialized recommendations, metadata, and artwork into the device's local cache.
* **Broadcasting Phase:** When the event begins, a low-cardinality "at least once" message is broadcast to all connected devices, triggering them to display the already-cached content instantaneously.
* **Traffic Smoothing:** This approach eliminates the need for massive, real-time data fetches at the moment of kickoff, distributing the heavy lifting of data transfer over a longer period.

### Live State Management and UI Synchronization

A dedicated Live State Management (LSM) system tracks event schedules in real time to ensure the user interface stays perfectly in sync with the production.

* **Dynamic Adjustments:** If a live event is delayed or ends early, the LSM adjusts the broadcast triggers to preserve accuracy and prevent "spoilers" or dead links.
* **Visual Cues:** The UI utilizes "Live" badging and dynamic artwork transitions to signal urgency and guide users toward the stream.
* **Frictionless Playback:** For members already on a title’s detail page, the system can trigger an automatic transition into the live player the moment the broadcast begins, reducing navigation latency.

To support global-scale live events, technical teams should prioritize edge-heavy strategies that pre-position assets on client devices. By shifting from a reactive request-response model to a proactive prefetch-and-trigger model, platforms can maintain high performance and reliability even during the most significant traffic peaks.
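For illustration only, here is a minimal TypeScript sketch of the prefetch-and-trigger flow on a client. The endpoint path, message shape, and function names (`prefetchLiveRow`, `onBroadcast`) are hypothetical stand-ins, not Netflix's actual interfaces.

```ts
// Phase 1 data: materialized recommendations and artwork cached ahead of the event.
interface LiveRow {
  eventId: string;
  titles: string[];
  artworkUrls: string[];
}

const cache = new Map<string, LiveRow>();

// Prefetching phase: runs while the member browses normally, well before kickoff,
// so the heavy data transfer is spread over time instead of spiking at the trigger.
async function prefetchLiveRow(eventId: string): Promise<void> {
  if (cache.has(eventId)) return; // idempotent: safe to call repeatedly
  const res = await fetch(`/api/live-row/${eventId}`); // hypothetical endpoint
  cache.set(eventId, (await res.json()) as LiveRow);
}

// Broadcasting phase: a tiny, low-cardinality message flips the UI for everyone.
// No data fetch happens here; the payload is already on the device.
type BroadcastMessage = { eventId: string; state: "LIVE" | "DELAYED" | "ENDED" };

function onBroadcast(msg: BroadcastMessage, render: (row: LiveRow) => void): void {
  const row = cache.get(msg.eventId);
  if (msg.state === "LIVE" && row) {
    render(row); // instant: recommendations and artwork come from the local cache
  } else if (msg.state === "ENDED") {
    cache.delete(msg.eventId); // avoid stale "Live" rows and dead links
  }
  // "At least once" delivery means duplicates are possible; because the handler
  // only reads the cache, replaying the same message is harmless.
}
```

The trade-off is deliberate: cheap, duplicate-tolerant control messages arrive at the moment of the spike, while the expensive personalized payloads are moved off the critical path.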

google

Securing private data at scale with differentially private partition selection

Google Research has introduced a novel parallel algorithm called MaxAdaptiveDegree (MAD) to enhance differentially private (DP) partition selection, a critical process for identifying common data items in massive datasets without compromising individual privacy. By utilizing an adaptive weighting mechanism, the algorithm optimizes the utility-privacy trade-off, allowing researchers to safely release significantly more data than previous non-adaptive methods. This breakthrough enables privacy-preserving analysis on datasets containing hundreds of billions of items, scaling up to three orders of magnitude larger than existing sequential approaches.

## The Role of DP Partition Selection

* DP partition selection identifies a meaningful subset of unique items from large collections based on their frequency across multiple users.
* The process ensures that no single individual's data can be identified in the final list by adding controlled noise and filtering out items that are not sufficiently common.
* This technique is a foundational step for various machine learning tasks, including extracting n-gram vocabularies for language models, analyzing private data streams, and increasing efficiency in private model fine-tuning.

## The Weight, Noise, and Filter Paradigm

* The standard approach to private partition selection begins by computing a "weight" for each item, typically representing its frequency, while ensuring "low sensitivity" so no single user has an outsized impact.
* Random Gaussian noise is added to these weights to obfuscate exact counts, preventing attackers from inferring the presence of specific individuals.
* A threshold determined by the DP parameters is then applied; only items whose noisy weights exceed this threshold are included in the final output (a minimal sketch of this paradigm appears below).

## Improving Utility via Adaptive Weighting

* Traditional non-adaptive methods often result in "wastage," where highly popular items receive significantly more weight than necessary to cross the selection threshold.
* The MaxAdaptiveDegree (MAD) algorithm introduces adaptivity by identifying items with excess weight and rerouting that weight to "under-allocated" items sitting just below the threshold.
* This strategic reallocation allows a larger number of less-frequent items to be safely released, significantly increasing the utility of the dataset without compromising privacy or computational efficiency.

## Scalability and Parallelization

* Unlike sequential algorithms that process data one piece at a time, MAD is designed as a parallel algorithm to handle the scale of modern user-based datasets.
* The algorithm can process datasets with hundreds of billions of items by breaking the problem down into smaller parts computed simultaneously across multiple processors.
* Google has open-sourced the implementation on GitHub to provide the research community with a tool that maintains robust privacy guarantees even at a massive scale.

Researchers and data scientists working with large-scale sensitive datasets should consider implementing the MaxAdaptiveDegree algorithm to maximize the amount of shareable data while strictly adhering to user-level differential privacy standards.
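To make the weight-noise-filter paradigm concrete, here is a small TypeScript sketch of the non-adaptive baseline it describes. The uniform per-user weighting, noise scale, and threshold are illustrative placeholders (in practice they are calibrated to the (ε, δ) privacy parameters); this is not Google's MAD implementation, which additionally reroutes excess weight from popular items to under-allocated ones and runs in parallel.

```ts
// Standard Gaussian sample via the Box-Muller transform.
function gaussian(stdDev: number): number {
  const u1 = 1 - Math.random(); // in (0, 1], avoids log(0)
  const u2 = Math.random();
  return stdDev * Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// Non-adaptive "weight, noise, filter" baseline for DP partition selection.
function selectPartitions(
  userItems: Map<string, Set<string>>, // userId -> items that user contributed
  noiseStdDev: number,                 // calibrated to the DP parameters in practice
  threshold: number                    // likewise derived from the DP parameters
): Set<string> {
  // 1. Weight: each user spreads a fixed total weight of 1 across their items,
  //    bounding any single user's influence (low sensitivity).
  const weights = new Map<string, number>();
  for (const items of userItems.values()) {
    const share = 1 / items.size;
    for (const item of items) {
      weights.set(item, (weights.get(item) ?? 0) + share);
    }
  }

  // 2. Noise + 3. Filter: perturb each weight with Gaussian noise and release
  //    only the items whose noisy weight clears the threshold.
  const released = new Set<string>();
  for (const [item, weight] of weights) {
    if (weight + gaussian(noiseStdDev) > threshold) {
      released.add(item);
    }
  }
  return released;
}
```

MAD's improvement targets step 1: weight far above the threshold is "wasted," so an adaptive pass trims it and reroutes the excess to items sitting just below the line, letting more of them survive the filter.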