Meta

20 posts

engineering.fb.com

Filter by tag

meta

Adapting the Facebook Reels RecSys AI Model Based on User Feedback (opens in new tab)

Meta has enhanced the Facebook Reels recommendation engine by shifting focus from traditional engagement signals, like watch time and likes, to direct user feedback. By implementing the User True Interest Survey (UTIS) model, the system now prioritizes content that aligns with genuine user preferences rather than just short-term interactions. This shift has resulted in significant improvements in recommendation relevance, high-quality content delivery, and long-term user retention. **Limitations of Engagement-Based Metrics** * Traditional signals like "likes" and "watch time" are often noisy and may not reflect a user’s actual long-term interests. * Models optimized solely for engagement tend to favor short-term value over the long-term utility of the product. * Internal research found that previous heuristic-based interest models only achieved 48.3% precision in identifying what users truly care about. * Effective interest matching requires understanding nuanced factors such as production style, mood, audio, and motivation, which implicit signals often miss. **The User True Interest Survey (UTIS) Model** * Meta collects direct feedback via randomized, single-question surveys asking users to rate video interest on a 1–5 scale. * The raw survey data is binarized to denoise responses and weighted to correct for sampling and nonresponse bias. * The UTIS model functions as a lightweight "alignment model layer" built on top of the main multi-task ranking system. * The architecture uses existing model predictions as input features, supplemented by engineered features that capture content attributes and user behavior. **Integration into the Ranking Funnel** * **Late Stage Ranking (LSR):** The UTIS score is used as an additional input feature in the final value formula, allowing the system to boost high-interest videos and demote low-interest ones. * **Early Stage Ranking (Retrieval):** The model aggregates survey data to reconstruct user interest profiles, helping the system source more relevant candidates during the initial retrieval phase. * **Knowledge Distillation:** Large sequence-based retrieval models are aligned using UTIS predictions as labels through distillation objectives. **Performance and Impact** * The deployment of UTIS has led to a measurable increase in the delivery of niche, high-quality content. * Generic, popularity-based recommendations that often lack depth have been reduced. * Meta observed robust improvements across core metrics, including higher follow rates, more shares, and increased user retention. * The system now offers better interpretability, allowing engineers to understand which specific factors contribute to a user’s sense of "interest match." To continue improving the Reels ecosystem, Meta is focusing on doubling down on personalization by tackling challenges related to sparse data and sampling bias while exploring more advanced AI architectures to further diversify recommendations.

meta

CSS at Scale With StyleX (opens in new tab)

Scaling CSS within massive codebases presents unique challenges that traditional styling methods often struggle to solve effectively. Meta’s StyleX addresses these issues by offering a system that combines the intuitive ergonomics of CSS-in-JS with the runtime performance of static CSS. By prioritizing atomic styling and definition deduplication, StyleX minimizes bundle sizes and has become the primary styling standard across Meta's entire suite of applications. ### Performance-Driven Styling Architecture * Combines a CSS-in-JS developer experience with a compiler that outputs static CSS to ensure high performance and zero runtime overhead. * Utilizes atomic styling to break down CSS into small, reusable classes, which prevents style sheets from growing linearly with the size of the codebase. * Automatically deduplicates style definitions during the build process, significantly reducing the final bundle size delivered to the client. * Exposes a simple, consistent API that allows developers to manage complex styles and themes while maintaining type safety. ### Standardization and Industry Adoption * Serves as the foundational styling system for Meta’s most prominent platforms, including Facebook, Instagram, WhatsApp, Messenger, and Threads. * Gained significant industry traction beyond Meta, with large-scale organizations such as Figma and Snowflake adopting it for their own web applications. * Acts as an open-source force multiplier, allowing Meta engineers and the broader community to collaborate on solving CSS-at-scale problems. * Provides a mature ecosystem that bridges the gap between the flexibility of JavaScript-based styling and the efficiency of traditional CSS. For engineering teams managing large-scale web applications where bundle size and styling maintainability are critical, StyleX offers a battle-tested framework. Developers can leverage this tool to achieve the performance of static CSS without losing the expressive power of modern JavaScript tooling.

meta

Python Typing Survey 2025: Code Quality and Flexibility As Top Reasons for Typing Adoption (opens in new tab)

The 2025 Typed Python Survey highlights that type hinting has transitioned from an optional feature to a core development standard, with 86% of respondents reporting frequent usage. While mid-career developers show the highest enthusiasm for typing, the ecosystem faces ongoing friction from tooling fragmentation and the complexity of advanced type logic. Overall, the community is pushing for a more robust system that mirrors the expressive power of TypeScript while maintaining Python’s hallmark flexibility. ## Respondent Demographics and Adoption Trends * The survey analyzed responses from 1,241 developers, the majority of whom are highly experienced, with nearly half reporting over a decade of Python expertise. * Adoption is highest among developers with 5–10 years of experience (93%), whereas junior developers (83%) and those with over 10 years of experience (80%) show slightly lower usage rates. * The lower adoption among seniors is attributed to the management of legacy codebases and long-standing habits formed before type hints were introduced to the language. ## Primary Drivers for Typing Adoption * **Incremental Integration:** Developers value the "gradual typing" approach, which allows them to add types to existing projects at their own pace without breaking the codebase. * **Improved Tooling and IDE Support:** Typing significantly enhances developer experience by enabling more accurate autocomplete, jump-to-definition, and inline documentation in IDEs. * **Bug Prevention and Readability:** Type hints act as living documentation that helps catch subtle bugs during refactoring and makes complex codebases easier for teams to reason about. * **Library Compatibility:** Features like Protocols and Generics are highly appreciated, particularly for their synergy with modern libraries like Pydantic and FastAPI that utilize type annotations at runtime. ## Technical Pain Points and Ecosystem Friction * **Third-Party Integration:** A major hurdle is the inconsistent quality or total absence of type stubs in massive libraries like NumPy, Pandas, and Django. * **Tooling Fragmentation:** Developers expressed frustration over inconsistencies between major type checkers like Mypy and Pyright, as well as the slow performance of Mypy in large projects. * **Conceptual Complexity:** Advanced features such as variance (co/contravariance), decorators, and complex nested Generics remain difficult for many developers to implement correctly. * **Runtime Limitations:** Because Python does not enforce types at the interpreter level, some developers find it difficult to justify the verbosity of typing when it offers no native runtime guarantees. ## Most Requested Type System Enhancements * **TypeScript Parity:** There is a strong demand for features found in TypeScript, specifically Intersection types (using the `&` operator), Mapped types, and Conditional types. * **Utility Types:** Developers are looking for built-in utilities like `Pick`, `Omit`, and `keyof` to handle dictionary shapes more effectively. * **Improved Structural Typing:** While `TypedDict` exists, respondents want more flexible, anonymous structural typing to handle complex data structures without excessive boilerplate. * **Performance and Enforcement:** There is a recurring request for an official, high-performance built-in type checker and optional runtime enforcement to bridge the gap between static analysis and execution. As the Python type system continues to mature, developers should prioritize incremental adoption in shared libraries and internal APIs to maximize the benefits of static analysis. While waiting for more advanced features like intersection types, focusing on tooling consistency—such as aligning team standards around a specific type checker—can mitigate much of the friction identified in the 2025 survey.

meta

DrP: Meta's Root Cause Analysis Platform at Scale (opens in new tab)

DrP is Meta’s programmatic root cause analysis (RCA) platform designed to automate incident investigations and reduce the burden of manual on-call tasks. By codifying investigation playbooks into executable "analyzers," the platform significantly lowers the mean time to resolve (MTTR) by 20% to 80% for over 300 teams. This systematic approach replaces outdated manual scripts with a scalable backend that executes 50,000 automated analyses daily, providing immediate context when alerts fire. ## Architecture and Core Components * **Expressive SDK:** Provides a framework for engineers to codify investigation workflows into "analyzers," utilizing a rich library of helper functions and machine learning algorithms. * **Built-in Analysis Tools:** The platform includes native support for anomaly detection, event isolation, time-series correlation, and dimension analysis to identify specific problem areas. * **Scalable Backend:** A multi-tenant execution environment manages a worker pool that handles thousands of requests securely and asynchronously. * **Workflow Integration:** DrP is integrated directly into Meta’s internal alerting and incident management systems, allowing for automatic triggering without human intervention. ## Authoring and Verification Workflow * **Template Bootstrapping:** Engineers use the SDK to generate boilerplate code that captures required input parameters and context in a type-safe manner. * **Analyzer Chaining:** The system allows for seamless dependency analysis by passing context between different analyzers, enabling investigations to span multiple interconnected services. * **Automated Backtesting:** Before deployment, analyzers undergo automated backtesting integrated into the code review process to ensure accuracy and performance. * **Decision Tree Logic:** Investigation steps are modeled as decision trees within the code, allowing the analyzer to follow different paths based on the data it retrieves. ## Execution and Post-Processing * **Trigger-based Analysis:** When an alert is activated, the backend automatically queues the relevant analyzer, ensuring findings are available as soon as an engineer begins triaging. * **Automated Mitigation:** A post-processing system can take direct action based on investigation results, such as creating tasks or submitting pull requests to resolve identified issues. * **DrP Insights:** This system periodically reviews historical analysis outputs to identify and rank the top causes of alerts, helping teams prioritize long-term reliability fixes. * **Alert Annotation:** Results are presented in both human-readable text and machine-readable formats, directly annotating the incident logs for the on-call responder. ## Practical Conclusion Organizations managing large-scale distributed systems should transition from static markdown playbooks to executable investigation code. By implementing a programmatic RCA framework like DrP, teams can scale their troubleshooting expertise and significantly reduce "on-call fatigue" by automating the repetitive triage steps that typically consume the first hour of an incident.