optimization-algorithms

2 posts

google

Introducing Nested Learning: A new ML paradigm for continual learning

Google Research has introduced Nested Learning, a paradigm that treats machine learning models as systems of interconnected, multi-level optimization problems rather than separate architectures and training rules. By unifying structure and optimization through varying update frequencies, this approach aims to mitigate "catastrophic forgetting," the tendency for models to lose old knowledge when acquiring new skills. The researchers validated this framework through "Hope," a self-modifying architecture that outperforms current state-of-the-art models in long-context memory and language modeling.

### The Nested Learning Paradigm

This framework shifts the view of machine learning from a single continuous process to a set of coherent, nested optimization problems. Each component within a model is characterized by its own "context flow" (the specific set of information it learns from) and its own update frequency.

* The paradigm argues that architecture (structure) and optimization (training rules) are fundamentally the same concept, differing only by their level of computational depth and update rates.
* Associative memory is used as a core illustrative concept, where the training process (backpropagation) is modeled as a system mapping data points to local error values.
* By defining an update frequency rate for each component, researchers can order these problems into "levels," allowing for a more unified and efficient learning system inspired by the human brain's neuroplasticity.

### Deep Optimizers and Refined Objectives

Nested Learning provides a principled way to improve standard optimization algorithms by viewing them through the lens of associative memory modules.

* Existing momentum-based optimizers often rely on simple dot-product similarity, which fails to account for how different data samples relate to one another.
* By replacing these simple similarities with standard loss metrics, such as L2 regression loss, the researchers derived new formulations for momentum that are more resilient to imperfect or noisy data.
* This approach turns the optimizer itself into a deeper learning component with its own internal optimization objective.

### Continuum Memory Systems and the "Hope" Architecture

The paradigm addresses the limitations of Large Language Models (LLMs), which are often restricted to either their immediate input window or static pre-trained knowledge.

* The researchers developed "Hope," a proof-of-concept architecture that utilizes multi-time-scale updates for its internal components.
* While standard Transformers act primarily as short-term memory, the Nested Learning approach allows for "continuum memory" that manages long-context information more effectively.
* Experimental results show that this self-modifying architecture achieves superior performance in language modeling compared to existing state-of-the-art models.

By recognizing that every part of a model is essentially an optimizer operating at a different frequency, Nested Learning offers a path toward AI that can adapt to new experiences in real time. This structural shift moves away from the "static pre-training" bottleneck and toward systems capable of true human-like neuroplasticity and lifelong learning.
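
As a rough illustration of the update-frequency idea in the summary above, the following Python sketch trains two components on the same error signal but lets each one apply its update on its own schedule. The `Level` class, the `train` function, the periods, and the toy squared-error objective are all assumptions made for illustration; this is not the Hope architecture or any code from the Google Research post.

```python
from dataclasses import dataclass


@dataclass
class Level:
    """One component with its own update period (hypothetical name)."""
    name: str
    period: int             # apply an update once every `period` steps
    weight: float = 0.0
    grad_sum: float = 0.0   # gradient accumulated since the last update
    updates: int = 0        # how many times this level has been updated


def train(levels, targets, lr=0.1):
    """Fit sum(level.weight) to a stream of scalar targets (toy objective)."""
    for step, y in enumerate(targets, start=1):
        pred = sum(level.weight for level in levels)
        # Gradient of (pred - y)^2 with respect to each weight is identical here.
        grad = 2.0 * (pred - y)
        for level in levels:
            level.grad_sum += grad
            if step % level.period == 0:
                # Slow levels see an averaged signal and change less often.
                level.weight -= lr * level.grad_sum / level.period
                level.grad_sum = 0.0
                level.updates += 1
    return levels


levels = train(
    [Level("fast", period=1), Level("slow", period=4)],
    targets=[1.0] * 8,
)
for level in levels:
    print(level.name, "updates:", level.updates, "weight:", round(level.weight, 3))
```

Sorting components by `period` gives one possible reading of the "levels" ordering described above: the fast component reacts to every sample, while the slow one consolidates information over a longer window.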

google

Introducing Mobility AI: Advancing urban transportation

Google Research has introduced Mobility AI, a comprehensive program designed to provide transportation agencies with data-driven tools for managing urban congestion, road safety, and evolving transit patterns. By leveraging advancements in measurement, simulation, and optimization, the initiative translates decades of Google’s geospatial research into actionable technologies for infrastructure planning and real-time traffic management. The program aims to empower policymakers and engineers to mitigate gridlock and environmental impacts through high-resolution modeling and continuous monitoring of urban transportation systems.

### Measurement: Understanding Mobility Patterns

The measurement pillar focuses on establishing a precise baseline of current transportation conditions using real-time and historical data.

* **Congestion Functions:** Researchers utilize machine learning and floating car data to develop city-wide models that mathematically describe the relationship between vehicle volume and travel speeds, even on roads with limited data.
* **Geospatial Foundation Models:** By applying self-supervised learning to movement patterns, the program creates embeddings that capture local spatial characteristics. This allows for better reasoning about urban mobility in data-sparse environments.
* **Analytical Formulation:** Specific research explores how adjusting traffic signal timing influences the distribution of flow across urban networks, revealing patterns in how congestion propagates.

### Simulation: Forecasting and Scenario Analysis

Mobility AI uses simulation technologies to create digital twins of cities, allowing planners to test interventions before implementing them physically.

* **Traffic Simulation API:** This tool enables the modeling of complex "what-if" scenarios, such as the impact of closing a major bridge or reconfiguring lane assignments on a highway.
* **High-Fidelity Calibration:** The simulations are calibrated using large-scale, real-world data to ensure that the virtual models accurately reflect local driver behavior and infrastructure constraints.
* **Scalable Evaluation:** These digital environments provide a risk-free way to assess how new developments, such as the rise of autonomous vehicles or e-commerce logistics, will reshape existing traffic patterns.

### Optimization: Improving Urban Flow

The optimization pillar focuses on applying AI to solve large-scale coordination problems, such as signal timing and routing efficiency.

* **Project Green Light:** This initiative uses AI to provide traffic signal timing recommendations to city engineers, specifically targeting a reduction in stop-and-go traffic to lower greenhouse gas emissions.
* **System-Wide Coordination:** Optimization algorithms work to balance the needs of multiple modes of transport, including public transit, cycling, and pedestrian infrastructure, rather than focusing solely on personal vehicles.
* **Integration with Google Public Sector:** Research breakthroughs from this program are being integrated into Google Maps Platform and Google Public Sector tools to provide agencies with accessible, enterprise-grade optimization capabilities.

Transportation agencies and researchers can leverage these foundational AI technologies to transition from reactive traffic management to proactive, data-driven policymaking. By participating in the Mobility AI program, public sector leaders can gain access to advanced simulation and measurement tools designed to build more resilient and efficient urban mobility networks.
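
To make the notion of a congestion function concrete, here is a minimal Python sketch of one classic textbook form, the Bureau of Public Roads (BPR) volume-delay curve, in which travel time grows as volume approaches capacity. This is a stand-in chosen for illustration, not the learned model described in the summary above; the function names, the default `alpha` and `beta` values (the commonly cited 0.15 and 4), and the example road figures are all assumptions.

```python
def bpr_travel_time(free_flow_time, volume, capacity, alpha=0.15, beta=4.0):
    """BPR curve: t = t0 * (1 + alpha * (volume / capacity) ** beta).

    Units are up to the caller; time comes back in the unit of free_flow_time.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if volume < 0:
        raise ValueError("volume must be non-negative")
    return free_flow_time * (1.0 + alpha * (volume / capacity) ** beta)


def average_speed(length_km, travel_time_h):
    """Average speed in km/h over a segment of the given length."""
    return length_km / travel_time_h


# Hypothetical segment: 5 km long, 0.1 h at free flow, capacity 1800 vehicles/h.
for volume in (0, 900, 1800, 2700):
    t = bpr_travel_time(0.1, volume, 1800)
    print(volume, "veh/h ->", round(average_speed(5.0, t), 1), "km/h")
```

A fixed curve like this needs `alpha`, `beta`, and `capacity` chosen per road; the summary above describes fitting such volume-speed relationships from floating car data instead, including for roads where observations are sparse.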