black-box-optimization

2 posts

meta

Efficient Optimization With Ax, an Open Platform for Adaptive Experimentation

Meta has released Ax 1.0, an open-source platform designed to automate and optimize complex, resource-intensive experimentation through machine learning. By applying Bayesian optimization, the platform helps researchers navigate vast configuration spaces to efficiently improve AI models, infrastructure, and hardware design. The release aims to bridge the gap between sophisticated mathematical theory and the practical requirements of production-scale engineering.

## Real-World Experimentation and Utility

* Ax is used extensively at Meta for diverse tasks, including tuning hyperparameter configurations, discovering optimal data mixtures for Generative AI, and optimizing compiler flags.
* The platform is built to handle the logistical "overhead" of experimentation, such as managing experiment state, automating orchestration, and providing diagnostic tools.
* It supports multi-objective optimization, allowing users to balance competing metrics and enforce "guardrail" constraints rather than just maximizing a single value.
* Applications extend beyond software to physical engineering, such as optimizing design parameters for AR/VR hardware.

## System Insight and Analysis

* Beyond finding optimal points, Ax serves as a diagnostic tool that helps researchers understand the underlying behavior of their systems.
* It includes built-in visualizations for Pareto frontiers, which illustrate the trade-offs between different metrics.
* Sensitivity analysis tools identify which input parameters have the greatest impact on the final results.
* The platform provides automated plots and tables to track optimization progress and visualize the effect of parameters across the entire input space.

## Technical Methodology and Architecture

* Ax uses Bayesian optimization, an iterative approach that balances "exploration" (sampling new areas) with "exploitation" (refining known good areas).
* The platform relies on **BoTorch** for its underlying Bayesian components and typically employs **Gaussian processes (GPs)** as surrogate models.
* GPs are preferred because they can make accurate predictions and quantify uncertainty even when given very few data points.
* The system uses an **Expected Improvement (EI)** acquisition function to score the potential value of new configurations relative to the current best-known result.
* This surrogate-based approach is designed to scale to high-dimensional settings with hundreds of tunable parameters, where traditional search methods are too costly.

To begin experimenting with these methods, developers can install the platform via `pip install ax-platform`. Ax 1.0 provides a robust framework for moving cutting-edge optimization research directly into production environments.
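The Expected Improvement criterion mentioned above has a closed form under a Gaussian surrogate. Below is a minimal, dependency-free sketch of that analytic formula for a maximization problem; it is illustrative only and is not Ax's implementation (Ax delegates acquisition computations to BoTorch):

```python
import math

def expected_improvement(mu, sigma, best, xi=0.0):
    """Analytic EI for maximization under a Gaussian posterior N(mu, sigma^2).

    mu, sigma: surrogate-model mean and standard deviation at a candidate point;
    best: best objective value observed so far; xi: optional exploration margin.
    """
    if sigma == 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(mu - best - xi, 0.0)
    z = (mu - best - xi) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal pdf
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal cdf
    return (mu - best - xi) * cdf + sigma * pdf

# A candidate whose predicted mean merely ties the incumbent still has positive
# EI when sigma > 0 -- this is how the criterion rewards exploration.
```

Each iteration of Bayesian optimization proposes the candidate that maximizes this score, evaluates it, re-fits the surrogate on the new observation, and repeats.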

line

How to evaluate AI-generated images?

LY Corporation is developing a text-to-image pipeline to automate the creation of branded character illustrations, aiming to reduce the manual workload for designers. The project focuses on using Stable Diffusion and Flow Matching models to generate high-quality images that strictly adhere to specific corporate style guidelines. By systematically evaluating model architectures and hyperparameters, the team seeks to turn subjective image quality into a quantifiable, reproducible technical process.

### Evolution of Image Generation Models

* **Diffusion Models:** These models generate images through a gradual denoising process. A forward process adds Gaussian noise via a Markov chain, and a reverse process restores the original image based on learned probability distributions.
* **Stable Diffusion (SD):** Unlike standard diffusion, which operates in pixel space, SD works within a "latent space" using a Variational Autoencoder (VAE). This significantly reduces computational load by denoising latent vectors rather than raw pixels.
* **SDXL and SD3.5:** SDXL improves prompt comprehension by adding a second text encoder (CLIP-G/14). SD3.5 introduces a major architectural shift by moving from diffusion to "Flow Matching," using a Multimodal Diffusion Transformer (MMDiT) that handles text and image modalities in a single block for better parameter efficiency.
* **Flow Matching:** This approach treats image generation as a deterministic movement through a vector field. Instead of removing stochastic noise, it learns the velocity required to transform a simple probability distribution into the complex data distribution.

### Core Hyperparameters for Output Control

* **Seeds and Latent Vectors:** The seed is the integer that determines the initial random noise. Since Stable Diffusion operates in latent space, this noise is essentially the starting latent vector that dictates the basic structure of the final image.
* **Prompts:** Textual inputs serve as the primary guide for the denoiser. Models are trained on image-caption pairs, allowing the U-Net or Transformer blocks to align the visual output with the user's descriptive intent.
* **Classifier-Free Guidance (CFG):** This parameter adjusts the weight of the prompt's influence. It uses the difference between noise predicted with a prompt and noise predicted without one (or with a negative prompt), letting users control how strictly the model follows the text instructions.

### Practical Recommendation

To achieve consistent results that match a specific brand identity, relying on prompts alone is insufficient; developers should implement automated hyperparameter search and black-box optimization. Transitioning to Flow Matching models like SD3.5 can provide a more deterministic generation path, which is critical when scaling the production of high-quality, branded assets.
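The CFG mechanism described above reduces to a linear extrapolation of two noise predictions. A minimal sketch, with plain Python lists standing in for the model's noise tensors (the function name is illustrative, not from any library):

```python
def cfg_combine(eps_uncond, eps_cond, guidance_scale):
    """Classifier-free guidance: start from the unconditional noise prediction
    and push it toward the conditional one by guidance_scale.

    scale = 0 ignores the prompt entirely; scale = 1 uses the conditional
    prediction as-is; larger scales exaggerate the prompt's influence.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]
```

When a negative prompt is supplied, its noise prediction replaces `eps_uncond`, so a high guidance scale simultaneously pushes the output toward the prompt and away from the negative prompt.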