google

Bringing 3D shoppable products online with generative AI

Google has developed a series of generative AI techniques to transform standard 2D product images into immersive, interactive 3D visualizations for online shopping. By evolving from early neural reconstruction methods to state-of-the-art video generation models like Veo, Google can now produce high-quality 360-degree spins from as few as three images. This progression significantly reduces the cost and complexity for businesses to create shoppable 3D experiences at scale across diverse product categories.

## First Generation: Neural Radiance Fields (NeRFs)

* Launched in 2022, this initial approach utilized NeRF technology to synthesize novel views and 360° spins, specifically for footwear on Google Search.
* The system required five or more images and relied on complex sub-processes, including background removal, XYZ prediction (NOCS), and camera position estimation.
* While a breakthrough, the technology struggled with "noisy" signals and complex geometries, such as the thin structures found in sandals or high heels.

## Second Generation: View-Conditioned Diffusion

* Introduced in 2023, this version addressed previous limitations by using a diffusion-based architecture to predict unseen viewpoints from limited data.
* The model utilized Score Distillation Sampling (SDS), which compares rendered 3D models against generated targets to iteratively refine parameters for better realism.
* This approach allowed Google to scale 3D visualizations to the majority of shoes viewed on Google Shopping, handling more diverse and difficult footwear styles.

## Third Generation: Generalizing with Veo

* The current advancement leverages Google’s Veo video generation model to transform product images into consistent, high-fidelity 360° videos.
* By training on millions of synthetic 3D assets, Veo captures complex interactions between light, texture, and geometry, making it effective for shiny surfaces and diverse categories like electronics and furniture.
* This method removes the need for precise camera pose estimation, increasing reliability across different environments.
* While the model can generate a 3D representation from a single image by "hallucinating" missing details, using three images significantly reduces errors and ensures high-fidelity accuracy.

These technological milestones mark a shift from specialized 3D reconstruction toward generalized AI models that make digital products feel tangible and interactive for consumers.

line

Code Quality Improvement Techniques Part 1

Maintaining a clear separation of concerns between software layers requires avoiding implicit dependencies where one layer relies on the specific implementation details of another. When different components share "hidden" knowledge (such as a repository fetching extra data specifically to trigger a UI state), the code becomes fragile and difficult to maintain. By passing explicit information through data models, developers can decouple these layers and ensure that changes in one do not inadvertently break the other.

### The Risks of Implicit Layer Dependency

When layers share implicit logic, such as a repository layer knowing the specific display requirements of the UI, the architecture becomes tightly coupled and prone to bugs.

* In the initial example, the repository fetches `MAX + 1` items specifically because the UI needs to display a "+" sign if more items exist.
* This creates a dependency where the UI logic for displaying counts relies entirely on the repository's internal fetching behavior.
* Code comments that explain one layer's behavior in the context of another (e.g., `// +1 is for the UI`) are a "code smell" indicating that responsibilities are poorly defined.

### Decoupling Through Explicit State

The most effective way to separate these concerns is to modify the data model to carry explicit state information, removing the need for "magic numbers" or leaked logic.

* By adding a boolean property like `hasMoreItems` to the `StoredItems` model, the repository can explicitly communicate the existence of additional data.
* The repository handles the logic of fetching `limit + 1`, determining the boolean state, and then truncating the list to the correct size before passing it up.
* The UI layer becomes "dumb" and only reacts to the provided data; it no longer needs to know about the `MAX_COUNT` constant or the repository's fetching strategy to determine its display state.

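
The explicit-state approach described above can be sketched roughly as follows. This is a minimal illustration in Python, not the article's own code: `StoredItems` and `hasMoreItems` come from the summary (rendered here as `has_more_items`), while `load_items`, `format_count`, `fetch_rows`, and the in-memory `DATABASE` are hypothetical names introduced only for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class StoredItems:
    """Data model that states explicitly whether more items exist."""
    items: List[str]
    has_more_items: bool


def load_items(fetch_rows: Callable[[int], List[str]], limit: int) -> StoredItems:
    """Repository-side logic (hypothetical): over-fetch by one, then truncate."""
    rows = fetch_rows(limit + 1)  # the extra row is used only to detect overflow
    return StoredItems(items=rows[:limit], has_more_items=len(rows) > limit)


def format_count(stored: StoredItems) -> str:
    """UI-side logic (hypothetical): reacts only to the model's explicit state."""
    suffix = "+" if stored.has_more_items else ""
    return f"{len(stored.items)}{suffix}"


# Stand-in data source for the example; a real repository would query storage.
DATABASE = [f"item-{i}" for i in range(12)]


def fetch(n: int) -> List[str]:
    return DATABASE[:n]


print(format_count(load_items(fetch, limit=10)))  # 10+
print(format_count(load_items(fetch, limit=20)))  # 12
```

Note that `format_count` never sees the limit or the `+ 1` trick; changing the fetching strategy would not require touching the display code.
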
### Strategic Placement of Logic and Constants

Determining where constants like `ITEM_LIST_MAX_COUNT` should reside is a key architectural decision that impacts code reuse and clarity.

* **Business Logic Layer:** Placing such constants in a dedicated Domain or Use Case layer is often the best approach for maintaining a clean architecture.
* **Model Classes:** If a separate logic layer is too complex for the project scale, the constant can be housed within the model class (e.g., using a companion object in Kotlin).
* **Dependency Direction:** Developers must ensure that functional logic does not leak into generic data models, as this can create confusing dependencies where a general-purpose model becomes tied to a specific feature's algorithm.

Effective software design relies on components maintaining a "proper distance" from one another. To improve code quality, favor explicit flags and clear data contracts over implicit assumptions about how different layers of the stack will interact.

discord

STAR WARS™ Makes Its Way to Discord

Discord has partnered with Lucasfilm to introduce a new Star Wars-themed collection of Avatar Decorations and Profile Effects to the platform's Shop. This collaboration draws inspiration from iconic cinematic moments, such as Darth Vader’s appearance in *Rogue One*, to offer high-quality customization options for fans. The release allows users to personalize their profiles with animations that celebrate the legacy of both the light and dark sides of the Force.

**Collaborative Design and Inspiration**

* The collection was developed through a direct collaboration between Discord’s in-house creative team and Lucasfilm to ensure authentic representation of the franchise.
* Visual designs are intended to evoke specific emotional responses, such as the tension of a Sith Lord’s presence or the inspiration of heroic Jedi.
* The "Darth Vader Arrives" profile effect specifically references the ominous red glow of the hallway scene from *Rogue One: A Star Wars Story*.

**Available Decorations and Effects**

* **Avatar Decorations**: The shop now includes specific frame animations such as two variants of Lightsabers, R2-D2 on Tatooine, a Space Battle, the Millennium Falcon Hyperdrive, Yoda on Dagobah, and a BB-8 animation.
* **Profile Effects**: These full-profile animations feature specialized visuals including two variants of Lightsaber Mastery, Entering Hyperspace, and the Darth Vader Arrives effect.
* These items are designed to fit seamlessly over standard Discord profile layouts to enhance user presence in group chats and servers.

**Platform Integration and Access**

* The Star Wars collection is accessible via the Discord Shop on desktop or through the "You" tab on the mobile application.
* Discord Nitro members receive a specialized discount on all items within the collection, and these discounts also apply when purchasing decorations or effects as gifts for others.
* Users requiring technical assistance with these new assets can refer to the platform's dedicated support documentation for troubleshooting.

To explore these new customization options, users should navigate to the Discord Shop on their preferred device. Nitro subscribers should ensure they are logged in before purchasing to take advantage of the member-only pricing available for this limited collection.

google

A new light on neural connections

Google and the Institute of Science and Technology Austria (ISTA) have developed LICONN, the first light-microscopy-based method capable of comprehensively mapping neurons and their connections in brain tissue. This approach overcomes the traditional reliance on expensive electron microscopy by utilizing physical tissue expansion and advanced machine learning to achieve comparable resolution and accuracy. The researchers successfully validated the technique by reconstructing nearly one million cubic microns of mouse cortex, demonstrating that light microscopy can now achieve "dense" connectomics at scale.

## Overcoming Resolution and Cost Barriers

* Connectomics has traditionally relied on electron microscopy (EM) because it offers nanometer-scale resolution, whereas standard light microscopy is limited by the diffraction limit of visible light.
* Electron microscopes cost millions of dollars and require specialized training, restricting high-level neuroscience research to wealthy, large-scale institutions.
* LICONN provides a more accessible alternative by utilizing standard light microscopy equipment already found in most life science laboratories.

## Advanced Tissue Expansion and Labeling

* The project uses a specialized expansion microscopy protocol where brain tissue is embedded in hydrogels that absorb water and physically swell.
* The technique employs three different hydrogels to create interweaving polymer networks that expand the tissue by 16 times in each dimension while preserving structural integrity.
* A whole-protein labeling process is used to provide the necessary image contrast, allowing for the tracing of densely packed neurites and the detection of synapses.

## Automated Reconstruction and Validation

* Google applied its established suite of machine learning and image analysis tools to automate the reconstruction of the expanded tissue samples.
* The team verified the accuracy of the method by tracing approximately 0.5 meters of neurites within mouse hippocampus tissue, confirming results comparable to electron microscopy.
* In a large-scale validation, the researchers provided an automated reconstruction of a volume of mouse cortex totaling nearly one million cubic microns.

## Integration of Molecular and Structural Data

* One of LICONN’s primary advantages over electron microscopy is its ability to capture multiple light wavelengths simultaneously.
* Researchers can use fluorescent markers to visualize specific proteins, neurotransmitters, and other molecules within the structural map.
* This dual-layered approach allows scientists to align molecular information with physical neuronal pathways, offering new insights into how brain circuits drive behavior and cognition.

LICONN represents a significant shift in neuroscience by democratizing high-resolution brain mapping. By replacing expensive hardware requirements with sophisticated chemical protocols and machine learning, this method enables a wider range of laboratories to contribute to the global effort of mapping the brain’s intricate wiring.
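
As a rough worked example of why the 16× expansion matters: swelling the tissue by a linear factor shrinks the microscope's blur spot by the same factor when measured in original-tissue units, and enlarges the volume to be imaged by that factor cubed. The sketch below assumes an optical resolution of 300 nm purely for illustration (a typical order of magnitude for visible-light diffraction, not a figure from the article); the function names are hypothetical.

```python
def effective_resolution_nm(optical_resolution_nm: float, expansion_factor: float) -> float:
    """Resolution in original-tissue units after uniform linear expansion."""
    return optical_resolution_nm / expansion_factor


def expanded_volume_factor(expansion_factor: float) -> float:
    """How much larger the sample volume becomes after expansion."""
    return expansion_factor ** 3


EXPANSION = 16                         # linear factor stated in the summary
ASSUMED_OPTICAL_RESOLUTION_NM = 300.0  # illustrative assumption, not from the article
TISSUE_VOLUME_UM3 = 1_000_000          # "nearly one million cubic microns"

print(effective_resolution_nm(ASSUMED_OPTICAL_RESOLUTION_NM, EXPANSION))  # 18.75
print(expanded_volume_factor(EXPANSION))                                  # 4096

# Volume the expanded sample occupies, converted to cubic millimetres
# (1 mm^3 = 1e9 um^3).
expanded_mm3 = TISSUE_VOLUME_UM3 * expanded_volume_factor(EXPANSION) / 1e9
print(f"{expanded_mm3:.3f} mm^3")                                         # 4.096 mm^3
```

Under these assumptions, a blur spot of a few hundred nanometres corresponds to under 20 nm in the original tissue, at the cost of imaging a sample roughly four thousand times larger by volume.
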

google

Making complex text understandable: Minimally-lossy text simplification with Gemini

Google Research has introduced a novel system using Gemini models to perform minimally-lossy text simplification, a process designed to enhance readability while meticulously preserving original meaning and nuance. By utilizing an automated, iterative prompt-refinement loop, the system optimizes LLM instructions to achieve high-fidelity paraphrasing that avoids the information loss typical of standard summarization. A large-scale randomized study confirms that this approach significantly improves user comprehension across complex domains like law and medicine while simultaneously reducing cognitive load for the reader.

## Automated Evaluation and Fidelity Assessment

* The system moves beyond traditional metrics like Flesch-Kincaid by using a Gemini-powered 1-10 readability scale that aligns more closely with human judgment and comprehension ease.
* Fidelity is maintained through a specialized process using Gemini 1.5 Pro that maps specific claims from the original source text directly to the simplified output.
* This mapping method identifies and weights specific error types, such as information loss, unnecessary gains, or factual distortions, to ensure the output remains a faithful representation of the technical original.

## Iterative Prompt Optimization Loop

* To overcome the limitations and slow pace of manual prompt engineering, the researchers implemented a feedback loop where Gemini models optimize their own instructions.
* In this "LLMs optimizing LLMs" setup, Gemini 1.5 Pro analyzes the performance of simplification prompts and proposes refinements based on automated readability and fidelity scores.
* The optimization process ran for 824 iterations before performance plateaued, allowing the system to autonomously discover highly effective strategies for simplifying text without sacrificing detail.

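
The general shape of such a feedback loop can be sketched as below. This is a toy illustration under stated assumptions, not Google's implementation: the greedy "keep only if the score improves" rule, the `patience`-based plateau test, and every function name are hypothetical, and the two `toy_*` functions stand in for what would be model calls (scoring a prompt's outputs and proposing a refined prompt).

```python
from typing import Callable, Tuple


def optimize_prompt(
    initial_prompt: str,
    evaluate: Callable[[str], float],
    propose: Callable[[str, float], str],
    max_iterations: int = 50,
    patience: int = 3,
) -> Tuple[str, float, int]:
    """Greedy refinement loop: keep a proposed prompt only if it scores higher.

    Stops after `patience` consecutive non-improving proposals (a plateau)
    or after `max_iterations`, whichever comes first.
    """
    best_prompt, best_score = initial_prompt, evaluate(initial_prompt)
    stale = 0
    iterations = 0
    while iterations < max_iterations and stale < patience:
        iterations += 1
        candidate = propose(best_prompt, best_score)
        score = evaluate(candidate)
        if score > best_score:
            best_prompt, best_score, stale = candidate, score, 0
        else:
            stale += 1
    return best_prompt, best_score, iterations


# --- Toy stand-ins for the model calls (all hypothetical) ---
HINTS = ["Use short sentences.", "Keep every claim.", "Define jargon."]


def toy_evaluate(prompt: str) -> float:
    """Pretend combined readability and fidelity score: one point per hint present."""
    return float(sum(hint in prompt for hint in HINTS))


def toy_propose(prompt: str, score: float) -> str:
    """Pretend optimizer model: append the first hint the prompt lacks."""
    for hint in HINTS:
        if hint not in prompt:
            return f"{prompt} {hint}"
    return prompt  # nothing left to add, so later rounds plateau


best, score, rounds = optimize_prompt("Simplify the text.", toy_evaluate, toy_propose)
print(best)
print(score, rounds)  # 3.0 6
```

In the toy run, three rounds each add a useful hint and three further rounds fail to improve, which triggers the plateau stop after six iterations.
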
## Validating Impact through Randomized Studies

* The effectiveness of the model was validated with 4,563 participants across 31 diverse text excerpts covering specialized fields like aerospace, philosophy, finance, and biology.
* The study utilized a randomized complete block design to compare the original text against simplified versions, measuring outcomes through nearly 50,000 multiple-choice question responses.
* Beyond accuracy, researchers measured cognitive effort using the NASA Task Load Index and tracked self-reported user confidence to ensure the simplification actually lowered the barrier to understanding.

This technology provides a scalable method for democratizing access to specialist knowledge by making expert-level discourse understandable to a general audience. The system is currently available as the "Simplify" feature within the Google app for iOS, offering a practical tool for users navigating complex digital information.
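
For readers unfamiliar with a randomized complete block design, the idea is that every block contains every condition, with units assigned to conditions at random inside each block. The sketch below is a generic illustration only: the summary does not say what formed a block in the actual study, so treating each excerpt as a block, the participant IDs, and the function name are all assumptions.

```python
import random
from collections import Counter
from typing import Dict, List

CONDITIONS = ["original", "simplified"]


def assign_within_blocks(
    blocks: Dict[str, List[str]], seed: int = 0
) -> Dict[str, Dict[str, str]]:
    """Randomly assign conditions inside each block, as evenly as possible."""
    rng = random.Random(seed)  # seeded so the example is reproducible
    assignment: Dict[str, Dict[str, str]] = {}
    for block_id, participants in blocks.items():
        shuffled = participants[:]
        rng.shuffle(shuffled)
        # Cycle through the conditions so every block contains every condition.
        assignment[block_id] = {
            person: CONDITIONS[i % len(CONDITIONS)]
            for i, person in enumerate(shuffled)
        }
    return assignment


# Hypothetical blocks: each excerpt is assumed to be one block.
blocks = {
    "excerpt-A": ["p1", "p2", "p3", "p4"],
    "excerpt-B": ["p5", "p6", "p7", "p8"],
}
plan = assign_within_blocks(blocks)
for block_id, mapping in plan.items():
    # Each block ends up with two participants per condition.
    print(block_id, dict(Counter(mapping.values())))
```

Because the comparison between original and simplified text happens inside each block, differences between blocks (for example, one excerpt being inherently harder than another) do not bias the estimated effect of simplification.
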