automated-evaluation | Techlist.io

meta Apr 21, 2026

Modernizing the Facebook Groups Search to Unlock the Power of Community Knowledge

We’ve fundamentally transformed Facebook Groups Search to help people more reliably discover, sort through, and validate community content that’s most relevant to them. We’ve adopted a new hybrid retrieval architecture and implemented automated model-based evaluation to address…

automated-evaluation database-design machine-learning nlp+4

google May 5, 2025

Making complex text understandable: Minimally-lossy text simplification with Gemini

Google Research has introduced a novel system using Gemini models to perform minimally-lossy text simplification, a process designed to enhance readability while meticulously preserving original meaning and nuance. By utilizing an automated, iterative prompt-refinement loop, the system optimizes LLM instructions to achieve high-fidelity paraphrasing that avoids the information loss typical of standard summarization. A large-scale randomized study confirms that this approach significantly improves user comprehension across complex domains like law and medicine while simultaneously reducing cognitive load for the reader.

## Automated Evaluation and Fidelity Assessment

* The system moves beyond traditional metrics like Flesch-Kincaid by using a Gemini-powered 1-10 readability scale that aligns more closely with human judgment and comprehension ease.
* Fidelity is maintained through a specialized process using Gemini 1.5 Pro that maps specific claims from the original source text directly to the simplified output.
* This mapping method identifies and weights specific error types, such as information loss, unnecessary gains, or factual distortions, to ensure the output remains a faithful representation of the technical original.

## Iterative Prompt Optimization Loop

* To overcome the limitations and slow pace of manual prompt engineering, the researchers implemented a feedback loop where Gemini models optimize their own instructions.
* In this "LLMs optimizing LLMs" setup, Gemini 1.5 Pro analyzes the performance of simplification prompts and proposes refinements based on automated readability and fidelity scores.
* The optimization process ran for 824 iterations before performance plateaued, allowing the system to autonomously discover highly effective strategies for simplifying text without sacrificing detail.

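
The feedback loop described above can be sketched roughly as a greedy search over prompts. This is a minimal illustration, not the researchers' implementation: the names `Scores`, `combined`, `optimize_prompt`, `toy_evaluate`, and `toy_propose` are hypothetical, the equal weighting of readability and fidelity is an assumption, the patience-based stopping rule is a stand-in for "performance plateaued", and the two toy functions replace what would be Gemini calls in the real system.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Scores:
    readability: float  # 1-10 scale, higher is easier to read
    fidelity: float     # 0-1, higher means fewer meaning errors (assumed scale)


def combined(scores: Scores, fidelity_weight: float = 0.5) -> float:
    """Blend readability (rescaled to 0-1) and fidelity into one objective.

    The 50/50 weighting is an assumption made for this sketch.
    """
    return (1 - fidelity_weight) * (scores.readability / 10.0) + fidelity_weight * scores.fidelity


def optimize_prompt(
    initial_prompt: str,
    evaluate: Callable[[str], Scores],
    propose: Callable[[str, float], str],
    max_iters: int = 50,
    patience: int = 5,
) -> Tuple[str, float, List[Tuple[str, float]]]:
    """Keep a candidate prompt only if it beats the best score so far.

    Stops after `patience` consecutive non-improving proposals (a plateau).
    """
    best_prompt = initial_prompt
    best_score = combined(evaluate(best_prompt))
    history = [(best_prompt, best_score)]
    stale = 0
    for _ in range(max_iters):
        candidate = propose(best_prompt, best_score)
        score = combined(evaluate(candidate))
        if score > best_score:
            best_prompt, best_score = candidate, score
            history.append((best_prompt, best_score))
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_prompt, best_score, history


# Toy stand-ins for the model-based scorer and the model-based prompt refiner.
HINTS = ["Use short sentences.", "Keep every claim.", "Define jargon."]


def toy_evaluate(prompt: str) -> Scores:
    """Pretend each hint in the prompt improves both scores a little."""
    present = sum(h in prompt for h in HINTS)
    return Scores(readability=4.0 + 2.0 * present, fidelity=0.7 + 0.1 * present)


def toy_propose(prompt: str, score: float) -> str:
    """Append the first hint the prompt does not contain yet."""
    for hint in HINTS:
        if hint not in prompt:
            return prompt + " " + hint
    return prompt


best, score, history = optimize_prompt(
    "Simplify the text.", toy_evaluate, toy_propose, max_iters=20, patience=3
)
print(best)
```

Swapping the toy functions for real scorer and refiner calls leaves the loop unchanged, which is the main reason for passing them in as parameters.
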
## Validating Impact through Randomized Studies

* The effectiveness of the model was validated with 4,563 participants across 31 diverse text excerpts covering specialized fields like aerospace, philosophy, finance, and biology.
* The study utilized a randomized complete block design to compare the original text against simplified versions, measuring outcomes through nearly 50,000 multiple-choice question responses.
* Beyond accuracy, researchers measured cognitive effort using the NASA Task Load Index and tracked self-reported user confidence to ensure the simplification actually lowered the barrier to understanding.

This technology provides a scalable method for democratizing access to specialist knowledge by making expert-level discourse understandable to a general audience. The system is currently available as the "Simplify" feature within the Google app for iOS, offering a practical tool for users navigating complex digital information.
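
As a rough illustration of the randomized complete block design mentioned in the study description, the sketch below randomizes participants to conditions within each block so that every block sees every condition in near-equal numbers. The summary does not say how blocks were defined; treating each text excerpt as a block, the condition labels, and the function name `assign_within_blocks` are all assumptions made for this example.

```python
import random
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def assign_within_blocks(
    blocks: Dict[str, List[str]],
    conditions: Sequence[str],
    seed: int = 0,
) -> Dict[Tuple[str, str], str]:
    """Randomly assign participants to conditions separately inside each block.

    Shuffling within a block and then cycling through the conditions keeps
    the condition counts balanced (they differ by at most one per block).
    """
    rng = random.Random(seed)  # seeded so the assignment is reproducible
    assignment: Dict[Tuple[str, str], str] = {}
    for block_id, participants in blocks.items():
        shuffled = list(participants)
        rng.shuffle(shuffled)
        for i, participant in enumerate(shuffled):
            assignment[(block_id, participant)] = conditions[i % len(conditions)]
    return assignment


# Hypothetical blocks: each excerpt is read by its own group of participants.
blocks = {
    "excerpt_1": ["p1", "p2", "p3", "p4", "p5", "p6"],
    "excerpt_2": ["p7", "p8", "p9", "p10", "p11", "p12"],
}
assignment = assign_within_blocks(blocks, ["original", "simplified"], seed=42)

for block_id in blocks:
    counts = Counter(
        cond for (blk, _), cond in assignment.items() if blk == block_id
    )
    print(block_id, dict(counts))
```

Blocking this way means differences between excerpts (for example, one being inherently harder) do not get confused with the effect of simplification, since both versions are compared inside the same excerpt.
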

automated-evaluation ai llm nlp+3