Optimizing LLM-based trip planning
Google Research has developed a hybrid planning system that combines Large Language Models (LLMs) with traditional optimization algorithms to solve complex trip-planning tasks. While LLMs excel at interpreting qualitative user preferences—such as a desire for "lesser-known museums"—they often struggle with hard quantitative constraints like travel logistics and fluctuating opening hours. By using an LLM to generate an initial draft and a secondary algorithm to refine it against real-world data, the system produces itineraries that are both highly personalized and logistically feasible.
The Hybrid Planning Architecture
- The process begins with a Gemini model generating an initial trip plan based on the user's natural language query, identifying specific activities and their perceived importance.
- This draft is grounded using live data, incorporating up-to-date opening hours, transit schedules, and travel times between locations.
- Search backends simultaneously retrieve alternative activities to serve as potential substitutes if the LLM's original suggestions prove logistically impossible.
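The pipeline above can be sketched with a minimal data model. All names and shapes here are illustrative assumptions, not Google's actual API: an LLM-drafted activity carries an importance weight, and a grounding step overwrites its hours with live data.

```python
from dataclasses import dataclass, field

@dataclass
class Activity:
    name: str
    importance: float   # weight the LLM assigns to this suggestion
    open_hour: int      # grounded from live data (24h clock)
    close_hour: int
    duration_hours: int

@dataclass
class DraftPlan:
    activities: list[Activity]                         # LLM's initial suggestions
    alternatives: list[Activity] = field(default_factory=list)  # substitutes from search

def ground(plan: DraftPlan, live_hours: dict[str, tuple[int, int]]) -> DraftPlan:
    """Overwrite each activity's hours with up-to-date data, where available."""
    for a in plan.activities:
        if a.name in live_hours:
            a.open_hour, a.close_hour = live_hours[a.name]
    return plan
```

The alternatives list is what lets the later optimization stage swap in substitutes when a drafted activity turns out to be infeasible.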
Two-Stage Optimization Algorithm
- The first stage focuses on single-day scheduling, using dynamic programming and exhaustive search to find the most efficient sequence for subsets of activities.
- Each potential daily schedule is assigned a quality score based on its feasibility and how closely it aligns with the LLM's original intent.
- The second stage addresses the multi-day itinerary as a weighted variant of the "set packing problem," which ensures that activities do not overlap across different days.
- Because this multi-day optimization problem is NP-hard, the system employs local search heuristics to swap activities between days, iteratively improving the total score until the plan converges.
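The first stage can be illustrated with a small exhaustive search over ordered subsets of activities, checking feasibility against opening hours and travel times and scoring each candidate day by total importance. This is a toy sketch of the idea, not the production algorithm; the data shapes (tuples, a travel-time dict) are assumptions.

```python
from itertools import combinations, permutations

# Each activity: (name, importance, open_hour, close_hour, duration_hours)
# travel[(a, b)] = hours to get from activity a to activity b.

def best_day(activities, travel, day_start=9, day_end=18):
    """Exhaustively search ordered subsets; return (best score, best sequence)."""
    best_score, best_seq = 0.0, ()
    for r in range(1, len(activities) + 1):
        for subset in combinations(activities, r):
            for order in permutations(subset):
                t, feasible = day_start, True
                for i, (name, _, open_h, close_h, dur) in enumerate(order):
                    if i > 0:
                        t += travel[(order[i - 1][0], name)]
                    t = max(t, open_h)  # wait for opening if we arrive early
                    if t + dur > min(close_h, day_end):
                        feasible = False
                        break
                    t += dur
                score = sum(a[1] for a in order)
                if feasible and score > best_score:
                    best_score, best_seq = score, tuple(a[0] for a in order)
    return best_score, best_seq
```

Exhaustive search is only viable because each day holds a handful of activities; the second stage then treats each scored day as a candidate set and packs days together without repeating activities.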
Balancing Intent and Feasibility
- In practical testing, the system demonstrated a superior ability to handle nuanced requests, such as finding "lesser-known" museums in NYC, a request that traditional retrieval systems often fail by returning famous landmarks like the Met instead.
- The optimization layer specifically corrects geographical inefficiencies, such as the LLM suggesting a "zig-zag" route across San Francisco, by regrouping activities into logical clusters to minimize travel time.
- The system maintains the "spirit" of the LLM's creative suggestions—like visiting a specific scenic viewpoint—while ensuring the user doesn't arrive after the gates have closed.
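The zig-zag correction comes down to route geometry: visiting each neighborhood's activities together costs less travel than bouncing between them. A toy illustration, with invented coordinates standing in for real locations:

```python
from math import dist

def route_length(points):
    """Total Euclidean travel distance along a sequence of (x, y) stops."""
    return sum(dist(a, b) for a, b in zip(points, points[1:]))

# Four stops in two neighborhoods: west side (x ~ 0) and east side (x ~ 10).
stops = {"w1": (0, 0), "w2": (0, 1), "e1": (10, 0), "e2": (10, 1)}

zigzag    = [stops[k] for k in ("w1", "e1", "w2", "e2")]  # crosses town twice
clustered = [stops[k] for k in ("w1", "w2", "e1", "e2")]  # crosses town once

assert route_length(clustered) < route_length(zigzag)
```

The optimizer achieves the same effect implicitly: schedules that cluster nearby activities leave more time inside opening-hour windows, so they score higher.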
This hybrid approach suggests that the most reliable AI planning tools do not rely on LLMs in isolation. By using LLMs as creative engines for intent interpretation and delegating logistical verification to rigid algorithmic frameworks, developers can create tools that are both imaginative and practically dependable.