TRIP-PAL: Travel Planning with Guarantees by Combining Large Language Models and Automated Planners

Published 14 Jun 2024 in cs.AI | (2406.10196v1)

Abstract: Travel planning is a complex task that involves generating a sequence of actions related to visiting places subject to constraints and maximizing some user satisfaction criteria. Traditional approaches rely on formulating the problem in a given formal language, extracting relevant travel information from web sources, and using an adequate problem solver to generate a valid solution. As an alternative, recent LLM-based approaches directly output plans from user requests using language. Although LLMs possess extensive travel domain knowledge and provide high-level information like points of interest and potential routes, current state-of-the-art models often generate plans that lack coherence, fail to satisfy constraints fully, and do not guarantee the generation of high-quality solutions. We propose TRIP-PAL, a hybrid method that combines the strengths of LLMs and automated planners, where (i) LLMs get and translate travel information and user information into data structures that can be fed into planners; and (ii) automated planners generate travel plans that guarantee constraint satisfaction and optimize for users' utility. Our experiments across various travel scenarios show that TRIP-PAL outperforms an LLM when generating travel plans.

Summary

The paper presents a hybrid framework that uses LLMs for semantic extraction and symbolic planners for generating constraint-compliant travel itineraries.
It demonstrates that symbolic planning reliably produces valid and utility-maximizing solutions, outperforming LLM-only approaches in both feasibility and optimality.
Experimental evaluations across 100 itineraries highlight the method’s scalability and its potential for customizable, robust travel planning.

Hybrid LLM-Aided Symbolic Planning for Robust Travel Itinerary Generation

Introduction and Motivation

The travel planning problem incorporates complex real-world constraints over points of interest (POIs), user utility, and time, constituting a canonical oversubscription planning setting where not all desirable goals can be jointly achieved. Classical symbolic planners, when supplied with formal domain models, provide guarantees over solution correctness and constraint compliance, but their effectiveness is limited by the difficulty and expertise required for accurate problem formalization. Recent advances in LLMs have demonstrated powerful abilities in extracting, recombining, and abstracting information from large text corpora, which makes them effective at surfacing POIs and nominally assembling trip itineraries from user-specified queries. However, LLM-based itineraries frequently lack formal constraint-checking, leading to invalid, suboptimal, or incoherent solutions.

This work introduces TRIP-PAL, a hybrid framework that integrates LLMs for semantic extraction and preprocessing with symbolic automated planners for plan generation and validation (2406.10196). The pipeline takes user and destination input and uses an LLM to mine and curate structured POI data (utilities, visit durations, pairwise travel times). It then constructs a formal planning problem instance that is solved with a state-of-the-art cost-optimal symbolic planner, yielding itineraries with explicit feasibility and optimality guarantees.

Figure 1: High-level system diagram: the LLM extracts and structures tourism data for planning; the symbolic planner generates constraint-compliant itineraries maximizing user utility.

Approach and Methodological Framework

TRIP-PAL divides the travel planning process into modularized stages:

Information Extraction: For a specified destination, the LLM generates a candidate set of $N$ POIs, queries their popularity (utility), and estimates visit durations. For the travel-time structure, the system either queries an external API or assigns random pairwise travel-time samples.
Oversubscription Planning Formulation: The extracted data is compiled into a soft-goal PDDL problem in which visiting each POI is a soft goal with an associated utility, and in which moving between POIs and visiting a POI are formal actions. Classic oversubscription planning (OSP) is realized by representing time as planning objects, one per discrete 15-minute slot, and by adding skip-goal actions that carry a high penalty for each unachieved soft goal [keyder2009soft].
Automated Planning: The formalized problem is solved using an optimal PDDL planner (Fast Downward with seq-opt-lmcut). The result is a valid, utility-maximizing plan that satisfies all temporal and action-availability constraints.
Baseline LLM Planning: For comparison, the same curated information is provided as context to the LLM, which is prompted to select and schedule POIs into a day itinerary, optimizing for utility and remaining within time constraints.
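
The oversubscription formulation above can be illustrated with a small sketch. This is not the paper's implementation: TRIP-PAL compiles the problem to PDDL and solves it with Fast Downward, whereas the code below swaps in a brute-force search over visiting orders that is only practical for a handful of POIs. The names `POI`, `solve_osp`, and `symmetric`, the utilities, and the travel times are all hypothetical, and the sketch assumes the traveler does not need to return to the starting point. It does keep two ideas from the text: time is counted in 15-minute slots, and maximizing achieved utility is the same as minimizing the utility forgone on skipped soft goals.

```python
from dataclasses import dataclass
from itertools import permutations

SLOT_MINUTES = 15                       # slot length stated in the summary
HORIZON_SLOTS = 4 * 60 // SLOT_MINUTES  # hypothetical 4-hour budget = 16 slots


@dataclass(frozen=True)
class POI:
    name: str
    utility: int      # hypothetical popularity score
    visit_slots: int  # visit duration in 15-minute slots


def slots_used(order, travel, start, horizon_slots):
    """Total slots consumed by visiting `order` from `start`; None if over budget."""
    used, here = 0, start
    for poi in order:
        used += travel[(here, poi.name)] + poi.visit_slots
        if used > horizon_slots:
            return None
        here = poi.name
    return used


def solve_osp(pois, travel, start, horizon_slots):
    """Exhaustively pick the feasible visiting order with maximum utility.

    Total utility is constant, so maximising achieved utility is equivalent
    to minimising the utility forgone on skipped POIs (the soft-goal view).
    """
    best_order, best_utility = (), 0
    for k in range(1, len(pois) + 1):
        for order in permutations(pois, k):
            if slots_used(order, travel, start, horizon_slots) is None:
                continue
            utility = sum(p.utility for p in order)
            if utility > best_utility:
                best_order, best_utility = order, utility
    forgone = sum(p.utility for p in pois) - best_utility
    return [p.name for p in best_order], best_utility, forgone


def symmetric(pairs):
    """Mirror a one-directional travel table so (a, b) and (b, a) both exist."""
    return {**pairs, **{(b, a): t for (a, b), t in pairs.items()}}


# Hypothetical instance: three POIs around a hotel, travel times in slots.
POIS = [POI("museum", 9, 8), POI("park", 5, 4), POI("tower", 7, 6)]
TRAVEL = symmetric({
    ("hotel", "museum"): 2, ("hotel", "park"): 1, ("hotel", "tower"): 3,
    ("museum", "park"): 2, ("museum", "tower"): 1, ("park", "tower"): 2,
})

plan, utility, forgone = solve_osp(POIS, TRAVEL, "hotel", HORIZON_SLOTS)
print(plan, utility, forgone)  # ['museum', 'park'] 14 7
```

In this toy instance all three POIs cannot fit in 16 slots, so the search skips the tower and forgoes its utility of 7. An optimal planner reaches the same kind of answer without enumerating every ordering, which is why the exhaustive version here should be read only as a specification of what "optimal" means.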

Empirical Evaluation and Results

Experiments encompass 100 one-day itinerary planning tasks across 20 cities (10 popular, 10 less popular), with randomization over POI selection, durations, and travel-time matrices. Metrics include plan validity, utility, and runtime.

Key findings:

Validity: Symbolic planners always output valid itineraries under the input constraints. LLMs generate invalid plans in 86% of cases, violating visit times, travel times, or the global time budget; in the worst cases, the LLM recommends a visit 8x shorter than required.
Optimality: Even the LLM plans that pass validation are suboptimal: on average, their utility is lower than the planner's by a factor of $1.19$. Over the whole benchmark, the planner outperforms the LLM in $100\%$ of tasks when strict validity checks are applied, and in $79\%$ of tasks even when minor constraint violations are tolerated.
Scalability: Planner runtime is competitive with LLM-only approaches for small POI sets ($N<10$). For larger scenarios ($N>12$), planner runtime increases, but optimality and feasibility are preserved, whereas LLM plans degrade sharply in both.
POI Coverage: Symbolic planners find plans with higher average utility per POI and fit more POIs when the horizon allows, in contrast to LLM plans, which are prone to myopic or infeasible selections.
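
The three kinds of validity violation named above (visit time, travel time, global time budget) can be checked mechanically. The following is a minimal sketch of such a checker, not the paper's evaluation code. The `Visit` record, the `check_itinerary` function, and all durations are hypothetical, and times are assumed to be minutes measured from the start of the day plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Visit:
    poi: str
    start_min: int  # minutes since the plan starts
    end_min: int


def check_itinerary(visits, required_min, travel_min, start, budget_min):
    """Return a list of violation messages; an empty list means the plan is valid."""
    violations = []
    here, clock = start, 0
    for v in visits:
        # Travel check: the gap before this visit must cover the travel time.
        gap, need = v.start_min - clock, travel_min[(here, v.poi)]
        if gap < need:
            violations.append(f"travel {here}->{v.poi}: {gap} min allotted, {need} needed")
        # Visit check: the stay must be at least the required duration.
        stay = v.end_min - v.start_min
        if stay < required_min[v.poi]:
            violations.append(f"visit {v.poi}: {stay} min allotted, {required_min[v.poi]} needed")
        here, clock = v.poi, v.end_min
    # Budget check: the last visit must end within the global time budget.
    if clock > budget_min:
        violations.append(f"budget: plan ends at {clock} min, limit is {budget_min}")
    return violations


# Hypothetical requirements and travel times, in minutes.
REQUIRED = {"museum": 120, "park": 60}
TRAVEL = {("hotel", "museum"): 30, ("museum", "park"): 30}

valid_plan = [Visit("museum", 30, 150), Visit("park", 180, 240)]
# Allots 15 minutes to a 120-minute visit (8x too short) and 5 minutes of travel.
invalid_plan = [Visit("museum", 30, 45), Visit("park", 50, 110)]

print(check_itinerary(valid_plan, REQUIRED, TRAVEL, "hotel", 240))    # []
print(check_itinerary(invalid_plan, REQUIRED, TRAVEL, "hotel", 240))  # two violations
```

A checker of this shape also makes the "lenient" comparison easy to express: a tolerance could be subtracted from `need` or from the required stay before comparing, though the thresholds the paper uses for minor violations are not given in this summary.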

Claims and Comparative Analysis

The paper establishes three principal contributions:

Contradictory Evidence: Contrary to suggestions that LLMs can plan effectively in sequential decision settings when they possess relevant domain knowledge, the experiments show that even the strongest LLMs produce infeasible or suboptimal solutions for travel planning under constraints. This refutes claims that LLMs are sufficiently autonomous planners.
Guarantees via Hybridization: The hybrid LLM–planner pipeline uniquely ensures both utility optimization and constraint satisfaction, outperforming pure LLM baselines and earlier "verifier-LLM" and LLM-SMT approaches that lack either formal constraint integration or oversubscription realism.
Scalability and Practicality: The described system formalizes and automates the entire information extraction and planning pipeline, providing an end-to-end travel planning solution that can flexibly incorporate user-specific preferences and extend to continuous time and multi-day horizons at the cost of increased solve time.

Practical and Theoretical Implications

From a practical standpoint, this work demonstrates that LLMs should be leveraged as information extraction and modeling primitives within hybrid AI pipelines, not as end-to-end decision makers in critical domains requiring formal guarantees. The system’s modularity allows it to be readily adapted for user customization, incorporation of dynamic web queries, or application to non-tourism sequential decision scenarios (e.g., event scheduling, multi-resource allocation).

Theoretically, the results underscore intrinsic LLM limitations for logical reasoning and constraint satisfaction in multi-step domains [rao_cannot_plan, LLM_modul_plan_Rao], supporting the thesis that model-free large neural systems lack a compositional planning substrate comparable to symbolic reasoning engines. The operationalization of oversubscription planning via LLM-extracted instances sets a new direction for research in the automatic acquisition of planning models from naturalistic, unstructured user queries, bridging knowledge extraction with robust, optimal deliberative plan synthesis.

Future Directions

Open research questions remain regarding scalability to real-world, multi-day, or collaborative travel environments, efficient incorporation of real-time data streams (e.g., traffic, POI open times), modeling of explicit user utility functions in natural language, and extension to other domains requiring guaranteed optimal action sequences. An avenue of interest is the use of LLMs to instantiate richer planning models, potentially including mixed-integer programming or temporal HTN structures, and the dynamic feedback loop between planner output and LLM-driven user dialogue for explanation or iterative refinement.

Conclusion

TRIP-PAL establishes a robust hybrid paradigm for travel planning that exploits LLMs as knowledge extraction oracles but delegates action sequencing, constraint satisfaction, and utility optimization to automated symbolic planners. This design decisively outperforms LLM-only planning approaches in validity and optimality, even in domains where LLMs possess extensive prior knowledge. The work provides strong evidence for favoring hybrid symbolic–statistical designs over purely end-to-end neural reasoning architectures in constrained, utility-oriented sequential decision tasks, and sets the agenda for further research on data-driven planning instance generation and scalable user-aligned AI decision support systems (2406.10196).