- The paper presents a Plan-and-Act framework that leverages synthetic data and dynamic replanning to improve planning in LLM-based agents.
- It employs a dual-module system where a Planner generates high-level strategies and an Executor translates these plans into adaptive actions.
- On the WebArena-Lite benchmark, the framework reaches a 53.94% success rate, outperforming prior methods such as WebRL.
Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks
Introduction
The "Plan-and-Act" paper introduces a framework for improving the planning capabilities of LLM-based agents on complex tasks. The framework separates high-level planning and execution into two distinct components, the Planner and the Executor, designed to address the challenges of multi-step, long-horizon tasks. Because LLMs are not explicitly trained to plan and therefore struggle to generate accurate plans, the authors propose synthetic data generation to supply the Planner with a large set of training examples.
Figure 1: Plan-and-Act System Diagram. The Planner processes the user query to generate a high-level plan for the Executor to implement.
System Architecture
The Plan-and-Act system segregates the responsibilities of task planning and execution into two modules:
- Planner: This module formulates structured, high-level strategies for accomplishing the user's task. It is trained on synthetic data in which plans are annotated from ground-truth trajectories, which improves plan generation.
- Executor: This component translates the plans generated by the Planner into executable actions within the environment, adapting to the dynamic nature of task variables and environment changes using real-time feedback.
A key feature of the framework is Dynamic Replanning, which regenerates the plan as the environment changes, so the agent can adapt to unforeseen changes and recover from failures in the initial execution.
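The interaction between the two modules can be illustrated with a minimal sketch. It assumes that the Planner is called again after every Executor step (one possible replanning schedule, not necessarily the authors' exact one) and that the Planner, Executor, and environment are plain Python callables. All names here (`plan_and_act`, `Step`, the `toy_*` functions) are hypothetical and stand in for LLM calls and a real browser environment.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    plan_step: str    # high-level step the Executor was asked to carry out
    action: str       # low-level action the Executor chose
    observation: str  # what the environment returned after the action


def plan_and_act(
    query: str,
    planner: Callable[[str, List[Step]], List[str]],
    executor: Callable[[str, str], str],
    env: Callable[[str], str],
    max_steps: int = 10,
) -> List[Step]:
    """Run a plan/act loop, regenerating the plan before every action."""
    history: List[Step] = []
    observation = "start"
    for _ in range(max_steps):
        # Assumption: replan on every iteration using the full history.
        plan = planner(query, history)
        if not plan:  # an empty plan means the Planner considers the task done
            break
        action = executor(plan[0], observation)
        observation = env(action)
        history.append(Step(plan[0], action, observation))
    return history


# Toy stand-ins so the sketch runs without an LLM or a browser.
def toy_planner(query: str, history: List[Step]) -> List[str]:
    steps = ["open search page", "search for item", "open first result"]
    return steps[len(history):]  # remaining steps, given what was already done


def toy_executor(plan_step: str, observation: str) -> str:
    return f"do({plan_step})"


def toy_env(action: str) -> str:
    return f"ok after {action}"


trace = plan_and_act("find a blue mug", toy_planner, toy_executor, toy_env)
for step in trace:
    print(step.plan_step, "->", step.action)
```

Passing the full history to the Planner on each iteration is what lets it revise the remaining steps when an observation does not match expectations; the `max_steps` cap guards against a Planner that never returns an empty plan.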
Synthetic Data Generation
Action Trajectory Generation
To work around the scarcity of real-world action trajectory data, the authors use a scalable synthetic data pipeline. The pipeline generates candidate user queries, collects trajectories for them, and scores each trajectory with an outcome-supervised reward model. Only trajectories judged successful are kept for training.
Figure 2: Synthetic Data Generation Pipeline. Shows stages involved in generating and annotating data for training.
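The filtering step of the trajectory pipeline can be sketched as follows. This is an illustration only: the `Trajectory` type, the `filter_successful` helper, the 0.5 threshold, and the toy reward model are assumptions, and a real outcome-supervised reward model would be a learned model rather than a rule.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trajectory:
    query: str          # synthetic user query
    actions: List[str]  # actions taken while attempting the query


def filter_successful(
    trajectories: List[Trajectory],
    reward_model: Callable[[Trajectory], float],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> List[Trajectory]:
    """Keep only trajectories whose outcome score reaches the threshold."""
    return [t for t in trajectories if reward_model(t) >= threshold]


def toy_reward_model(t: Trajectory) -> float:
    # Stand-in for a learned model: success means the last action is "submit".
    return 1.0 if t.actions and t.actions[-1] == "submit" else 0.0


candidates = [
    Trajectory("book a flight", ["open site", "fill form", "submit"]),
    Trajectory("cancel an order", ["open site", "click wrong link"]),
]
kept = filter_successful(candidates, toy_reward_model)
print([t.query for t in kept])
```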
Grounded Plan Generation
This step reverse-engineers structured plans from executed trajectories, so that each plan is grounded in actions that were actually carried out in the task environment. As a result, the plans are executable and relevant to the context in which they run.
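One way to make "grounded" concrete is to attach to each plan step the span of trajectory actions it summarizes, then check that the spans cover the trajectory exactly. The sketch below assumes that representation; `PlanStep` and `is_grounded` are hypothetical names, and the plan itself is written by hand here, whereas in the pipeline it would be generated from the trajectory.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlanStep:
    description: str
    start: int  # index of the first action this step covers (inclusive)
    end: int    # index of the last action this step covers (inclusive)


def is_grounded(plan: List[PlanStep], actions: List[str]) -> bool:
    """Return True if the steps cover every action exactly once, in order."""
    expected_start = 0
    for step in plan:
        # Each step must begin where the previous one ended and be non-empty.
        if step.start != expected_start or step.end < step.start:
            return False
        expected_start = step.end + 1
    return expected_start == len(actions)


actions = [
    "click(search_box)",
    "type('blue mug')",
    "press(Enter)",
    "click(result_1)",
]
plan = [
    PlanStep("Search for the item", 0, 2),
    PlanStep("Open the first result", 3, 3),
]
print(is_grounded(plan, actions))
```

A check of this kind rejects plans that skip actions or refer to actions that never happened, which is the property that keeps training plans tied to real executions.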
Plan Expansion and Augmentation
The framework extends the Planner's dataset through synthetic augmentation, using the context-specific patterns identified during data creation to generate additional samples. This expansion supplements the original data and addresses data scarcity by increasing both volume and diversity.
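A rough sketch of such an expansion loop is shown below. It assumes that a few existing query-plan pairs are sampled as examples and handed to a generator that proposes a new pair, with duplicates discarded. The function `expand_dataset`, the `generate` callable (a stand-in for an LLM call), and the parameters `k` and `seed` are all hypothetical.

```python
import random
from typing import Callable, Dict, List

Example = Dict[str, str]  # {"query": ..., "plan": ...}


def expand_dataset(
    seeds: List[Example],
    generate: Callable[[List[Example]], Example],
    n_new: int,
    k: int = 2,
    seed: int = 0,
) -> List[Example]:
    """Propose up to n_new examples, each conditioned on k sampled seeds."""
    rng = random.Random(seed)  # fixed seed keeps the sampling reproducible
    seen = {ex["query"] for ex in seeds}
    new_examples: List[Example] = []
    for _ in range(n_new):
        shots = rng.sample(seeds, min(k, len(seeds)))
        candidate = generate(shots)
        # Drop candidates that repeat a seed query or an earlier candidate.
        if candidate["query"] not in seen:
            seen.add(candidate["query"])
            new_examples.append(candidate)
    return new_examples


def make_toy_generator() -> Callable[[List[Example]], Example]:
    # Stand-in for an LLM: cycles through three queries, so repeats occur.
    calls = {"n": 0}

    def generate(shots: List[Example]) -> Example:
        calls["n"] += 1
        query = f"synthetic query {calls['n'] % 3}"
        return {"query": query, "plan": shots[0]["plan"]}

    return generate


seed_examples = [
    {"query": "find a blue mug", "plan": "search; open first result"},
    {"query": "check order status", "plan": "open account; open orders"},
]
augmented = expand_dataset(seed_examples, make_toy_generator(), n_new=5)
print(len(augmented))
```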
Results and Evaluation
The Plan-and-Act framework was evaluated on the WebArena-Lite benchmark, where it achieved a success rate of 53.94%, a marked improvement over existing methods such as WebRL. The results indicate that synthetic data generation and dynamic replanning improve agent performance on complex, long-horizon tasks.
Conclusion
Plan-and-Act separates planning from execution and trains the Planner on synthetic data, improving the performance of LLM agents on dynamic, long-horizon tasks. Its modular architecture highlights the potential of scalable data generation to overcome the limitations LLMs face in detailed plan generation. Future work aims to integrate memory-enhanced reasoning and multi-modal inputs to extend the approach to more diverse digital environments.