- The paper presents a comprehensive survey of agentic LLMs, detailing advancements in chain-of-thought, self-reflection, and tool integration.
- It categorizes models into reasoning, acting, and interacting domains, highlighting their interdependence and real-world applications.
- The survey outlines future research directions to address data quality, hallucination, scalable agent behavior, and safety concerns.
Agentic LLMs: A Survey of Reasoning, Acting, and Interacting Models
This paper provides a comprehensive survey of the emerging field of agentic LLMs, defining them as LLMs that can reason, act, and interact (2503.23037). It organizes the literature based on these three core capabilities and discusses applications and future research directions.
Taxonomy of Agentic LLMs
The paper categorizes agentic LLMs into three main areas, reflecting their ability to reason, act, and interact:
- Reasoning: This category focuses on enhancing LLMs' decision-making through improved reasoning, reflection, and information retrieval. Sub-areas include multi-step reasoning, self-reflection, and retrieval augmentation.
- Acting: This category deals with LLMs that can perform actions in the real world, often as assistants. It covers world models, robot/tool integration, and various applications of LLM assistants.
- Interacting: This category explores LLMs in multi-agent systems, focusing on collaborative task-solving and simulating social behaviors. It includes social capabilities, role-based interaction, and open-ended societies.
The authors highlight that these categories are complementary, with advances in one area benefiting others. For instance, retrieval augmentation supports tool use, self-reflection enhances multi-agent collaboration, and reasoning improves all categories.
Reasoning in Agentic LLMs
The survey explores techniques for improving LLM reasoning capabilities, including:
- Chain of Thought (CoT): Prompting LLMs to generate intermediate reasoning steps to solve complex problems. Techniques like "Let's think step by step" have shown significant improvements in performance (2503.23037).
- Self-Consistency: Sampling diverse reasoning paths and selecting the most consistent answer through majority voting to mitigate hallucination (2503.23037).
- Interpreter and Debugger: Reformulating problems in a formal language such as Python so that a specialized system, such as an interpreter, can solve them. Debugger feedback on the generated code further improves code generation (2503.23037).
- Search Tree: Employing external control algorithms to explore a tree of reasoning steps, enabling backtracking and alternative solutions (2503.23037).
- Self-Reflection: LLMs assess and refine their own predictions through prompt-improvement loops, using external memory to store state information (2503.23037).
- Retrieval Augmentation: Augmenting LLMs with external knowledge bases for timely and specialized information retrieval at inference time (2503.23037).
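The self-consistency idea above can be sketched in a few lines: sample several reasoning paths, keep only their final answers, and take a majority vote. This is a minimal illustration, not a real implementation; `sample_reasoning_path` is a hypothetical stand-in for a sampled LLM chain of thought.

```python
from collections import Counter

def sample_reasoning_path(question, seed):
    # Hypothetical stand-in for one sampled chain-of-thought completion;
    # deterministic per seed here purely for illustration.
    fake_final_answers = {0: "42", 1: "42", 2: "41", 3: "42", 4: "7"}
    return fake_final_answers[seed % 5]

def self_consistency(question, n_samples=5):
    # Sample diverse reasoning paths and keep only their final answers.
    answers = [sample_reasoning_path(question, seed) for seed in range(n_samples)]
    # Majority vote: the most frequent final answer wins.
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes

print(self_consistency("What is 6 * 7?"))
```

In a real system the temperature of the sampler controls path diversity; the vote is over final answers only, so paths that disagree mid-derivation but converge on the same answer still reinforce each other.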
Acting in the World
This section covers LLMs that interact with the world through action models, robots, and tools:
- World Models: Learning surrogate models of the environment to enable sample-efficient training of policies (2503.23037).
- Vision-Language-Action (VLA) Models: Training models on robotic sequences to perform actions in visual scenes based on language prompts. VLA models achieve impressive zero-shot results in complex tasks (2503.23037).
- Robot Planning: Grounding LLMs in robotic affordances by integrating knowledge of the environment and robot capabilities. Techniques like SayCan and Inner Monologue enhance robot planning and interaction (2503.23037).
- Action Tools: Integrating LLMs with external tools through APIs, enabling them to perform tasks like calling search engines or using specialized services. Frameworks like ToolBench and EasyTool facilitate tool calling (2503.23037).
- Assistants: Developing virtual assistants for various applications, including conversational assistance, shopping, flight operations, medical support, and financial trading (2503.23037).
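The tool-calling pattern behind frameworks like ToolBench can be sketched as a registry-plus-dispatch loop: the model emits a structured tool call, the runtime executes it, and the result is fed back into the next model turn. This is a hedged sketch, not any framework's actual API; `fake_llm` and the two registered tools are hypothetical stand-ins.

```python
import json

# A tiny tool registry, sketching how a runtime exposes external APIs to an LLM.
TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
    "lookup": lambda key: {"capital_of_france": "Paris"}.get(key, "unknown"),
}

def fake_llm(prompt):
    # Hypothetical stand-in for the LLM: emits a structured tool call as JSON.
    if "capital" in prompt:
        return json.dumps({"tool": "lookup", "argument": "capital_of_france"})
    return json.dumps({"tool": "calculator", "argument": "2 + 3"})

def run_agent(prompt):
    # One act step: the model selects a tool and argument, the runtime
    # dispatches the call, and the result would seed the next model turn.
    call = json.loads(fake_llm(prompt))
    return TOOLS[call["tool"]](call["argument"])

print(run_agent("What is the capital of France?"))  # Paris
```

The key design choice is the structured intermediate format (JSON here): it makes the model's intent machine-checkable before any external API is actually invoked.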
Interacting in Multi-Agent Systems
The survey discusses LLMs in multi-agent simulations, focusing on:
- Social Capabilities: Examining social and interactive abilities in LLMs, such as conversation, etiquette, empathy, strategic behavior, and theory of mind. Benchmarks like GTBench and EgoSocialArena are used to evaluate these capabilities (2503.23037).
- Role-Based Interaction: Assigning LLMs distinct roles to perform tasks in pairs or teams, fostering cooperative or adversarial interactions. Frameworks like CAMEL and Multi-Agent Debate (MAD) are used to study role-based interactions (2503.23037).
- Open-Ended Societies: Simulating large-scale agent societies to study emergent behaviors, social dynamics, and norms. Platforms like Generative Agents and OASIS are used to model social interactions at scale (2503.23037).
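The role-based interaction pattern (as in CAMEL-style setups) can be sketched as two role-prompted agents alternating turns: a "user" agent decomposes the task into instructions and an "assistant" agent carries each one out. The scripted agents below are hypothetical stand-ins for role-prompted LLMs; this is a structural sketch, not CAMEL's actual protocol.

```python
def scripted_agent(role, replies):
    # Hypothetical scripted agent standing in for a role-prompted LLM:
    # each call consumes the next canned reply for that role.
    it = iter(replies)
    def respond(message):
        return f"[{role}] " + next(it)
    return respond

def role_play(user_agent, assistant_agent, task, max_turns=2):
    # Role-based loop: the user agent issues an instruction, the assistant
    # agent responds, and the response conditions the next instruction.
    transcript = [task]
    message = task
    for _ in range(max_turns):
        instruction = user_agent(message)
        transcript.append(instruction)
        message = assistant_agent(instruction)
        transcript.append(message)
    return transcript

user = scripted_agent("user", ["Outline the app.", "Now write the code."])
assistant = scripted_agent("assistant", ["Here is an outline.", "Here is the code."])
for line in role_play(user, assistant, "Task: build a trading assistant."):
    print(line)
```

Cooperative and adversarial settings (e.g., Multi-Agent Debate) share this skeleton; they differ mainly in the role prompts and in how the transcript is judged or terminated.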
Research Agenda
The authors propose a research agenda for agentic LLMs, focusing on:
- Training Data: Finetuning LLMs with inference-time reasoning data and exploring convergent reinforcement learning techniques (2503.23037).
- Hallucination: Addressing hallucination through self-verification, mechanistic interpretability, and open-world models (2503.23037).
- Agent Behavior: Scaling simulation infrastructure, distilling reasoning into small models, and modeling agent and human behavior (2503.23037).
- Self-Reflection: Developing in-model self-reflection mechanisms, exploring metacognition and personality, and automating scientific discovery (2503.23037).
- Safety: Addressing responsibility and liability issues, ensuring privacy and fairness, and expanding application areas for assistants (2503.23037).
Conclusion
Agentic LLMs are a rapidly evolving field with significant potential for various applications. The survey highlights the importance of reasoning, acting, and interacting capabilities and provides a roadmap for future research directions. Key areas for development include generating high-quality training data, mitigating hallucinations, scaling agent behavior, improving self-reflection, and ensuring the safety and ethical use of agentic LLMs.