Agentic Reasoning: A Streamlined Framework for Enhancing LLM Reasoning with Agentic Tools

Published 7 Feb 2025 in cs.AI and cs.CL | (2502.04644v2)

Abstract: We introduce Agentic Reasoning, a framework that enhances LLM reasoning by integrating external tool-using agents. Agentic Reasoning dynamically leverages web search, code execution, and structured memory to address complex problems requiring deep research. A key innovation in our framework is the Mind-Map agent, which constructs a structured knowledge graph to store reasoning context and track logical relationships, ensuring coherence in long reasoning chains with extensive tool usage. Additionally, we conduct a comprehensive exploration of the Web-Search agent, leading to a highly effective search mechanism that surpasses all prior approaches. When deployed on DeepSeek-R1, our method achieves a new state-of-the-art (SOTA) among public models and delivers performance comparable to OpenAI Deep Research, the leading proprietary model in this domain. Extensive ablation studies validate the optimal selection of agentic tools and confirm the effectiveness of our Mind-Map and Web-Search agents in enhancing LLM reasoning. The code is at: https://github.com/theworldofagents/Agentic-Reasoning


Summary

The paper introduces Agentic Reasoning, a framework that integrates web-search, coding, and Mind-Map agents to enhance LLM reasoning in complex tasks.
It demonstrates state-of-the-art performance on benchmarks like GPQA and GAIA, showcasing improved logical consistency and problem-solving effectiveness.
Experimental evaluations reveal its applicability in domains such as medical decision-making and strategic reasoning, highlighting its practical impact.


Introduction

Agentic Reasoning is a framework that enhances the reasoning capabilities of LLMs by integrating external tool-using agents. It uses web search, code execution, and structured memory to tackle complex questions that require extensive research and deep reasoning. Its novel component is the Mind-Map agent, which constructs a structured knowledge graph that stores reasoning context and tracks logical relationships, improving coherence in long reasoning chains that involve many tool calls.

Methodology

Agentic Reasoning Pipeline

The Agentic Reasoning pipeline integrates external agents into the LLM's reasoning process to enhance its problem-solving capabilities. The LLM decides dynamically when to invoke the Web-Search and Coding agents for external information and computation, and the Mind-Map agent for structured memory storage.

When the reasoning process needs a tool, the LLM emits a special token that signals which agent to call. For example, a web-search token triggers external information retrieval, and a coding token initiates a computational task. Each token is accompanied by a generated query message for the respective agent.
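The token-based invocation described above can be sketched as a small dispatcher. This is a minimal illustration, not the authors' implementation: the tag names (`<search>`, `<code>`, `<mindmap>`), the `run_tool_calls` function, and the stub agents are all hypothetical, since the summary does not specify the actual special tokens.

```python
import re
from typing import Callable, Dict

# Hypothetical tag syntax standing in for the paper's special tokens.
TOOL_CALL = re.compile(r"<(search|code|mindmap)>(.*?)</\1>", re.DOTALL)


def run_tool_calls(reasoning: str, agents: Dict[str, Callable[[str], str]]) -> str:
    """Replace every tool-call span in the text with the called agent's result."""

    def dispatch(match: re.Match) -> str:
        tool = match.group(1)            # which agent the token names
        query = match.group(2).strip()   # the generated query message
        result = agents[tool](query)
        return f"<{tool}_result>{result}</{tool}_result>"

    return TOOL_CALL.sub(dispatch, reasoning)


# Stub agents (assumptions); real ones would call a search API, a sandbox, etc.
agents = {
    "search": lambda q: f"top snippet for '{q}'",
    "code": lambda q: f"output of task '{q}'",
    "mindmap": lambda q: f"stored facts about '{q}'",
}

step = "I need a fact. <search>boiling point of ethanol</search> Then compare."
print(run_tool_calls(step, agents))
```

In a real loop the model would resume generation after each result is spliced in; here a single pass over a finished string keeps the sketch short.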

Figure 1: The overall workflow of Agentic Reasoning. Given a question, the reasoning LLM can invoke the Web-Search agent to retrieve external information, the Coding agent to perform quantitative computations, and the Mind-Map agent to structurally memorize the reasoning context, to provide a comprehensive solution.

Mind-Map Agent

The Mind-Map agent manages the reasoning context in real time by transforming reasoning chains into a structured knowledge graph. It clusters the context into groups using community clustering and summarizes each group with an LLM. The resulting structure helps the model maintain coherence over long reasoning sequences and serves as a queryable external memory.
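As a rough sketch of such a structured memory, the code below stores reasoning facts as (subject, relation, object) triples, groups entities, and answers simple entity queries. It rests on several assumptions: the `MindMap` class and its method names are hypothetical, connected components are used as a simple stand-in for the community clustering the paper mentions, and the LLM-based summarization step is omitted.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


class MindMap:
    """Toy knowledge graph built from reasoning-chain facts (hypothetical API)."""

    def __init__(self) -> None:
        self.edges: Dict[str, Set[str]] = defaultdict(set)
        self.triples: List[Triple] = []

    def add(self, subj: str, rel: str, obj: str) -> None:
        """Record one fact and link its two entities in an undirected graph."""
        self.triples.append((subj, rel, obj))
        self.edges[subj].add(obj)
        self.edges[obj].add(subj)

    def clusters(self) -> List[Set[str]]:
        """Group entities by connected component (stand-in for community clustering)."""
        seen: Set[str] = set()
        groups: List[Set[str]] = []
        for start in list(self.edges):
            if start in seen:
                continue
            group: Set[str] = set()
            stack = [start]
            while stack:  # iterative depth-first traversal
                node = stack.pop()
                if node in group:
                    continue
                group.add(node)
                stack.extend(self.edges[node] - group)
            seen |= group
            groups.append(group)
        return groups

    def query(self, entity: str) -> List[Triple]:
        """Return every stored fact that mentions the entity."""
        return [t for t in self.triples if entity in (t[0], t[2])]


mm = MindMap()
mm.add("patient", "has", "ARDS")
mm.add("ARDS", "managed_with", "PEEP")
mm.add("Werewolf", "is_a", "game")
print(mm.clusters())
print(mm.query("ARDS"))
```

A fuller version would pass each cluster to an LLM for summarization, so that the reasoning model can retrieve a compact summary instead of raw triples.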

Web-Search Agent

The Web-Search agent enhances the LLM by breaking down queries, retrieving relevant information, and ranking web pages. This agent iteratively refines queries and processes relevance feedback, allowing effective integration of external knowledge into the reasoning process.
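The break-down, retrieve, and rank steps can be illustrated with a toy example. Everything here is an assumption made for illustration: `CORPUS` is a three-page in-memory stand-in for the web, `decompose` splits on the word "and" where the real agent would use an LLM, and `score` is plain term overlap rather than the paper's ranking method. Iterative query refinement is not shown.

```python
from typing import Dict, List, Tuple

# Toy in-memory "web" (hypothetical pages and URLs).
CORPUS: Dict[str, str] = {
    "page/peep": "peep settings for ventilated patients",
    "page/fio2": "fio2 targets in oxygen therapy",
    "page/chess": "opening theory in chess",
}


def decompose(question: str) -> List[str]:
    """Split a question into sub-queries (placeholder for LLM query breakdown)."""
    parts = (p.strip(" ?") for p in question.split(" and "))
    return [p for p in parts if p]


def score(query: str, page: str) -> float:
    """Fraction of query terms that appear in the page text."""
    q_terms = set(query.lower().split())
    p_terms = set(page.lower().split())
    return len(q_terms & p_terms) / len(q_terms) if q_terms else 0.0


def search(question: str, top_k: int = 2) -> List[Tuple[str, float]]:
    """Rank pages by their best score across all sub-queries."""
    best: Dict[str, float] = {}
    for sub in decompose(question):
        for url, text in CORPUS.items():
            best[url] = max(best.get(url, 0.0), score(sub, text))
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]


print(search("peep settings and fio2 targets?"))
```

Taking the maximum over sub-queries means a page that fully answers one part of the question outranks a page that weakly matches all parts, which is one of several reasonable design choices.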

Coding Agent

The Coding agent handles coding tasks by generating code, executing it, and returning the output to the reasoning process. Delegating this work lets the main LLM stay focused on reasoning instead of interrupting its chain of thought to write and debug code, which improves efficiency and coherence.
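The execute-and-integrate half of this loop might look like the sketch below. It assumes the code has already been generated (a hard-coded string stands in for the LLM output), and `run_code` and `integrate` are hypothetical helper names. The subprocess call is not a security sandbox; a real deployment would need proper isolation.

```python
import subprocess
import sys


def run_code(source: str, timeout_s: float = 10.0) -> str:
    """Run Python source in a separate interpreter and return stdout or an error note."""
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    if proc.returncode != 0:
        return f"ERROR: {proc.stderr.strip()}"
    return proc.stdout.strip()


def integrate(reasoning: str, source: str) -> str:
    """Append the execution result to the reasoning text."""
    return f"{reasoning}\n[code result] {run_code(source)}"


# Stand-in for LLM-generated code (assumption).
generated = "print(sum(range(1, 11)))"
print(integrate("Sum the integers from 1 to 10.", generated))
```

Returning errors as text rather than raising lets the reasoning model see the failure and decide whether to request a corrected program.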

Experimental Evaluation

Solving Expert-Level Problems

Agentic Reasoning achieved new state-of-the-art results on the GPQA dataset by making effective use of its integrated reasoning tools.

The method also excelled on the GAIA benchmark, outperforming many proprietary models. A case study in medical decision-making illustrated its capability in automating complex analytical tasks like determining optimal $\mathrm{FiO_2}$ and PEEP for clinical decisions.

Figure 2: Case study on a complex medical decision-making problem.

Deep Research Tasks

In an extensive evaluation of article generation on the FreshWiki dataset, Agentic Reasoning outperformed established search-enhanced reasoning models, demonstrating the efficacy of structured tool integration in long-form content generation.

Analysis

Agentic Reasoning's success is largely attributed to its adaptive integration of web-search and memory tools, with the Mind-Map agent playing a crucial role in maintaining logical consistency over extended reasoning. The effectiveness was validated in strategic game assessments, such as Werewolf, showcasing enhanced deductive reasoning.

Figure 3: The ablation study examines the impact of different tools on reasoning. Green ones represent external toolboxes; red ones are combinations of our proposed tools. The blue line is the overall performance of the base reasoning model.

Conclusion

Agentic Reasoning provides a highly effective framework for enhancing the reasoning abilities of LLMs, especially in complex problem-solving scenarios. By successfully integrating structured memory and external tools, the approach not only achieves state-of-the-art results but also opens up new avenues for more nuanced task executions. Future work will explore additional task-specific tool integrations to further advance reasoning capabilities in complex, dynamic environments.