Source of Accuracy Gains from Parallel Tool Calling

Determine the primary factors behind the accuracy improvements observed when deep research agents, which perform multi-step web-based information seeking and reasoning, use parallel tool calling (issuing multiple tool invocations within a single reasoning step) rather than single tool calling.

Background

The paper introduces the Wide and Deep (W&D) research agent to study scaling along both depth (more iterative steps) and width (parallel tool calling). Empirically, the authors show that parallel tool calling improves not only efficiency (fewer turns, lower latency and cost) but also accuracy on deep research benchmarks such as BrowseComp, HLE, and GAIA.

While the efficiency benefits are straightforward, the authors explicitly note uncertainty about why accuracy improves under parallel tool calling. They subsequently provide qualitative case studies identifying patterns—broader source exploration, redundancy-based verification against unreliable tools, and query decomposition—but do not formalize or definitively establish the underlying mechanisms, leaving the core source of the accuracy gains as an unresolved question.
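To make one of these candidate patterns concrete, the sketch below works through the arithmetic behind redundancy against unreliable tools. It rests on assumptions that do not come from the paper: each tool call fails independently with the same probability `p_fail`, and a step succeeds if at least one of its `k` parallel calls returns a usable result. The function name and the numbers are hypothetical. The sketch shows only why redundancy could help in principle and does not establish that it explains the reported gains.

```python
def prob_at_least_one_success(p_fail: float, k: int) -> float:
    """Probability that at least one of k independent tool calls succeeds.

    Assumption (not from the paper): calls fail independently, each with
    probability p_fail, so all k fail together with probability p_fail ** k.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be in [0, 1]")
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1.0 - p_fail ** k


if __name__ == "__main__":
    # Hypothetical failure rate of 30% per call, chosen only for illustration.
    for k in (1, 2, 4):
        print(k, round(prob_at_least_one_success(0.3, k), 4))
```

Under these assumptions, a single call succeeds 70% of the time, two parallel calls give about 91%, and four give about 99%. Real tool failures are likely correlated (for example, several queries hitting the same unreliable source), so the actual benefit may be smaller, which is part of why the question remains open.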

References

While the efficiency improvement is easier to understand due to the decreased number of iterations and reduced reasoning, the source of the effectiveness (i.e., the gain in performance) remains unclear.

W&D: Scaling Parallel Tool Calling for Efficient Deep Research Agents (2602.07359 - Lin et al., 7 Feb 2026) in Section 4 (Why Parallel Tool Calling Improves Accuracy), opening paragraph