Beyond Rule-Based Workflows: An Information-Flow-Orchestrated Multi-Agents Paradigm via Agent-to-Agent Communication from CORAL

Published 14 Jan 2026 in cs.AI | (2601.09883v1)

Abstract: Most existing LLM-based Multi-Agent Systems (MAS) rely on predefined workflows, where human engineers enumerate task states in advance and specify routing rules and contextual injections accordingly. Such workflow-driven designs are essentially rule-based decision trees, which suffer from two fundamental limitations: they require substantial manual effort to anticipate and encode possible task states, and they cannot exhaustively cover the state space of complex real-world tasks. To address these issues, we propose an Information-Flow-Orchestrated Multi-Agent Paradigm via Agent-to-Agent (A2A) Communication from CORAL, in which a dedicated information flow orchestrator continuously monitors task progress and dynamically coordinates other agents through the A2A toolkit using natural language, without relying on predefined workflows. We evaluate our approach on the general-purpose benchmark GAIA, using the representative workflow-based MAS OWL as the baseline while controlling for agent roles and underlying models. Under the pass@1 setting, our method achieves 63.64% accuracy, outperforming OWL's 55.15% by 8.49 percentage points with comparable token consumption. Further case-level analysis shows that our paradigm enables more flexible task monitoring and more robust handling of edge cases. Our implementation is publicly available at: https://github.com/Coral-Protocol/Beyond-Rule-Based-Workflows


Summary

The paper introduces an orchestrator agent that replaces static workflows with adaptive, natural-language communication for managing complex tasks.
It evaluates the approach on the GAIA benchmark, reporting an 8.49 percentage-point pass@1 accuracy gain over the workflow-based OWL baseline (63.64% vs. 55.15%) in the heterogeneous model configuration.
Case-level analysis shows more flexible task monitoring and more robust edge-case handling, arising from emergent coordination patterns and real-time semantic auditing.

Information-Flow-Orchestrated Multi-Agent Systems via Natural-Language Agent-to-Agent Communication

Motivation and Limitations of Rule-Based Workflows

Prevalent LLM-based Multi-Agent Systems (MAS) for complex, open-domain tasks typically rely on rule-based, workflow-driven architectures in which human engineers predefine discrete task states and hard-code the routing and context-injection logic; OWL and MetaGPT are representative examples. This paradigm is effective for tractable, well-scoped scenarios but suffers from two bottlenecks: (a) the combinatorial explosion of possible task states in realistic, dynamic environments makes exhaustive state enumeration infeasible, and (b) manually encoding the handling of diverse edge cases rapidly becomes unmanageable and remains incomplete.

The core structure of the canonical workflow-driven OWL system is a decision tree (Figure 1) in which each branch pre-specifies not only subtasking and agent routing but also success/failure criteria. When agents return ambiguous or incomplete outputs (for example, missing subfields for some entries in an extraction task), the decision logic relies on brittle, coarse-grained state labels and often fails to recognize partial fulfillment. The corrupted state then propagates downstream and degrades the final task outcome.
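The failure mode described above can be illustrated with a minimal sketch. It is not OWL's code: the `WorkerResult` type, the `REQUIRED_FIELDS` schema, and both functions are hypothetical names introduced only to show how branching on a coarse status label can miss a partially filled extraction.

```python
from dataclasses import dataclass


@dataclass
class WorkerResult:
    status: str        # coarse label the workflow branches on
    rows: list[dict]   # extracted entries


# Hypothetical schema for the extraction task (an assumption for this sketch).
REQUIRED_FIELDS = ("name", "year")


def rule_based_route(result: WorkerResult) -> str:
    """Branch only on the coarse status label, as a fixed workflow would."""
    if result.status == "success":
        return "next_subtask"
    return "replan"


def incomplete_rows(result: WorkerResult) -> list[int]:
    """Indices of rows missing any required field (the check the rule skips)."""
    return [
        i
        for i, row in enumerate(result.rows)
        if any(row.get(field) in (None, "") for field in REQUIRED_FIELDS)
    ]


# The worker reports success, but the second entry lacks its year.
result = WorkerResult(
    status="success",
    rows=[{"name": "A", "year": 2019}, {"name": "B", "year": None}],
)
print(rule_based_route(result))  # routing decision based on the label alone
print(incomplete_rows(result))   # rows the label-based rule never inspected
```

Here the rule advances to the next subtask even though one row is incomplete, which is the kind of partial fulfillment a coarse state label cannot express.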

Figure 1: Decision-making tree representation of the OWL architecture.

Information-Flow-Orchestrated Multi-Agent Coordination

The proposed paradigm eliminates the dependence on static, human-crafted workflows by introducing a dedicated information flow orchestrator agent, which continuously monitors task progress and conducts adaptive, natural-language-based coordination with all other agents through an Agent-to-Agent (A2A) communication toolkit rooted in CORAL. The orchestrator controls global task state by explicit message-passing, using primitives for (i) issuing queries and instructions, and (ii) asynchronously receiving agent responses. All inter-agent communication flows through the orchestrator, enforcing partially asymmetric topologies that maintain centralized interpretability while enabling decentralized execution.
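A minimal sketch of the two message-passing primitives, assuming an in-memory stand-in for the A2A toolkit. The `A2ABus` class, the `send_message` and `wait_for_reply` names, the stub `search_agent`, and the digit-based completeness check are all hypothetical; they are not the CORAL API, and in the paper the orchestrator's decisions come from an LLM rather than a hand-written check.

```python
from collections import deque
from typing import Callable


class A2ABus:
    """In-memory stand-in for an A2A toolkit (hypothetical, not the CORAL API)."""

    def __init__(self, agents: dict[str, Callable[[str], str]]):
        self.agents = agents
        self.inbox: deque[tuple[str, str]] = deque()
        self.history: list[tuple[str, str, str]] = []  # (sender, receiver, text)

    def send_message(self, to: str, text: str) -> None:
        """Primitive (i): issue a query or instruction to one agent."""
        self.history.append(("orchestrator", to, text))
        reply = self.agents[to](text)   # the agent works on the request
        self.inbox.append((to, reply))  # the reply is queued, not returned

    def wait_for_reply(self) -> tuple[str, str]:
        """Primitive (ii): receive the next queued agent response."""
        sender, text = self.inbox.popleft()
        self.history.append((sender, "orchestrator", text))
        return sender, text


def search_agent(request: str) -> str:
    # Stub worker: answers fully only when asked to include the year.
    return "Paris, 1889" if "year" in request else "Paris"


def orchestrate(bus: A2ABus, task: str, max_turns: int = 3) -> str:
    """Send, read the reply, and refine the instruction if it looks incomplete."""
    instruction, reply = task, ""
    for _ in range(max_turns):
        bus.send_message("search", instruction)
        _, reply = bus.wait_for_reply()
        if any(ch.isdigit() for ch in reply):      # toy completeness check
            return reply
        instruction = task + " Include the year."  # refine instead of resetting
    return reply


bus = A2ABus({"search": search_agent})
answer = orchestrate(bus, "Where was the Eiffel Tower built?")
print(answer)
print(len(bus.history), "messages passed through the orchestrator")
```

Because every message is appended to `history` on the orchestrator's side, the full exchange stays inspectable in one place, which mirrors the centralized interpretability the paragraph describes.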

Figure 2: Overview of the proposed Information-Flow-Orchestrated Multi-Agent Paradigm via Agent-to-Agent (A2A) Communication.

Message generation, for both orchestrator instructions and agent replies, conditions not only on static role prompts but also on the full communication history, the evolving task state, and the live task query. Rather than advancing along pre-specified routes, the orchestrator can dynamically (re-)allocate subtasks, refine instructions, request clarification, or escalate ambiguous results, tightening explicit success/failure criteria on the fly. This permits robust handling of partial results, semantic drift, tool invocation failures, and unanticipated contingencies, all expressed in natural language.

Benchmark Evaluation and Numerical Results

Experiments are conducted on the GAIA benchmark, encompassing 165 validation tasks spanning Levels 1-3 in generalist assistant domains integrating web search, multimodal reasoning, and tool use. The baseline is the open-source workflow-based OWL system, matched meticulously for both agent roles and LLM configurations to ensure experimental parity. The evaluation encompasses both homogeneous (all agents: Grok 4.1 Fast) and heterogeneous (main agents: Grok 4.1 Fast; worker agents: GPT 4.1 Mini) model deployments.

Key pass@1 accuracy results are as follows:

Homogeneous configuration: Both the information-flow-orchestrated paradigm and workflow-based OWL achieve 64.24% overall accuracy. The proposed system exhibits marginally increased token consumption on simple tasks, attributed to explicit message passing overhead.
Heterogeneous, fault-tolerant configuration: The paradigm achieves 63.64% accuracy, exceeding OWL by 8.49 percentage points (OWL: 55.15%), with comparable or superior efficiency for complex tasks (see Figure 3).
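The reported figures can be cross-checked with a short calculation. The accuracies and the 165-task total come from the text; the implied counts of solved tasks and the relative gain are derived here and are not numbers stated in the source.

```python
TOTAL_TASKS = 165  # GAIA validation tasks, as stated in the text

# pass@1 accuracies reported in the text, in percent.
reported = {
    "ours_heterogeneous": 63.64,
    "owl_heterogeneous": 55.15,
    "both_homogeneous": 64.24,
}


def implied_correct(accuracy_pct: float, total: int = TOTAL_TASKS) -> int:
    """Nearest whole number of solved tasks consistent with a reported accuracy."""
    return round(accuracy_pct / 100 * total)


# Absolute gap in percentage points versus relative gain over the baseline.
gap_pp = reported["ours_heterogeneous"] - reported["owl_heterogeneous"]
relative_gain = gap_pp / reported["owl_heterogeneous"] * 100

print(f"gap: {gap_pp:.2f} percentage points")
print(f"relative gain over OWL: {relative_gain:.1f}%")
for name, acc in reported.items():
    k = implied_correct(acc)
    print(f"{name}: ~{k}/{TOTAL_TASKS} = {k / TOTAL_TASKS * 100:.2f}%")
```

Under this reading, the percentages are consistent with roughly 105, 91, and 106 solved tasks out of 165, and the 8.49-point gap corresponds to a relative gain of about 15.4% over OWL, which is why the gap is best stated in percentage points rather than percent.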

Figure 3: Cumulative distribution functions (CDFs) of token consumption for OWL and the proposed Information-Flow-Orchestrated MAS under different model configurations.
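For readers unfamiliar with this kind of plot, the following sketch shows how an empirical CDF of per-task token consumption is computed. The token counts are invented for illustration and are not data from the paper; `empirical_cdf` is a hypothetical helper name.

```python
from bisect import bisect_right
from typing import Callable


def empirical_cdf(samples: list[float]) -> Callable[[float], float]:
    """Return F, where F(x) is the fraction of samples less than or equal to x."""
    ordered = sorted(samples)
    n = len(ordered)

    def cdf(x: float) -> float:
        return bisect_right(ordered, x) / n

    return cdf


# Hypothetical per-task token counts, NOT measurements from the paper.
owl_tokens = [12_000, 18_000, 25_000, 40_000, 90_000]
ours_tokens = [15_000, 20_000, 24_000, 35_000, 45_000]

owl_cdf = empirical_cdf(owl_tokens)
ours_cdf = empirical_cdf(ours_tokens)

# Fraction of tasks finished within each token budget.
for budget in (13_000, 50_000):
    print(budget, owl_cdf(budget), ours_cdf(budget))
```

A curve that sits higher at a given budget means more tasks finished within that many tokens, so comparing two curves at small and large budgets separates overhead on simple tasks from cost on complex ones.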

Crucially, as worker agents are weakened, OWL's heavy reliance on static workflows exacerbates the performance cliff, while the orchestrator-driven system preserves accuracy by dynamically correcting, re-routing, and re-defining subtasks in response to partial or erroneous outputs.

Analysis of Emergent Task Coordination and Edge Case Handling

Case-level analysis of execution logs uncovers several distinct, robust coordination patterns emergent from the information flow orchestration paradigm. These behaviors are not encoded in any static workflow but arise adaptively as the orchestrator interrogates, refines, or reallocates tasks based on real-time agent responses:

Direct Agent Dispatch: Bypasses decomposition for atomic tasks, reducing unnecessary overhead.
Planner-Mediated Decomposition: Engages the planner only when task structure merits explicit decomposition.
Instruction Refinement: Granularly adjusts prior instructions mid-execution rather than invoking replanning or subtask resets.
Agent Substitution: Fails over to alternate agents proactively without wholesale workflow resets.

Figure 4: Case-level analysis of emergent task coordination patterns from the information flow orchestrator.

For edge case management, three main orchestrator-initiated strategies are observed:

Dynamic Tightening of Success Criteria: Detects incompleteness and dynamically updates downstream requirements to enforce output correctness.
Real-Time Semantic Auditing: Proactively identifies and prunes semantically invalid partial outputs (e.g., date misclassifications) mid-execution.
Instruction Alignment Monitoring: Recognizes proxy/approximate answers and escalates for repeated clarification or external retrieval, preventing silent error propagation.

Figure 5: Case-level analysis of emergent edge-case handling from the information flow orchestrator.
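Two of the strategies above, semantic auditing and tightening of success criteria, can be sketched as plain functions. This is an illustration under stated assumptions: in the paper these behaviors emerge from an LLM orchestrator reasoning in natural language, whereas `audit_entries`, `tightened_instruction`, and the sample entries below are hypothetical and hand-written.

```python
from datetime import date


def audit_entries(
    entries: list[dict], start: date, end: date
) -> tuple[list[dict], list[dict]]:
    """Split entries into those dated inside [start, end] and those to prune."""
    kept, pruned = [], []
    for entry in entries:
        (kept if start <= entry["date"] <= end else pruned).append(entry)
    return kept, pruned


def tightened_instruction(base: str, missing: list[str]) -> str:
    """Append explicit requirements for fields that came back incomplete."""
    if not missing:
        return base
    return f"{base} Each entry must include: {', '.join(missing)}."


# Hypothetical partial output from a worker asked for items dated in 2020.
entries = [
    {"title": "a", "date": date(2020, 5, 1)},
    {"title": "b", "date": date(2021, 1, 3)},  # misclassified date
]
kept, pruned = audit_entries(entries, date(2020, 1, 1), date(2020, 12, 31))
print([e["title"] for e in kept], [e["title"] for e in pruned])
print(tightened_instruction("List the papers.", ["author", "date"]))
```

The point of the sketch is the control flow: invalid partial results are removed before they reach later steps, and the follow-up instruction states the missing requirements explicitly instead of restarting the task.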

These adaptive capabilities are inaccessible to systems governed by static, hard-coded workflows, especially as the complexity of agent interaction or uncertainty in agent output increases.

Theoretical and Practical Implications

Transferring the responsibility for workflow supervision and adaptation from human designers to an LLM-powered orchestrator agent is a step toward scalable, generalist MAS whose robustness does not depend on exhaustive manual state enumeration. In principle, the architecture presents several theoretical advantages:

The communication-driven design is compatible with arbitrary, open-ended agent topologies and variable agent granularity, enabling scaling beyond fixed decision trees.
By leveraging prompt engineering for orchestration, the system is amenable to self-supervised learning, meta-reasoning, and potential integration of reinforcement signals for further improvement.
The paradigm is inherently tool-agnostic, with all agents (including the orchestrator) abstracted as promptable LLM-tool hybrid agents.

Practically, this shift dramatically reduces the labor and fragility associated with edge case enumeration, manual routing, or rigid policy coding, accelerating deployment cycles and enabling real-world, large-scale MAS applications in unbounded environments. The system's success in preserving performance even as agent reliability degrades highlights its fault tolerance and adaptive efficiency.

Future Directions

Extending the proposed paradigm to domain-specific tasks with strong structural priors, or integrating hybrid architectures with partial domain knowledge encoding, are natural next steps. Future research could also focus on optimizing orchestrator prompts for meta-reasoning, introducing learning-driven orchestration policies, or analyzing emergent coordination behaviors as LLM capabilities and tool integration mature.

Conclusion

The information-flow-orchestrated multi-agent system via A2A communication can replace rule-based workflows for generalist LLM-based MAS. On the GAIA benchmark it matches the workflow-based OWL baseline in the homogeneous configuration (64.24% each) and exceeds it by 8.49 percentage points in the heterogeneous configuration with weaker worker agents (63.64% vs. 55.15%). The paradigm enables emergent, flexible task coordination and dynamic edge-case handling, operationalized through explicit, natural-language orchestration rather than brittle, human-coded state machines. This agent-driven supervision approach offers a scalable and interpretable alternative for next-generation, adaptable multi-agent systems.