- The paper introduces a heterogeneous multi-agent tidying-up task that evaluates diverse collaboration strategies for detecting and relocating misplaced objects.
- It presents a novel handshake-based group communication mechanism integrated with a hierarchical decision-making framework to optimize agent roles.
- Experimental results highlight superior performance in success rates, navigation efficiency, and task decomposition, informing future multi-agent research.
Heterogeneous Embodied Multi-Agent Collaboration
The paper "Heterogeneous Embodied Multi-Agent Collaboration" investigates how heterogeneous agents collaborate in complex indoor environments to accomplish tidying-up tasks. By giving agents different capabilities and roles, the work examines how efficient teamwork can resolve object misplacement.
The paper introduces the heterogeneous multi-agent tidying-up task, in which diverse agents collaborate to detect and relocate misplaced objects (Figure 1). Two scenario settings differentiate agents by their navigation and manipulation capabilities, so the effect of agent diversity on task execution can be studied. Specifically, the agents in Setting I and Setting II differ in visual perception ability, action capacity, and morphology, which makes their capabilities complementary for task completion.
Figure 1: The overview of the heterogeneous multi-agent tidying-up task. Agent 1 and Agent 2 can only navigate, while Agent 3 is capable of both navigation and manipulation.
Dataset Construction
The paper constructs a benchmark dataset based on ProcTHOR-10K, generating tasks by modifying object placements to unreasonable locations (Figure 2). This dataset, divided into Single-Room and Cross-Room tasks, offers a robust evaluation platform for testing agent collaboration efficacy across distinct task complexities and environment layouts.
Figure 2: The process of generating the tidying-up task from the original scene.
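The generation step described above can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the scene is reduced to an object-to-receptacle mapping, and the `REASONABLE` table, the `make_tidying_task` function, and all object and receptacle names are hypothetical stand-ins for the ProcTHOR-10K scene data.

```python
import random

# Hypothetical commonsense table: object type -> receptacles where it plausibly belongs.
REASONABLE = {
    "Mug": {"CounterTop", "DiningTable", "Shelf"},
    "Pillow": {"Bed", "Sofa"},
    "Book": {"Shelf", "Desk"},
}


def make_tidying_task(scene, receptacles, num_misplaced, rng):
    """Move `num_misplaced` objects to receptacles that are not reasonable for them.

    Returns (modified_scene, misplaced), where `misplaced` maps each moved
    object to its original receptacle (the ground truth for evaluation).
    """
    candidates = sorted(obj for obj in scene if obj in REASONABLE)
    chosen = rng.sample(candidates, num_misplaced)
    modified = dict(scene)
    misplaced = {}
    for obj in chosen:
        # Only receptacles outside the object's reasonable set are eligible.
        unreasonable = sorted(r for r in receptacles if r not in REASONABLE[obj])
        modified[obj] = rng.choice(unreasonable)
        misplaced[obj] = scene[obj]
    return modified, misplaced


scene = {"Mug": "CounterTop", "Pillow": "Bed", "Book": "Shelf"}
receptacles = ["CounterTop", "DiningTable", "Shelf", "Bed", "Sofa", "Desk", "Bathtub"]
modified, misplaced = make_tidying_task(scene, receptacles, 2, random.Random(0))
```

Keeping the original receptacle of each moved object gives a ground-truth answer against which an agent's predicted receptacle can later be checked.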
Methodology
The proposed model integrates four main modules: the misplaced object detector, reasonable receptacle predictor, communication module, and hierarchical decision-making framework. Figure 3 illustrates the architectural design, emphasizing how agents utilize visual cues and commonsense reasoning to effectively communicate and execute sub-tasks. Heterogeneous agents leverage handshake-based communication strategies to coordinate actions and refine task allocation.
Figure 3: An overview of the structure of the proposed heterogeneous multi-agent collaboration model.
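To make the idea of capability-aware task allocation concrete, here is a small sketch using the agent configuration from Figure 1. It is an assumption-laden illustration: the paper's decision-making framework is not reproduced here, and the greedy rule, the `REQUIRED` table, and the sub-task names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    capabilities: frozenset


# Hypothetical sub-task kinds and the capability each one requires.
REQUIRED = {"explore": "navigate", "pick_and_place": "manipulate"}


def allocate(agents, subtasks):
    """Greedily assign each sub-task to the least-loaded agent that can perform it."""
    load = {agent.name: 0 for agent in agents}
    plan = {}
    for task_id, kind in subtasks:
        capable = [a for a in agents if REQUIRED[kind] in a.capabilities]
        if not capable:
            raise ValueError(f"no agent can perform sub-task {task_id!r}")
        best = min(capable, key=lambda a: load[a.name])  # ties go to the earliest agent
        plan[task_id] = best.name
        load[best.name] += 1
    return plan


# Agent configuration taken from Figure 1: only Agent 3 can manipulate.
agents = [
    Agent("agent1", frozenset({"navigate"})),
    Agent("agent2", frozenset({"navigate"})),
    Agent("agent3", frozenset({"navigate", "manipulate"})),
]
subtasks = [
    ("search_kitchen", "explore"),
    ("search_bedroom", "explore"),
    ("move_mug", "pick_and_place"),
]
plan = allocate(agents, subtasks)
# plan == {"search_kitchen": "agent1", "search_bedroom": "agent2", "move_mug": "agent3"}
```

The navigation-only agents absorb the exploration work, leaving the single manipulating agent free for the relocation itself, which is the kind of complementary role assignment the task is designed to reward.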
Communication Strategy
The paper introduces the novel Handshake-based Group Communication mechanism, facilitating efficient intra-group and inter-group exchanges of informational vectors among agents (Figure 4). The strategy minimizes communication bandwidth while maximizing collaborative task performance, establishing each agent’s role through adaptive thresholds of intra-group attention.
Figure 4: An example of the communication process.
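One way to picture attention-thresholded handshakes is the sketch below. It is not the paper's mechanism: the dot-product attention, the fixed (rather than adaptive) threshold, the mutual-agreement rule, and the function names `attention` and `handshakes` are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax; returns an empty list for empty input."""
    if not xs:
        return []
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(vectors):
    """Row i holds agent i's attention weights over the other agents (self is 0)."""
    n = len(vectors)
    weights = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        scores = [sum(a * b for a, b in zip(vectors[i], vectors[j])) for j in others]
        row = [0.0] * n
        for j, w in zip(others, softmax(scores)):
            row[j] = w
        weights.append(row)
    return weights


def handshakes(vectors, threshold):
    """Return pairs (i, j) whose attention exceeds the threshold in both directions."""
    w = attention(vectors)
    n = len(vectors)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if w[i][j] >= threshold and w[j][i] >= threshold
    ]


# Agents 0 and 1 hold similar information vectors; agent 2 differs.
vectors = [[2.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
pairs = handshakes(vectors, threshold=0.6)
# pairs == [(0, 1)]
```

Requiring agreement in both directions means a channel opens only between agents that find each other's information relevant, which is one simple way to keep the number of exchanged messages low.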
Experimental Results
The paper reports detailed quantitative analysis comparing various communication and task execution strategies. Tables 1 and 2 summarize the results, demonstrating the superior performance of the proposed model compared to baseline methods across multiple evaluation metrics including success rate, path length, and communication efficiency score.
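As a rough guide to how two of these metrics are typically aggregated, the sketch below computes success rate and mean path length over a set of episodes. The `Episode` record and the sample values are hypothetical and are not results from the paper; the communication efficiency score is omitted because its definition is not given here.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    success: bool       # whether every misplaced object was relocated correctly
    path_length: float  # total distance travelled, in arbitrary (assumed) units


def summarize(episodes):
    """Return the success rate and the mean path length over all episodes."""
    if not episodes:
        raise ValueError("at least one episode is required")
    n = len(episodes)
    return {
        "success_rate": sum(e.success for e in episodes) / n,
        "mean_path_length": sum(e.path_length for e in episodes) / n,
    }


# Made-up episodes, for illustration only.
episodes = [
    Episode(True, 10.0),
    Episode(False, 20.0),
    Episode(True, 30.0),
    Episode(True, 20.0),
]
summary = summarize(episodes)
# summary == {"success_rate": 0.75, "mean_path_length": 20.0}
```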
Qualitative Analysis
Qualitative assessments highlight successful samples from heterogeneous multi-agent setups (Figure 5). These scenarios exhibit effective task decomposition and agent-specific operations that lead to successful object relocations. Conversely, failed cases point to limitations such as incorrect location predictions and detection failures, indicating areas for improvement.

Figure 5: The successful samples in the same house in Setting I and Setting II.
Conclusion
The research proposes a framework for heterogeneous multi-agent collaboration, validated on the heterogeneous tidying-up task and its benchmark dataset. Future work can explore applying the communication strategy in more complex environments with larger groups of agents. The study offers insight into efficient teamwork among robotic systems with differing capabilities.