
RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Interactive Environmental Learning in Physical Embodied Systems

Published 2 Aug 2025 in cs.RO and cs.AI | (2508.01415v5)

Abstract: Embodied agents face persistent challenges in real-world environments, including partial observability, limited spatial reasoning, and high-latency multi-memory integration. We present RoboMemory, a brain-inspired framework that unifies Spatial, Temporal, Episodic, and Semantic memory under a parallelized architecture for efficient long-horizon planning and interactive environmental learning. A dynamic spatial knowledge graph (KG) ensures scalable and consistent memory updates, while a closed-loop planner with a critic module supports adaptive decision-making in dynamic settings. Experiments on EmbodiedBench show that RoboMemory, built on Qwen2.5-VL-72B-Ins, improves average success rates by 25% over its baseline and exceeds the closed-source state-of-the-art (SOTA) Gemini-1.5-Pro by 3%. Real-world trials further confirm its capacity for cumulative learning, with performance improving across repeated tasks. These results highlight RoboMemory as a scalable foundation for memory-augmented embodied intelligence, bridging the gap between cognitive neuroscience and robotic autonomy.

Summary

  • The paper introduces a brain-inspired framework that integrates spatial, temporal, episodic, and semantic memories to support long-term planning in robots.
  • It employs a Planner-Critic closed-loop planning module that dynamically adjusts multi-step action sequences, mitigating outdated plans and infinite loops.
  • Experimental evaluations on EB-ALFRED and EB-Habitat benchmarks demonstrate significant improvements in task success rates and real-world performance.

RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Interactive Environmental Learning in Physical Embodied Systems

Introduction

The paper "RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Interactive Environmental Learning in Physical Embodied Systems" presents a framework designed to enhance the learning and interaction capabilities of embodied systems, specifically robots. Drawing inspiration from cognitive neuroscience, it implements a multi-memory system that facilitates long-term planning and learning. The architecture integrates four core modules: an Information Preprocessor, a Lifelong Embodied Memory System, a Closed-Loop Planning Module, and a Low-Level Executor. Together, these modules address challenges such as continuous learning, memory latency, task correlation, and infinite loops in closed-loop planning. Evaluated on EmbodiedBench, RoboMemory improves success rates significantly over both open-source and closed-source baselines.

Figure 1: RoboMemory architecture with working pipeline and memory mechanisms.

RoboMemory Architecture

The RoboMemory framework is structured around a unified memory paradigm which efficiently updates and retrieves information across its four submodules: Spatial, Temporal, Episodic, and Semantic. This architecture mitigates latency issues common in complex memory frameworks by enabling parallel updates.
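The parallel-update idea can be illustrated with a minimal sketch. The class and method names below are hypothetical stand-ins, not the paper's actual implementation: one observation is dispatched to all four memory submodules concurrently, so total update latency is bounded by the slowest submodule rather than the sum of all four.

```python
from concurrent.futures import ThreadPoolExecutor

class MemoryModule:
    """Minimal stand-in for one of the four memory submodules."""
    def __init__(self, name):
        self.name = name
        self.entries = []

    def update(self, observation):
        # Each submodule derives its own representation of the observation.
        self.entries.append(f"{self.name}:{observation}")
        return self.name

class UnifiedMemory:
    """Dispatches one observation to all submodules in parallel."""
    def __init__(self):
        self.modules = [MemoryModule(n) for n in
                        ("spatial", "temporal", "episodic", "semantic")]

    def parallel_update(self, observation):
        with ThreadPoolExecutor(max_workers=len(self.modules)) as pool:
            futures = [pool.submit(m.update, observation) for m in self.modules]
            return [f.result() for f in futures]

memory = UnifiedMemory()
updated = memory.parallel_update("cup on table")
print(sorted(updated))  # all four submodules updated in one cycle
```

In a real system each `update` would involve model calls or graph operations of very different cost, which is exactly when concurrent dispatch pays off.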

Information Preprocessor

The Information Preprocessor uses dual modules, the Step Summarizer and the Query Generator, to transform visual observations into textual data that interfaces with RoboMemory's retrieval system. It allows fast indexing and querying, providing an agile mechanism for memory operations during each action cycle.
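A toy sketch of this two-stage preprocessing, with hypothetical function names and a trivial keyword heuristic standing in for the paper's learned components: the Step Summarizer condenses a raw observation into a textual step record, and the Query Generator derives retrieval keys from it.

```python
def step_summarizer(observation: dict) -> str:
    """Condense a raw observation (e.g. a VLM caption plus the action
    outcome) into a single textual step record."""
    return f"step {observation['step']}: {observation['action']} -> {observation['result']}"

def query_generator(summary: str) -> list:
    """Derive retrieval queries from a step summary.
    A simple keyword filter stands in for the real query model."""
    tokens = summary.replace(":", " ").replace("->", " ").split()
    return [w for w in tokens if w.isalpha() and len(w) > 3]

obs = {"step": 3, "action": "pick up mug", "result": "holding mug"}
summary = step_summarizer(obs)
queries = query_generator(summary)
print(summary)
print(queries)
```

The point of the interface is that every memory submodule receives the same textual summary and query set, keeping indexing and retrieval uniform across the action cycle.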

Lifelong Embodied Memory System

At the core of the RoboMemory architecture is the Lifelong Embodied Memory System. This system integrates:

  • Spatial Memory: Utilizes a dynamic Knowledge Graph (KG) to record spatial relationships. A retrieval-based incremental update algorithm keeps maintenance efficient by integrating new observations into the relevant local subgraph rather than reprocessing the whole graph.
  • Temporal Memory: Functions as a FIFO buffer that summarizes short-term interactions and feeds their consolidation into long-term memory.
  • Episodic & Semantic Memory: These modules capture task-level interactions and action-level insights, respectively, supporting long-term decision-making and task reasoning.

    Figure 2: Visualization of Spatial Memory's dynamic update process.
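The retrieval-based incremental update can be sketched as follows. This is a toy triple store with hypothetical names, not the paper's KG implementation: an update first retrieves only the triples that mention the affected entity (the local subgraph), reconciles the new relation against them, and leaves the rest of the graph untouched.

```python
class SpatialKG:
    """Toy dynamic spatial knowledge graph storing (subject, relation, object)
    triples. Updates are local: only triples mentioning the affected entity
    are retrieved and reconciled."""
    def __init__(self):
        self.triples = set()

    def retrieve_local(self, entity):
        # The "local subgraph": every triple that mentions this entity.
        return {t for t in self.triples if entity in (t[0], t[2])}

    def incremental_update(self, subject, relation, obj):
        # Toy assumption: a spatial relation like "on" is exclusive for a
        # given subject (an object rests on one support at a time), so the
        # old triple is replaced rather than accumulated.
        stale = {t for t in self.retrieve_local(subject)
                 if t[0] == subject and t[1] == relation}
        self.triples -= stale
        self.triples.add((subject, relation, obj))

kg = SpatialKG()
kg.incremental_update("mug", "on", "table")
kg.incremental_update("mug", "on", "shelf")   # the robot moved the mug
print(kg.retrieve_local("mug"))  # {('mug', 'on', 'shelf')}
```

Because each update touches only the retrieved local subgraph, the cost of keeping the KG consistent stays roughly constant as the explored environment grows.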

Closed-Loop Planning

RoboMemory leverages a Planner-Critic mechanism for dynamic environment interaction. The planner generates multi-step action sequences, which the critic evaluates against the latest environmental observations and sends back for revision when they no longer fit the current state. This iterative evaluation prevents the execution of outdated plans and avoids infinite loops.

Experimental Evaluation

RoboMemory was evaluated on the EB-ALFRED and EB-Habitat benchmarks, achieving higher task success rates than both VLM-based baselines and prior agent frameworks. Ablation studies validated the contribution of each module, showing that long-term memory and the critic module are particularly important for task adaptability.

Figure 3: Comparison of Success Rates (SR) and Goal Condition Success Rates (GC) across difficulty levels between RoboMemory and baseline methods on EB-Habitat.

Real-world Deployment

In real-world environments, RoboMemory demonstrated lifelong learning, with performance improving significantly across sequentially executed tasks. The setup mirrored benchmark conditions, confirming the framework's scalability and robustness in practical scenarios.

Figure 4: Visualization of the experimental environment.

Conclusion

RoboMemory establishes a scalable foundation for embodied systems with its brain-inspired architectural design, facilitating lifelong learning and efficient long-term planning. While the framework excels in dynamic environments, future work could refine reasoning, integrate more robust action-execution strategies, and improve VLA-agent interaction to maximize adaptability and performance beyond simulated setups.

Figure 5: Example of a successfully completed task.
