Principled Use of Offline Real-World Data with Simulated Rollouts
Characterize how effectively, and under what governing principles, offline real-world datasets can be combined with simulated rollouts from a learned world model when training vision–language–action robot policies. In particular, determine the mixing ratio between simulated imagination data and real-world experience that maximizes performance while preventing catastrophic forgetting.
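To make the notion of a mixing ratio concrete, the following minimal sketch draws each training batch with a fixed fraction of simulated samples. The function name `sample_mixed_batch`, the `sim_ratio` parameter, and the fixed per-batch fraction are assumptions chosen for illustration; the source does not specify how RISE combines the two data sources, and this is not presented as its method.

```python
import random


def sample_mixed_batch(real_data, sim_data, batch_size, sim_ratio, rng):
    """Draw one batch containing a fixed fraction of simulated samples.

    sim_ratio is a hypothetical hyperparameter in [0, 1]: the fraction of
    the batch taken from world-model rollouts. The remainder comes from
    the offline real-world dataset.
    """
    if not 0.0 <= sim_ratio <= 1.0:
        raise ValueError("sim_ratio must lie in [0, 1]")
    n_sim = round(batch_size * sim_ratio)
    n_real = batch_size - n_sim
    # Sample with replacement from each source, then shuffle so the
    # two sources are interleaved within the batch.
    batch = rng.choices(sim_data, k=n_sim) + rng.choices(real_data, k=n_real)
    rng.shuffle(batch)
    return batch


# Toy usage: tagged placeholders stand in for real transitions.
rng = random.Random(0)
real = [("real", i) for i in range(100)]
sim = [("sim", i) for i in range(100)]
batch = sample_mixed_batch(real, sim, batch_size=8, sim_ratio=0.25, rng=rng)
```

Under this framing, the open question is how to choose `sim_ratio` (or a schedule for it) in a principled way rather than by tuning. A value near 1 leans on imagined data and risks drifting away from real-world behavior, while a value near 0 forgoes most of the benefit of the world model.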
References
However, the optimal ratio between simulated rollouts and real-world experience requires further parameter tuning. Understanding the effectiveness and principles of these offline data represents an open problem.
— RISE: Self-Improving Robot Policy with Compositional World Model
(2602.11075 - Yang et al., 11 Feb 2026) in Section Limitations and future work — The Simulated–Real Data Balance