Improving Action Controllability in Generative World Models

Determine principled techniques for increasing action controllability in high-capacity generative world models for real-world robotic control, so that predicted future outcomes reliably and accurately follow diverse conditioning action sequences across manipulation tasks.

Background

The paper discusses the need for world models that can serve as interactive environments for reinforcement learning in robotics. While recent generative models have substantially improved visual realism, a key requirement remains: predicted futures must adhere to the conditioning actions, which is crucial for policy improvement and for reliable imagined rollouts.

The authors highlight that, despite advances, achieving strong action controllability in such models is still unresolved. Their proposed framework (RISE) addresses aspects of control through task-centric training and a compositional design, but the broader challenge of robust, general action controllability across varied tasks is explicitly identified as open.
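The adherence requirement above can be made concrete with a toy diagnostic: roll out a world model twice from the same initial state under two different action sequences and measure how far the predicted futures diverge. A model that ignores its actions yields near-identical rollouts. The linear dynamics, function names, and the `action_sensitivity` metric below are illustrative assumptions for this sketch, not part of RISE:

```python
import numpy as np

def rollout(s0, actions, A, B):
    """Toy linear latent dynamics s_{t+1} = A s_t + B a_t, standing in
    for a learned world model's action-conditioned transition."""
    s = s0
    traj = [s]
    for a in actions:
        s = A @ s + B @ a
        traj.append(s)
    return np.stack(traj)

def action_sensitivity(s0, acts1, acts2, A, B):
    """Mean per-step state divergence between rollouts under two action
    sequences; values near zero suggest the model is ignoring actions,
    i.e. poor action controllability."""
    t1 = rollout(s0, np.asarray(acts1), A, B)
    t2 = rollout(s0, np.asarray(acts2), A, B)
    return float(np.mean(np.linalg.norm(t1 - t2, axis=-1)))
```

For example, with an action matrix `B = 0` the two rollouts coincide and the sensitivity is exactly zero, mimicking a visually plausible but uncontrollable model; a nonzero `B` makes the rollouts diverge under different action sequences.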

References

Despite the improved visual realism by integrating high-capacity generative models, how to improve controllability over various actions remains an open problem.

RISE: Self-Improving Robot Policy with Compositional World Model  (2602.11075 - Yang et al., 11 Feb 2026) in Section 1 Introduction