Optimal Objective for Representation Learning in Image-Based World Models
Determine the optimal objective function for learning latent state representations in world models for image-based, model-based reinforcement learning, such as those built on the Recurrent State-Space Model (RSSM), so that the learned representations emphasize task-essential information and do not overfit to irrelevant visual details.
References
While architectures like the Recurrent State-Space Model (RSSM) have achieved remarkable success (Hafner et al., 2025), a fundamental question remains open: What is the optimal objective function for learning the representation itself?
— R2-Dreamer: Redundancy-Reduced World Models without Decoders or Augmentation
(2603.18202 - Morihira et al., 18 Mar 2026) in Section 1 (Introduction)