Scaling abstract world-model representations across domains and modalities
Develop scalable methods for learning compact, abstract world-model representations that discard task-irrelevant detail and generalize across arbitrary domains and modalities, instead of relying on approaches that preserve complete observation information through reconstructable latent representations or raw data.
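The contrast between the two representation families can be made concrete with a toy sketch (not from the paper; all names, the linear encoders, and the latent self-prediction objective are illustrative assumptions). Observations contain one predictable, task-relevant dimension and one unpredictable noise dimension: an encoder that keeps only the relevant dimension predicts its own future latents well, while a fully reconstructable encoder is forced to carry the noise and cannot.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Toy observations: first dim is a task-relevant state, second is irrelevant noise.
state = rng.normal(size=(n, 1))
obs = np.concatenate([state, rng.normal(size=(n, 1))], axis=1)

# Dynamics: the relevant state advances by a fixed shift; the noise is resampled.
next_state = state + 0.1
next_obs = np.concatenate([next_state, rng.normal(size=(n, 1))], axis=1)

# Two candidate linear encoders (columns = latent dims).
abstract_enc = np.array([[1.0], [0.0]])  # keeps only the relevant dim
full_enc = np.eye(2)                     # keeps everything (reconstructable)

def latent_prediction_loss(W):
    """Mean squared error of predicting the next latent from the current one,
    using the known latent dynamics (add the 0.1 shift)."""
    z, z_next = obs @ W, next_obs @ W
    return float(np.mean((z + 0.1 - z_next) ** 2))

# The abstract code is perfectly self-predictive; the full code is not,
# because the retained noise dimension is unpredictable.
print(latent_prediction_loss(abstract_enc))  # ~0.0
print(latent_prediction_loss(full_enc))      # dominated by the noise dim
```

The point of the sketch is only that a reconstruction-free objective (here, latent self-prediction) can score the abstract encoder as strictly better, whereas any objective that demands full observation recovery would penalize it for dropping the noise.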
References
Although psychology and cognitive science suggest that human mental models rely on compact representations that discard irrelevant details, it remains unclear how to scale approaches that learn such abstract representations to arbitrary domains and modalities.
— Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models
(2601.19834 - Wu et al., 27 Jan 2026) in Section 2: Related Work — World models