Zero-shot adaptation performance of non-hierarchical baselines
Ascertain how standard model-free reinforcement learning baselines, such as Rainbow DQN and goal-conditioned DQN, perform zero-shot adaptation to previously mastered goals when the agent starts from a novel initial state, without hierarchical options or an abstract world model for planning. Rigorously characterize their capabilities and limitations under these conditions.
References
It is unclear, on the other hand, how other baselines would perform zero-shot adaptation to novel situations without hierarchical options to compose sub-options and an abstract world model to plan on.
— Joint Learning of Hierarchical Neural Options and Abstract World Model
(2602.02799 - Piriyakulkij et al., 2 Feb 2026) in Experimental Results, Zero-shot generalization to novel situations