Effect of sparse goal rewards on one-step FB in OGBench
Investigate whether the sparse goal-conditioned indicator rewards used in state-based OGBench domains induce a single backward representation, and whether this makes these domains more challenging for the one-step forward–backward (FB) method than for baselines that learn goal-conditioned distance functions.
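One possible reading of "induces a single backward representation" can be illustrated with a small sketch. It assumes the task-inference rule of the standard forward–backward framework, z = E_s[r(s) B(s)]; whether one-step FB in the cited paper uses exactly this rule is an assumption here. The finite state set, the random stand-ins for B(s), and all names (infer_task_latent, backward_reps, GOAL_INDEX) are hypothetical and chosen only for illustration.

```python
import random


def infer_task_latent(rewards, backward_reps):
    """Sample-mean estimate of z = E_s[r(s) * B(s)] over a finite set of states.

    Assumed rule from the standard FB framework, not taken from the cited paper.
    """
    num_states = len(rewards)
    latent_dim = len(backward_reps[0])
    return [
        sum(rewards[s] * backward_reps[s][d] for s in range(num_states)) / num_states
        for d in range(latent_dim)
    ]


random.seed(0)
NUM_STATES, LATENT_DIM, GOAL_INDEX = 50, 8, 7  # hypothetical sizes

# Hypothetical backward representations B(s): random stand-ins for a learned network.
backward_reps = [
    [random.gauss(0.0, 1.0) for _ in range(LATENT_DIM)] for _ in range(NUM_STATES)
]

# Sparse goal-conditioned indicator reward: 1 at the goal state, 0 everywhere else.
sparse_reward = [1.0 if s == GOAL_INDEX else 0.0 for s in range(NUM_STATES)]
z_sparse = infer_task_latent(sparse_reward, backward_reps)

# Under this rule, only the goal state contributes to the average, so the task
# latent is a scaled copy of the goal's backward representation B(g).
scaled_goal_rep = [b / NUM_STATES for b in backward_reps[GOAL_INDEX]]
collapsed = all(abs(a - b) < 1e-12 for a, b in zip(z_sparse, scaled_goal_rep))
print(collapsed)
```

Under these assumptions, every goal-reaching task is encoded by the backward representation of one state, whereas a dense reward would average B(s) over many states. This is a toy illustration of the conjectured mechanism, not evidence for or against it.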
References
We conjecture that the state-based OGBench domains are challenging for one-step FB because the sparse reward function (goal-conditioned indicator rewards) induces a single backward representation.
— Can We Really Learn One Representation to Optimize All Rewards?
(2602.11399 - Zheng et al., 11 Feb 2026) in Section 5.3 (Comparing One-Step FB to Prior Unsupervised RL Methods)