Recovering the policy competence radius without policy evaluation
Determine whether the policy competence radius R can be recovered, or effectively approximated, in practice without evaluating an extracted policy. Here R is defined as the effective radius around a given state within which a learned goal-conditioned value function provides a clear learning signal for extracting a goal-conditioned policy.
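One direction such an approximation could take, sketched purely for illustration: use the disagreement of a value-function ensemble as a proxy for approximation error, and declare a radius "competent" while the mean value gap to sampled goals still dominates ensemble disagreement. Everything here is an assumption, not the paper's method: the synthetic `value_ensemble` (a stand-in whose error grows with goal distance), the signal-to-noise threshold, and the sampling scheme are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def value_ensemble(state, goal, n_members=8):
    """Hypothetical stand-in for a learned goal-conditioned value ensemble.

    Each member estimates V(s, g) = -dist(s, g) plus noise whose scale
    grows with distance, modeling value estimates degrading far from s.
    """
    dist = np.linalg.norm(goal - state)
    noise_scale = 0.1 * dist**2  # assumed error growth with goal distance
    return -dist + noise_scale * rng.standard_normal(n_members)

def estimate_competence_radius(state, radii, snr_threshold=2.0, n_goals=64):
    """Largest radius at which the ensemble-mean value gap to sampled goals
    still dominates ensemble disagreement (a crude signal-to-noise proxy)."""
    competent = 0.0
    for r in radii:
        snrs = []
        for _ in range(n_goals):
            direction = rng.standard_normal(state.shape)
            direction /= np.linalg.norm(direction)
            goal = state + r * direction
            vals = value_ensemble(state, goal)
            signal = abs(vals.mean())   # value gap to the goal
            noise = vals.std() + 1e-8   # ensemble disagreement
            snrs.append(signal / noise)
        if np.median(snrs) >= snr_threshold:
            competent = r
        else:
            break
    return competent

state = np.zeros(2)
R_hat = estimate_competence_radius(state, radii=np.linspace(0.5, 10.0, 20))
print(f"estimated competence radius: {R_hat:.2f}")
```

In this synthetic setup the signal-to-noise ratio scales roughly as 10/r, so the estimate settles near r ≈ 5. Whether ensemble disagreement tracks the true value approximation error well enough for this to work on learned value functions is exactly the open question the note poses.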
References
This quantity is not known a priori because we do not have access to the true value approximation error, and it is not clear if it can be recovered or effectively approximated in practice without evaluating an extracted policy.
— Hierarchical Entity-centric Reinforcement Learning with Factored Subgoal Diffusion
(2602.02722 - Haramati et al., 2 Feb 2026) in Appendix, Section "The Benefits of Decoupling Training Goal Distributions Across the Hierarchy"