Extend analysis of flow-matching critics beyond linear models

Extend the theoretical analysis of flow-matching critics used for temporal-difference learning from the linear setting to nonlinear function approximators.

Background

The paper develops a theoretical account of why flow-matching critics improve temporal-difference learning, introducing test-time recovery and plastic feature learning, and provides supporting analysis in a linear setting. While empirical results cover complex environments, the formal analysis is restricted to linear models.

The authors explicitly note that the analysis has not yet been extended from linear models to nonlinear function approximation, and they highlight this gap as an important open direction for future theory.

References

From a theoretical perspective, extending our analysis beyond the linear setting to nonlinear function approximation remains an important open problem.

What Does Flow Matching Bring To TD Learning? (2603.04333 - Agrawalla et al., 4 Mar 2026) in Section 6, Discussion and Perspectives on Future Work