Develop practical learning dynamics for the forward–backward TD-JEPA variant under relaxed assumptions
Develop and analyze practical training dynamics, learnable off-policy, for the forward–backward-in-time latent-predictive variant of TD-JEPA, which relies on adjoint transition kernels. The goal is to make the theoretically sound objective optimizable under relaxed assumptions, a setting in which backward sampling is required.
References
"We leave the study of practical learning dynamics for this theoretically sound variant of TD-JEPA for future work."
— TD-JEPA: Latent-predictive Representations for Zero-Shot Reinforcement Learning
(2510.00739 - Bagatella et al., 1 Oct 2025) in Appendix: TD-JEPA with forward-backward-in-time sampling