Maximizing Representation-Based Transfer in RL Fine-Tuning

Develop methods that maximize the transfer obtained specifically from reusing pretrained feature representations when fine-tuning reinforcement learning agents, including settings where the policy heads are re-initialized, so that representation reuse alone yields substantial learning speedups and performance gains.

Background

To probe whether transfer comes from the representation or from the policy itself, the authors reset the last layer of the policy and critic networks before fine-tuning on Meta-World RoboticSequence, preserving only the pretrained feature representations. Although performance worsens relative to full fine-tuning, notable positive transfer remains, suggesting that representation reuse contributes meaningfully.
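The reset procedure described above can be sketched as follows. This is a minimal, hypothetical PyTorch illustration, not the paper's code: the `encoder`/`head` split, layer sizes, and helper names are assumptions chosen to show the idea of keeping the pretrained representation while re-initializing only the final (policy-head) layer before fine-tuning.

```python
# Hypothetical sketch: preserve a pretrained feature encoder while
# re-initializing the policy head before fine-tuning. The architecture
# and names below are illustrative assumptions, not the authors' code.
import copy
import torch
import torch.nn as nn

def build_actor(obs_dim=12, act_dim=4, hidden=64):
    # "encoder" plays the role of the pretrained representation;
    # "head" is the last layer that would be reset before fine-tuning.
    encoder = nn.Sequential(
        nn.Linear(obs_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
    )
    head = nn.Linear(hidden, act_dim)
    return encoder, head

torch.manual_seed(0)
encoder, head = build_actor()  # pretend these weights come from pretraining

# Snapshot the pretrained encoder and the old head for comparison.
pretrained_encoder_state = copy.deepcopy(encoder.state_dict())
old_head_weight = head.weight.detach().clone()

# Reset only the policy head; the pretrained representation is untouched.
head.reset_parameters()

# Fine-tuning would then optimize both modules from this state, with any
# transfer attributable to the reused encoder features alone.
```

The same pattern applies to the critic: snapshot, reset the final layer, then fine-tune. In practice one would wrap this in whatever actor-critic implementation (e.g. SAC) the experiments use.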

This experiment highlights a gap: current approaches do not yet fully exploit representation reuse when policy parameters are reinitialized, motivating techniques expressly aimed at maximizing representation-based transfer.

References

Maximizing transfer from the representation remains an interesting open question.

Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem (Wołczyk et al., 2024, arXiv:2402.02868), Appendix, Section "Analysis of forgetting in robotic manipulation tasks", subsection "Impact of representation vs policy on transfer".