Test-time plastic KV-cache for continual learning
Develop methods for prefix-tuned transformers, such as ReasonCACHE, in which the key–value (KV) cache remains learnable at test time, enabling true continual learning without modifying the pretrained backbone parameters.
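To make the setup concrete, here is a minimal toy sketch of what "plastic at test time" could mean: a single attention head whose query projection stands in for the frozen backbone, and whose prefix keys and values take gradient steps during inference. This is not the ReasonCACHE method and not a solution to the open problem. The class name PlasticPrefixAttention, the squared-error loss, the availability of a target signal at test time, and the learning rate are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class PlasticPrefixAttention:
    """Hypothetical toy head that attends only over a learnable KV prefix."""

    def __init__(self, dim, prefix_len, seed=0):
        rng = random.Random(seed)
        self.scale = 1.0 / math.sqrt(dim)
        # Stand-in for the frozen backbone: stored as tuples, never updated.
        self.W_q = tuple(
            tuple(rng.gauss(0.0, 0.5) for _ in range(dim)) for _ in range(dim)
        )
        # Plastic state: prefix keys and values, the only things adapt() changes.
        self.K = [[rng.gauss(0.0, 0.5) for _ in range(dim)] for _ in range(prefix_len)]
        self.V = [[rng.gauss(0.0, 0.5) for _ in range(dim)] for _ in range(prefix_len)]

    def forward(self, x):
        q = [dot(row, x) for row in self.W_q]
        attn = softmax([self.scale * dot(k, q) for k in self.K])
        out = [sum(a * v[j] for a, v in zip(attn, self.V)) for j in range(len(x))]
        return q, attn, out

    def loss(self, x, target):
        _, _, out = self.forward(x)
        return 0.5 * sum((o - t) ** 2 for o, t in zip(out, target))

    def adapt(self, x, target, lr=0.05):
        """One test-time gradient step on the prefix K and V only."""
        q, attn, out = self.forward(x)
        g = [o - t for o, t in zip(out, target)]      # dL/d(out)
        d_attn = [dot(v, g) for v in self.V]          # dL/d(attn)
        mean = dot(attn, d_attn)
        d_scores = [a * (da - mean) for a, da in zip(attn, d_attn)]  # softmax backward
        for i in range(len(self.K)):
            for j in range(len(g)):
                self.V[i][j] -= lr * attn[i] * g[j]
                self.K[i][j] -= lr * d_scores[i] * self.scale * q[j]


# Usage: adapt the prefix on one (input, target) pair seen at test time.
layer = PlasticPrefixAttention(dim=4, prefix_len=3)
x = [1.0, 0.0, -1.0, 0.5]
target = [1.0, -1.0, 0.5, 0.0]
before = layer.loss(x, target)
for _ in range(200):
    layer.adapt(x, target)
after = layer.loss(x, target)
print(f"loss before: {before:.4f}, after: {after:.4f}")
```

The sketch shows only the separation between frozen and plastic state. It leaves out the hard parts of the problem: where a learning signal comes from at test time, and how to avoid forgetting earlier adaptations as new inputs arrive.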
References
Importantly, prefix tuning as studied here is not a test-time learner: the prefix is trained offline and frozen at deployment. Developing methods where the KV-cache remains plastic at test time, enabling true continual learning, remains an open problem.
— ReasonCACHE: Teaching LLMs To Reason Without Weight Updates
(2602.02366 - Gupta et al., 2 Feb 2026) in Section 5 (Related Work)