Conjecture on where procedural knowledge resides for ReasonCache versus SFT
Ascertain whether ReasonCache stores procedural knowledge directly in its learned key–value (KV) cache, so that the knowledge need not be externalized into generated token sequences, whereas supervised fine-tuning (SFT) encodes the same knowledge in model weights and must externalize it during generation, producing unnecessary verbosity.
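To make the mechanism in the conjecture concrete, here is a minimal sketch of single-head attention in which learned key–value entries are prepended to the keys and values computed from the tokens. It assumes the learned cache behaves like prefix-style key–value vectors that every query can attend to; that is an illustrative assumption, not a description of the ReasonCACHE implementation. The function names and toy vectors are hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    # Scaled dot-product attention for one query vector.
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    out = [sum(w * val[j] for w, val in zip(weights, values))
           for j in range(len(values[0]))]
    return out, weights

def attend_with_prefix(query, token_keys, token_values,
                       prefix_keys, prefix_values):
    # Assumption: learned (hypothetical) prefix entries are simply
    # concatenated in front of the token-derived keys and values.
    keys = prefix_keys + token_keys
    values = prefix_values + token_values
    return attend(query, keys, values)

# Toy example: one learned prefix entry and one token entry.
query = [1.0, 0.0]
prefix_keys, prefix_values = [[4.0, 0.0]], [[0.0, 1.0]]
token_keys, token_values = [[0.0, 1.0]], [[1.0, 0.0]]

out, weights = attend_with_prefix(
    query, token_keys, token_values, prefix_keys, prefix_values
)
# weights[0] is the attention mass on the prefix entry,
# weights[1] the mass on the token entry.
```

In this toy setup the query aligns with the prefix key, so most of the attention mass falls on the prefix value and no token had to be generated to supply that information. Under the conjecture, this is the sense in which knowledge held in the cache would not need to be written out as text.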
References
Our conjecture is that in contrast, ReasonCache stores procedural knowledge directly in the KV cache, eliminating the need for explicit externalization.
— ReasonCACHE: Teaching LLMs To Reason Without Weight Updates
(2602.02366 - Gupta et al., 2 Feb 2026) in Figure caption for Fig. ‘verbosity’, Section 3.3 (Inference Efficiency)