Reliability of internal-state-based measures of factual encoding
Determine how reliably internal-state-based methods for measuring factual encoding in large language models capture whether a fact is actually stored in the model’s parameters, rather than merely correlating with the model’s ability to reproduce that fact in training-like contexts.
References
Existing approaches to measuring encoding often rely on access to internal states, a requirement that does not align with our focus on evaluating frontier LLMs. Moreover, the extent to which these methods reliably capture whether a model truly encodes a fact remains an open question \citep{HaseBKG23, Ma2024bird, Huang2024Demys, WeiYWMZ0024, ChenC00025, Haller2025brittle}.
Source: Empty Shelves or Lost Keys? Recall Is the Bottleneck for Parametric Factuality (2602.14080, Calderon et al., 15 Feb 2026), Section 2.1, Operationalizing Encoding and Knowledge (Encoding)