Hilberg’s conjecture on vanishing code length at infinite context
Determine whether the mean next-character code length $L(N)$ for English text, defined as the expected code length for predicting the next character given the previous $N$ characters, tends to zero as $N \rightarrow \infty$, as conjectured by Hilberg.
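To make the definition concrete, one common reading is $L(N) = \mathbb{E}\left[-\log_2 P(x_{N+1} \mid x_1, \dots, x_N)\right]$, measured in bits per character (the base-2 convention is an assumption here). The sketch below is a simple plug-in estimator built from n-gram counts. It is not the language-model-based procedure of the cited paper, and the function name `mean_code_length` is hypothetical. Because counts become sparse as $N$ grows, this estimator is biased toward zero on finite data, so it illustrates the quantity but cannot by itself test the conjecture.

```python
import math
from collections import Counter


def mean_code_length(text: str, n: int) -> float:
    """Plug-in estimate of L(n), in bits per character (illustrative only).

    Uses empirical frequencies P(next char | previous n chars) taken from
    `text` itself, so it underestimates L(n) once contexts become rare.
    """
    if n < 0 or len(text) <= n:
        raise ValueError("need 0 <= n < len(text)")

    context_counts = Counter()  # how often each n-character context occurs
    joint_counts = Counter()    # how often each (context, next char) pair occurs
    for i in range(n, len(text)):
        context = text[i - n:i]
        context_counts[context] += 1
        joint_counts[(context, text[i])] += 1

    total = len(text) - n  # number of predicted characters
    bits = 0.0
    for (context, _char), count in joint_counts.items():
        # Code length of this prediction under the empirical conditional model.
        bits -= count * math.log2(count / context_counts[context])
    return bits / total
```

For example, on the toy string `"abababab"` the estimate is 1 bit per character with no context (`n=0`) and 0 bits once a single character of context is available (`n=1`), since each character then determines the next.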
References
This is consistent with the conjecture that $L(N \rightarrow \infty)$ might actually vanish, though the decay is much slower than one would estimate from data at smaller $N$.
— Large language models and the entropy of English
(2512.24969 - Scheibner et al., 31 Dec 2025) in Main text, paragraph discussing Fig. 1 (paragraph beginning “Three of the four models agree almost perfectly…”)