Belief-state simplex geometries in natural-language LLMs
Establish whether large pretrained language models trained on naturalistic text develop internal simplex-shaped geometric representations whose barycentric coordinates encode probability distributions over discrete latent states. Such structure would be analogous to the belief-state simplices observed in transformers trained on sequences generated by hidden Markov models, where residual-stream activations were found to linearly encode the Bayesian posterior over the hidden states.
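The HMM version of this analysis can be made concrete in a few lines. The sketch below is a minimal illustration, not the paper's method: it builds a toy 3-state HMM (parameters invented here), computes the Bayesian belief state (posterior over hidden states) at each position, and fits a linear probe from synthetic stand-in "residual" activations back to those barycentric coordinates. All names (`belief_states`, `sample_observations`, `W_hidden`) and the assumption that activations are a noisy linear image of beliefs are illustrative assumptions.

```python
# Minimal sketch of a belief-geometry probe in the HMM setting, assuming:
# (1) a toy 3-state HMM with invented parameters, (2) synthetic activations
# standing in for a model's residual stream, (3) a linear probe onto belief
# coordinates. Illustrative only; not the cited paper's implementation.
import numpy as np
from numpy.random import default_rng
from sklearn.linear_model import LinearRegression

rng = default_rng(0)

K = 3                                    # number of hidden states
T = np.array([[0.8, 0.1, 0.1],           # T[i, j] = P(next state j | state i)
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
E = np.array([[0.90, 0.05, 0.05],        # E[i, o] = P(observation o | state i)
              [0.05, 0.90, 0.05],
              [0.05, 0.05, 0.90]])

def sample_observations(length):
    """Sample an observation sequence from the HMM."""
    s = rng.integers(K)
    obs = []
    for _ in range(length):
        obs.append(rng.choice(K, p=E[s]))
        s = rng.choice(K, p=T[s])
    return np.array(obs)

def belief_states(obs):
    """Bayesian filter: posterior over hidden states after each observation.

    Each row lies on the (K-1)-simplex; these are the barycentric
    coordinates the probe tries to recover.
    """
    b = np.full(K, 1.0 / K)              # uniform prior
    beliefs = []
    for o in obs:
        b = b * E[:, o]                  # condition on the observation
        b /= b.sum()
        beliefs.append(b.copy())
        b = b @ T                        # propagate through the dynamics
    return np.array(beliefs)

obs = sample_observations(5000)
B = belief_states(obs)                   # ground-truth beliefs, shape (N, K)

# Stand-in activations: an unknown linear image of the belief state plus
# noise. With a real model, replace this with residual-stream activations
# at a chosen layer for the same token positions.
W_hidden = rng.normal(size=(K, 64))      # hidden embedding into d_model = 64
acts = B @ W_hidden + 0.05 * rng.normal(size=(len(B), 64))

# Linear probe: if a belief simplex is linearly embedded, the probe should
# recover barycentric coordinates matching the filter's posteriors.
probe = LinearRegression().fit(acts, B)
pred = probe.predict(acts)
print("probe R^2:", probe.score(acts, B))
print("mean |sum(coords) - 1|:", np.abs(pred.sum(axis=1) - 1).mean())
```

The difficulty for natural-language LLMs is that no ground-truth `B` exists: one must first hypothesize or discover a discrete latent process (for example via sparse autoencoders, as the cited paper's title suggests) before a probe of this kind can be fit.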
References
Whether LLMs trained on naturalistic text develop analogous geometric representations remains an open question.
— Finding Belief Geometries with Sparse Autoencoders
(arXiv:2604.02685, Levinson, 3 Apr 2026), in Abstract