Determine whether elicited probabilities reflect LLMs’ true subjective beliefs

Determine whether probabilities elicited from large language models via natural-language prompting correspond to the models' true subjective belief states, i.e., the subjective probabilities that drive their internal decision computations, or whether elicited probabilities are merely superficial linguistic outputs divorced from the computations that govern choices.
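
As a concrete illustration of the elicitation side of this question, the following minimal sketch shows one common way to obtain a verbalized probability from a model. The prompt wording, the query_model helper, and the numeric parsing rule are assumptions for illustration, not the paper's protocol:

```python
import re

def query_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; any function mapping a
    prompt string to the model's text response will do."""
    raise NotImplementedError

def elicit_probability(evidence: str) -> float:
    """Elicit a verbalized probability estimate via natural-language prompting.

    The prompt wording below is illustrative only.
    """
    prompt = (
        f"Patient findings: {evidence}\n"
        "What is the probability that this patient has the disease? "
        "Answer with a single number between 0 and 1."
    )
    response = query_model(prompt)
    match = re.search(r"\d*\.?\d+", response)  # grab the first numeric token
    if match is None:
        raise ValueError(f"no probability found in response: {response!r}")
    p = float(match.group())
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"elicited value {p} lies outside [0, 1]")
    return p
```

The open question is whether the number returned here bears any systematic relationship to the quantities that actually drive the model's downstream choices.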

Background

The paper investigates whether LLMs act as rational utility maximizers with coherent beliefs in decision-making tasks, particularly medical diagnosis. A core concern is whether verbalized probability estimates genuinely reflect a model’s internal subjective beliefs that influence actions, or whether such estimates are merely surface-level outputs disconnected from decision policies.

To address this, the authors propose a decision-theoretic framework and empirical tests that link elicited beliefs to observed choices, so that inconsistencies between stated beliefs and decisions can be detected and the hypothesis of belief-driven choice falsified. This open question motivates methods for validating whether elicited probabilities serve as mechanistically meaningful beliefs in high-stakes decisions.
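
One way such a consistency test can be operationalized, sketched here under an assumed payoff matrix rather than the authors' exact procedure, is to derive the action an expected-utility maximizer holding the elicited belief would take and compare it with the model's observed choice:

```python
# Illustrative payoff matrix for a treat/wait decision; the numbers are
# assumptions for this sketch, not values taken from the paper.
UTILITIES = {
    ("treat", "disease"): 1.0,   # disease correctly treated
    ("treat", "healthy"): -0.5,  # unnecessary treatment
    ("wait", "disease"): -1.0,   # disease missed
    ("wait", "healthy"): 0.5,    # treatment correctly withheld
}

def rational_action(p_disease: float) -> str:
    """Action an expected-utility maximizer would take at belief p_disease."""
    def expected_utility(action: str) -> float:
        return (p_disease * UTILITIES[(action, "disease")]
                + (1.0 - p_disease) * UTILITIES[(action, "healthy")])
    return max(("treat", "wait"), key=expected_utility)

def belief_action_consistent(elicited_p: float, observed_action: str) -> bool:
    """True iff the model's observed choice matches the choice implied by
    treating its elicited probability as a genuine subjective belief."""
    return rational_action(elicited_p) == observed_action

# With these payoffs the rational treatment threshold is p = 1/3, so e.g.:
assert belief_action_consistent(0.6, "treat")
assert belief_action_consistent(0.1, "wait")
```

If elicited probabilities function as genuine beliefs, this check should pass systematically across cases and payoff matrices; systematic divergence would falsify the link between stated probabilities and the model's decision policy.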

References

However, it is unclear whether stated probabilities reflect the model's 'true' beliefs: an elicited probability could track an internal epistemic state, or it could be a superficial linguistic output only weakly linked to the computations that drive choices (Pal et al., 2025; Wang et al., 2024a; Liu et al., 2024a).

Do LLMs Act Like Rational Agents? Measuring Belief Coherence in Probabilistic Decision Making (arXiv:2602.06286, Yamin et al., 6 Feb 2026), Section 1 (Introduction)