Refining certainty metrics beyond token-level probabilities
Construct and assess alternative certainty metrics for Certainty-Guided Reasoning (CGR) that go beyond the token-level min-of-max probability, such as entropy-based measures, variance or agreement across sampled reasoning trajectories, or alignment with external verifiers. Then determine whether these alternatives yield better stopping decisions and more reliable predictions.
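The candidate metrics can be sketched concretely. The following is a minimal illustration, not the paper's implementation: `min_of_max_certainty` mirrors the baseline token-level min-of-max score, while `entropy_certainty` and `agreement_certainty` are hypothetical alternatives of the kind proposed (entropy over token distributions, and agreement across sampled trajectories); all function names and the normalization choices are assumptions for illustration.

```python
import numpy as np
from collections import Counter

def min_of_max_certainty(token_dists):
    # Baseline-style score: the least-confident token (smallest max
    # probability) caps the certainty of the whole generation.
    return float(min(d.max() for d in token_dists))

def entropy_certainty(token_dists):
    # Entropy-based alternative: average token-level entropy, rescaled so
    # that 1.0 means fully peaked distributions and 0.0 means uniform.
    eps = 1e-12
    ents = [float(-(d * np.log(d + eps)).sum()) for d in token_dists]
    max_ent = np.log(len(token_dists[0]))  # entropy of a uniform dist.
    return float(1.0 - np.mean(ents) / max_ent)

def agreement_certainty(sampled_answers):
    # Trajectory-agreement alternative: fraction of sampled reasoning
    # trajectories whose final answer matches the modal answer.
    counts = Counter(sampled_answers)
    return counts.most_common(1)[0][1] / len(sampled_answers)
```

For example, a two-token generation with per-token max probabilities 0.9 and 0.6 gets a min-of-max certainty of 0.6, a uniform token distribution gets entropy certainty 0.0, and three sampled trajectories answering ["A", "A", "B"] get agreement certainty 2/3. Any of these scores could then be thresholded to decide when to stop spending thinking budget.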
References
Several promising directions remain open for exploration. Third, the certainty metric could be refined beyond token-level probabilities—for example, incorporating entropy, variance across sampled trajectories, or alignment with external verifiers.
— Certainty-Guided Reasoning in Large Language Models: A Dynamic Thinking Budget Approach
(2509.07820 - Nogueira et al., 9 Sep 2025) in Conclusions and Future Work