Dynamic or problem-specific certainty thresholds for CGR
Determine whether dynamically chosen or problem-specific certainty thresholds for early stopping in Certainty-Guided Reasoning (CGR) improve accuracy and computational efficiency relative to a fixed threshold (e.g., 0.97). Candidate calibration strategies include online adaptation and the use of external signals such as input complexity.
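To make the comparison concrete, here is a minimal Python sketch of the three thresholding variants the question contrasts. It is an illustration under stated assumptions, not the method of the cited paper: it assumes the model's certainty is available as a sequence of scores at successive reasoning checkpoints, that input complexity has already been reduced to a number in [0, 1], and that correctness feedback is available for online calibration. The names `complexity_adjusted_threshold`, `stop_index`, and `OnlineThreshold`, the linear interpolation rule, and all numeric defaults other than 0.97 are hypothetical.

```python
from typing import Optional, Sequence

FIXED_THRESHOLD = 0.97  # the fixed value given as an example in the problem statement


def complexity_adjusted_threshold(complexity: float,
                                  low: float = 0.90,
                                  high: float = 0.99) -> float:
    """Hypothetical problem-specific rule: harder inputs demand more certainty.

    `complexity` is assumed to be a score in [0, 1]; values outside are clamped.
    The threshold is interpolated linearly between `low` and `high`.
    """
    c = min(max(complexity, 0.0), 1.0)
    return low + (high - low) * c


def stop_index(certainties: Sequence[float], threshold: float) -> Optional[int]:
    """Return the first checkpoint whose certainty meets the threshold.

    Returns None if the threshold is never reached, i.e. the full
    thinking budget would be used.
    """
    for i, certainty in enumerate(certainties):
        if certainty >= threshold:
            return i
    return None


class OnlineThreshold:
    """Hypothetical online calibration driven by correctness feedback.

    After an early stop that turned out wrong, the threshold is raised;
    after a correct early stop, it is lowered to save computation.
    """

    def __init__(self, start: float = FIXED_THRESHOLD, step: float = 0.01,
                 lo: float = 0.50, hi: float = 0.999) -> None:
        self.value = start
        self.step = step
        self.lo = lo
        self.hi = hi

    def update(self, stopped_early: bool, was_correct: bool) -> float:
        # Runs that used the full budget carry no signal about the threshold.
        if stopped_early:
            if was_correct:
                self.value = max(self.lo, self.value - self.step)
            else:
                self.value = min(self.hi, self.value + self.step)
        return self.value


if __name__ == "__main__":
    trace = [0.50, 0.95, 0.98]  # made-up certainty scores, for illustration only
    print("fixed:", stop_index(trace, FIXED_THRESHOLD))
    print("easy input:", stop_index(trace, complexity_adjusted_threshold(0.0)))
    print("hard input:", stop_index(trace, complexity_adjusted_threshold(1.0)))
```

An earlier stop index means fewer reasoning tokens spent, so the open question amounts to whether rules of this kind can lower the average stop index without reducing accuracy relative to the fixed threshold.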
References
Several promising directions remain open for exploration. First, although we used a fixed certainty threshold across all problems, dynamic or problem-specific thresholding may yield better results, especially if calibrated online or using external signals like input complexity.
— Certainty-Guided Reasoning in Large Language Models: A Dynamic Thinking Budget Approach
(2509.07820 - Nogueira et al., 9 Sep 2025) in Conclusions and Future Work