
ConfQA: Answer Only If You Are Confident

Published 8 Jun 2025 in cs.CL | (2506.07309v1)

Abstract: Can we teach LLMs to refrain from hallucinating factual statements? In this paper we present a fine-tuning strategy that we call ConfQA, which can reduce hallucination rate from 20-40% to under 5% across multiple factuality benchmarks. The core idea is simple: when the LLM answers a question correctly, it is trained to continue with the answer; otherwise, it is trained to admit "I am unsure". But there are two key factors that make the training highly effective. First, we introduce a dampening prompt "answer only if you are confident" to explicitly guide the behavior, without which hallucination remains as high as 15%-25%. Second, we leverage simple factual statements, specifically attribute values from knowledge graphs, to help LLMs calibrate the confidence, resulting in robust generalization across domains and question types. Building on this insight, we propose the Dual Neural Knowledge framework, which seamlessly selects between internally parameterized neural knowledge and externally recorded symbolic knowledge based on ConfQA's confidence. The framework enables potential accuracy gains beyond 95%, while reducing unnecessary external retrievals by over 30%.

Summary

  • The paper introduces a fine-tuning strategy that trains LLMs to answer only when confident, reducing hallucination rates to under 5%.
  • The methodology employs a dampener prompt and knowledge graphs to calibrate confidence and improve factual accuracy.
  • The Dual Neural Knowledge Framework dynamically selects between neural and external information, boosting accuracy and cutting retrieval needs by 30%.

Overview of "ConfQA: Answer Only If You Are Confident"

The paper "ConfQA: Answer Only If You Are Confident" addresses a significant challenge in the operation of LLMs: the tendency to generate hallucinated factual statements. Through a fine-tuning strategy, the authors achieve a substantial reduction in hallucination rates, from 20-40% to under 5%, across multiple factuality benchmarks. The approach, called ConfQA, trains LLMs to either provide a factual response or acknowledge uncertainty, a calibration necessary for reliable output from AI systems.

Key Contributions

  1. Training Strategy and Methodology: The paper lays out a unique training regime: when the LLM answers a question correctly, it is trained to continue with that answer; otherwise, it is trained to say "I am unsure". Two critical components are identified in this process:
    • Dampener Prompt: The phrase "answer only if you are confident" acts as an explicit guide and is essential for reducing hallucination rates significantly.
    • Leveraging Knowledge Graphs: The use of simple factual statements derived from knowledge graphs assists the LLMs in calibrating their confidence effectively.
  2. Dual Neural Knowledge Framework: This novel framework toggles between internally parameterized neural knowledge and externally recorded symbolic knowledge based on ConfQA's confidence. This dynamic selection not only raises potential accuracy beyond 95% but also cuts unnecessary external retrievals by over 30%.
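The training-data construction in item 1 can be sketched as follows. This is a minimal illustration, not the paper's actual code: `model.generate` is a hypothetical stand-in for whatever inference API is used, and the exact-match check is a simplification of real answer grading.

```python
# Sketch of ConfQA-style training-example construction: probe the model
# on a simple factual question (e.g., a knowledge-graph attribute value),
# keep the answer as the target if it was correct, otherwise train the
# model to abstain. `model.generate` is a hypothetical API.

DAMPENER = "Answer only if you are confident. "

def build_training_example(model, question, gold_answer):
    """Label one QA pair by whether the model already answers it correctly."""
    prompt = DAMPENER + question
    predicted = model.generate(prompt)  # hypothetical inference call
    if predicted.strip().lower() == gold_answer.strip().lower():
        target = gold_answer            # correct: reinforce the answer
    else:
        target = "I am unsure."         # incorrect: reinforce abstention
    return {"prompt": prompt, "target": target}
```

In practice, answer grading would be fuzzier than exact string match (e.g., alias matching or LLM-based judging), but the labeling logic stays the same.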

Experimental Results and Analysis

The paper's experimental examination of LLMs' confidence calibration indicates overconfidence in self-reported assessment of facts. LLMs were evaluated on benchmarks like Head-to-Tail, SimpleQA, and CRAG. It was found that while confidence correlated with accuracy, the models were often overly optimistic about their correctness, with reported confidences often exceeding actual accuracy rates significantly.

Through extensive training, the ConfQA model was demonstrated to drastically reduce hallucinations while exhibiting reliable transfer across different datasets, irrespective of domain. The paper also shows that alternative strategies, such as those that focus solely on answer consistency or on larger datasets (e.g., using MMLU data) without the dampener, fail to match the proposed strategy's effectiveness.

Implications and Future Directions

The implications of the ConfQA strategy are vast. Practically, reducing hallucination rates translates directly to improved trust and applicability of LLMs in sensitive or high-stakes environments, like medical or legal sectors. The outcomes provoke questions regarding the design of future AI systems: Can LLMs be made inherently self-aware about their knowledge boundaries? Should AI constructs integrate self-regulation mechanisms akin to ConfQA as a standard?

Theoretically, this work opens avenues for further introspection into LLM architecture regarding parameterization of knowledge storage versus its retrieval capabilities. The Dual Neural Knowledge Framework proposed is indicative of future model architectures that blend computationally efficient neural networks with expansive, continually updated knowledge repositories.
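The confidence-gated routing behind the Dual Neural Knowledge framework can be sketched as below. This is a hedged illustration under stated assumptions: `confqa_answer` and `retrieve` are hypothetical callables standing in for the fine-tuned model and the external retrieval system, and the abstention string is assumed to match ConfQA's trained response.

```python
# Sketch of Dual Neural Knowledge routing: answer from the model's own
# parameters when ConfQA is confident, fall back to external symbolic
# retrieval only when it abstains. Function names are illustrative.

UNSURE = "I am unsure."

def dual_answer(question, confqa_answer, retrieve):
    """Route between internal (neural) and external (symbolic) knowledge."""
    internal = confqa_answer(question)   # ConfQA answers or says "I am unsure."
    if internal.strip() != UNSURE:
        return internal                  # confident: skip the external lookup
    return retrieve(question)            # unsure: consult the external source
```

The retrieval savings the paper reports follow directly from this gate: every question ConfQA answers confidently is one external lookup avoided.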

The study also underscores the potential shifts needed in AI training methodologies to balance factual correctness with computational efficiency, indicating an area ripe for innovation. As LLMs scale and access to real-time external data becomes more seamless, the insights from this work will be pivotal in orienting future research directions.

In conclusion, "ConfQA: Answer Only If You Are Confident" makes a substantial contribution to improving factual reliability in LLMs, presenting an elegant solution to curb hallucinations and proposing a framework that extends the operational efficiency of AI systems in real-world applications.
