Quantifying LLM Hallucination
Develop reliable and reproducible evaluation methodologies that accurately quantify hallucination rates in large language models, particularly for document-grounded question answering and other long-form contexts.
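To make the measurement target concrete, below is a minimal sketch of one way a document-grounded hallucination rate could be computed: split each model answer into claims and count the fraction not supported by the source document. The `QAExample`, `is_supported`, and `hallucination_rate` names are illustrative assumptions, not the cited paper's methodology, and the lexical-overlap support check is a crude stand-in for an NLI- or judge-based entailment decision.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QAExample:
    document: str  # source document the answer must be grounded in
    question: str
    answer: str    # model-generated answer to evaluate


def sentence_split(text: str) -> List[str]:
    # Naive sentence splitter; a real pipeline would use a proper tokenizer.
    return [s.strip() for s in text.replace("?", ".").replace("!", ".").split(".") if s.strip()]


def is_supported(claim: str, document: str) -> bool:
    # Hypothetical support check: simple lexical overlap as a placeholder
    # for an entailment model or LLM judge.
    claim_tokens = set(claim.lower().split())
    doc_tokens = set(document.lower().split())
    if not claim_tokens:
        return True
    overlap = len(claim_tokens & doc_tokens) / len(claim_tokens)
    return overlap >= 0.6  # threshold is an assumption; tune per task


def hallucination_rate(examples: List[QAExample],
                       support_fn: Callable[[str, str], bool] = is_supported) -> float:
    # Fraction of answer claims not supported by their source document.
    total_claims = 0
    unsupported = 0
    for ex in examples:
        for claim in sentence_split(ex.answer):
            total_claims += 1
            if not support_fn(claim, ex.document):
                unsupported += 1
    return unsupported / total_claims if total_claims else 0.0
```

A reproducible protocol would fix the claim-splitting rule, the support function, and the decision threshold up front, since each choice shifts the measured rate.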
References
Quantifying LLM hallucination remains an open challenge.
— How Much Do LLMs Hallucinate in Document Q&A Scenarios? A 172-Billion-Token Study Across Temperatures, Context Lengths, and Hardware Platforms
(2603.08274 - Roig, 9 Mar 2026) in Section 2.1 (Related Work: Hallucination Measurement)