Confabulation: The Surprising Value of Large Language Model Hallucinations

Published 6 Jun 2024 in cs.CL and cs.AI | (2406.04175v2)

Abstract: This paper presents a systematic defense of LLM hallucinations or 'confabulations' as a potential resource instead of a categorically negative pitfall. The standard view is that confabulations are inherently problematic and AI research should eliminate this flaw. In this paper, we argue and empirically demonstrate that measurable semantic characteristics of LLM confabulations mirror a human propensity to utilize increased narrativity as a cognitive resource for sense-making and communication. In other words, it has potential value. Specifically, we analyze popular hallucination benchmarks and reveal that hallucinated outputs display increased levels of narrativity and semantic coherence relative to veridical outputs. This finding reveals a tension in our usually dismissive understandings of confabulation. It suggests, counter-intuitively, that the tendency for LLMs to confabulate may be intimately associated with a positive capacity for coherent narrative-text generation.

Summary

The paper redefines hallucinations as narrative-rich confabulations that exhibit higher narrativity and semantic coherence.
It uses an ELECTRA-large-based narrativity classifier and logistic regression on three benchmarks (FaithDial, BEGIN, and HaluEval) to quantify narrative features.
The findings suggest that leveraging confabulations can improve text generation in creative, therapeutic, and narrative-driven applications.

Overview

The paper "Confabulation: The Surprising Value of LLM Hallucinations," by Peiqi Sui, Eamon Duede, Sophie Wu, and Richard Jean So, reassesses the commonly held negative view of hallucinations in LLMs. The authors suggest that hallucinations, which they call confabulations, may have valuable semantic characteristics. Backed by empirical evidence, they argue that confabulations exhibit increased narrativity and semantic coherence, properties that could be advantageous for narrative-text generation.

Background

LLMs are now used across many domains, but discussion of their hallucinations remains largely negative and treats them as significant ethical and safety pitfalls. Various studies and technical reports have deemed hallucinations a severe impediment to model trustworthiness, especially in truth-sensitive fields such as law, medicine, finance, science, and education.

In contrast to this prevailing stance, the authors propose reframing hallucination as confabulation. Drawing on insights from cognitive science and cultural analytics, they argue that hallucinated outputs exhibit higher levels of narrativity, a claim supported by analyses of popular hallucination benchmarks.

Empirical Analysis

The paper analyzes three benchmark datasets (FaithDial, BEGIN, and HaluEval) to compare the narrativity of hallucinated and factual outputs.

FaithDial: Adapted from the Wizard of Wikipedia benchmark and annotated for hallucinations.
BEGIN: A smaller, expert-curated dataset with a unique hallucination taxonomy.
HaluEval: A comprehensive dataset featuring hallucinated and ground truth ChatGPT responses.

The authors measure narrativity using an ELECTRA-large-based text-classification model. Hallucinated outputs exhibit higher narrativity across all three datasets, a result supported by logistic regression models that show a positive association between narrativity and hallucination.
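The logistic regression step described above can be illustrated with a minimal sketch. It assumes each response already has a narrativity score in [0, 1] and a binary hallucination label. The scores and labels below are made-up toy values, not data from the paper, and `fit_logistic` and `predict` are hypothetical helper names. The sketch fits a one-feature model by plain gradient descent; the authors' actual modeling setup may differ.

```python
import math


def fit_logistic(xs, ys, lr=0.5, epochs=5000):
    """Fit P(y=1 | x) = sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict(x, w, b):
    """Predicted probability that a response with narrativity x is hallucinated."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


# Toy, made-up values for illustration only (NOT from the paper).
# narrativity: hypothetical classifier score per response.
# hallucinated: 1 = labeled hallucinated, 0 = labeled faithful.
narrativity = [0.10, 0.20, 0.25, 0.35, 0.40, 0.55, 0.60, 0.70, 0.80, 0.90]
hallucinated = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(narrativity, hallucinated)

# A positive coefficient means higher narrativity goes with higher
# odds of the hallucinated label in this toy sample.
odds_ratio_per_tenth = math.exp(w * 0.1)
print(f"coefficient w = {w:.3f}, intercept b = {b:.3f}")
print(f"odds ratio per +0.1 narrativity = {odds_ratio_per_tenth:.3f}")
```

In this framing, the sign of `w` carries the direction of the association. A real analysis would also report uncertainty, such as standard errors or confidence intervals, which this sketch omits.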

The paper also measures the coherence of these outputs with the DEAM metric and finds a significant association between higher narrativity and increased coherence, which reinforces the potential cognitive and communicative benefits of confabulations.
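One simple way to quantify an association like the one between narrativity and coherence is a correlation coefficient. This summary does not say which statistic the authors use, so Pearson's r is swapped in here purely as an illustration. The sketch assumes each response has a narrativity score and a coherence score. Both lists are made-up toy values rather than DEAM outputs, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Toy, made-up scores for illustration only (NOT from the paper).
narrativity_scores = [0.10, 0.20, 0.25, 0.35, 0.40, 0.55, 0.60, 0.70, 0.80, 0.90]
coherence_scores = [0.42, 0.38, 0.50, 0.47, 0.58, 0.55, 0.66, 0.61, 0.74, 0.79]

r = pearson_r(narrativity_scores, coherence_scores)
print(f"Pearson r between narrativity and coherence = {r:.3f}")
```

A value of `r` near 1 would indicate that the two scores rise together. A significance test, which this sketch leaves out, would be needed to support a claim like the one reported in the paper.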

Defense of Confabulation

The authors present a robust argument for considering confabulations as narrative-rich constructs that align with human tendencies to employ narratives for sense-making and communication. They draw on the narrative paradigm (NP) and cognitive narratology to highlight that storytelling is intrinsic to human cognition and communication. NP posits that narratives are more persuasive and meaningful than structured arguments, with narrative coherence and fidelity being key metrics for assessing the effectiveness of communication.

The paper also explores the role of narratives in maintaining the coherence of internal world models, referencing cognitive linguistics and the semantics of possible worlds. This approach underscores the importance of narratives in scaffolding and navigating complex social and cognitive contexts.

The authors link these theoretical insights to practical applications, particularly in the medical domain, where narratives play a crucial role in patient care and rehabilitation. They argue that the narrative-rich properties of confabulations can offer significant cognitive and communicative benefits, comparable to those observed in human therapy and communication.

Implications and Future Research

The findings bear directly on how LLMs are developed and deployed. By reorienting the understanding of hallucinations toward confabulations, researchers can explore new ways to use the narrative capacities of LLMs. This perspective opens up possibilities for improving user experience in fields beyond factual text generation, such as creative writing, journalism, and therapeutic applications.

Future research could investigate the utility of confabulations across more domains and validate the hypothesized benefits through human evaluations. Exploring the balance between creativity and factuality in LLM outputs could lead to models that better serve the differing needs of particular applications.

Conclusion

In "Confabulation: The Surprising Value of LLM Hallucinations," the authors provide an empirically backed argument that challenges the traditional view of hallucinations as a purely negative phenomenon. By demonstrating the narrative-rich and coherent properties of confabulations, they open the way to a more nuanced understanding and new uses of LLM capabilities. The paper offers a foundation for future research and development aimed at harnessing the potential cognitive and communicative benefits of LLM-generated narratives.
