
Context versus Prior Knowledge in Language Models

Published 6 Apr 2024 in cs.CL (arXiv:2404.04633v3)

Abstract: To answer a question, LLMs often need to integrate prior knowledge learned during pretraining and new information presented in context. We hypothesize that models perform this integration in a predictable way across different questions and contexts: models will rely more on prior knowledge for questions about entities (e.g., persons, places, etc.) that they are more familiar with due to higher exposure in the training corpus, and be more easily persuaded by some contexts than others. To formalize this problem, we propose two mutual information-based metrics to measure a model's dependency on a context and on its prior about an entity: first, the persuasion score of a given context represents how much a model depends on the context in its decision, and second, the susceptibility score of a given entity represents how much the model can be swayed away from its original answer distribution about an entity. We empirically test our metrics for their validity and reliability. Finally, we explore and find a relationship between the scores and the model's expected familiarity with an entity, and provide two use cases to illustrate their benefits.


Summary

  • The paper introduces mutual information-based persuasion and susceptibility scores to quantify context influence and entity vulnerability in language models.
  • It empirically shows that relevant contexts are more persuasive than irrelevant ones, and that the model relies more on prior knowledge for entities it saw frequently during training.
  • The study’s analyses on friend-enemy pairs and gender bias reveal practical methods for enhancing model reliability and mitigating bias.

Evaluating the Influence of Context and Prior Knowledge on LLMs Through Persuasion and Susceptibility Scores

Overview

In the field of NLP, understanding how LMs integrate prior knowledge and context to answer queries is crucial. Recent research has explored this by introducing two novel mutual information-based metrics: persuasion score and susceptibility score. These metrics aim to quantify how contexts and entities influence the model's decision-making process. Utilizing a dataset synthesized from the YAGO knowledge graph covering 122 topics, the study examines the behavior of pretrained models across a spectrum of contexts and entities, offering insights into the models' reliance on prelearned information versus new input. Additionally, case studies on friend-enemy stance measurement and gender bias are presented to demonstrate the practical applications of these metrics.

Theoretical Foundation and Metric Definition

The paper presents a solid theoretical foundation for the assessment of how LMs depend on context and prior knowledge when answering questions. The persuasion score measures the degree to which a context alters the model's answer distribution for a query about an entity, reflecting the context's impact. On the other hand, the susceptibility score quantifies the extent to which an entity's answer distribution can be influenced, indicating the entity's vulnerability to being swayed from its original response due to context. These metrics, grounded in mutual information theory, offer a robust method for investigating the nuanced dynamics of LLMs' response mechanisms.
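A minimal sketch of how such scores could be computed from a model's answer distributions, assuming the persuasion score is the KL divergence between the contextual and context-free answer distributions, and the susceptibility score is the mutual information between answer and context under a uniform prior over contexts (the paper's exact estimators may differ; all distributions below are toy values):

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same answer set.
    Assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def persuasion_score(p_with_ctx, p_no_ctx):
    """How far a context moves the model's answer distribution for a query."""
    return kl_divergence(p_with_ctx, p_no_ctx)

def susceptibility_score(ctx_dists):
    """Mutual information I(answer; context) under a uniform context prior:
    the mean KL from each contextual distribution to their mixture."""
    n = len(ctx_dists)
    mixture = [sum(d[i] for d in ctx_dists) / n
               for i in range(len(ctx_dists[0]))]
    return sum(kl_divergence(d, mixture) for d in ctx_dists) / n

# Toy answer distributions over three candidate answers for one entity.
prior = [0.70, 0.20, 0.10]   # model's answers with no context
ctx_a = [0.10, 0.80, 0.10]   # a persuasive context flips the top answer
ctx_b = [0.65, 0.25, 0.10]   # a weak context barely moves it

print(persuasion_score(ctx_a, prior) > persuasion_score(ctx_b, prior))  # True
print(round(susceptibility_score([ctx_a, ctx_b]), 3))
```

On this toy data the context that flips the top answer receives the higher persuasion score, and an entity whose answer distribution shifts strongly across contexts receives a higher susceptibility score.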

Empirical Validation

The metrics are empirically validated using an extensive dataset and different sizes of pretrained models. The research finds that:

  • Relevant contexts are generally more persuasive than irrelevant ones.
  • Entities appearing more frequently in training data exhibit lower susceptibility scores, indicating a stronger reliance on prior knowledge.
  • Assertive phrasing and the inclusion of negation affect a context's persuasiveness, though the effect varies across query types and model sizes.

These findings underline the metrics' reliability and validity in capturing the influence of context and prior knowledge on LMs.
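The reported relationship between training-corpus frequency and susceptibility could be probed with a rank correlation; a minimal sketch on hypothetical per-entity numbers (real frequencies would come from the corpus and real scores from the model, so everything here is made up):

```python
def ranks(xs):
    """0-based rank positions; values assumed distinct, ties not handled."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for pos, i in enumerate(order):
        r[i] = pos
    return r

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: more frequent entities get lower susceptibility scores.
entity_freq = [12, 140, 2_300, 51_000, 890_000]
susceptibility = [0.91, 0.74, 0.52, 0.33, 0.12]

print(round(spearman(entity_freq, susceptibility), 6))  # -1.0
```

A strongly negative correlation on real data would match the finding that familiar entities lean on prior knowledge; a library routine such as `scipy.stats.spearmanr` would also handle ties.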

Practical Implications and Further Analysis

The study also explores practical applications and implications of the research:

  • Identifying differences in susceptibility scores for entities known to the model versus unfamiliar ones reveals how prior exposure affects model behavior.
  • By analyzing friend-enemy pairs and gendered names in specific contexts, the metrics provide insight into potential biases within the models, demonstrating their utility in assessing fairness and bias.
  • The research raises pertinent questions about how models incorporate new information, suggesting areas for future exploration, such as the optimization of input context for improved model performance and the development of techniques for mitigating unwanted biases.
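As a sketch of the gender-bias use case, per-entity susceptibility scores could be aggregated by name group and compared. The scores below are hypothetical, and a real analysis would also need a significance test for the gap:

```python
from statistics import mean

def group_gap(scores_a, scores_b):
    """Difference in mean susceptibility between two groups of entities.
    A large gap means contexts sway the model more easily for one group."""
    return mean(scores_a) - mean(scores_b)

# Hypothetical per-name susceptibility scores, computed separately per entity.
group_a = [0.42, 0.35, 0.51]  # e.g. names the model associates with one gender
group_b = [0.28, 0.30, 0.25]  # names it associates with another

print(round(group_gap(group_a, group_b), 3))  # 0.15
```

A near-zero gap would suggest the model is equally swayable across the two groups; a consistent nonzero gap would flag an asymmetry worth auditing.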

Conclusion

This research contributes significantly to our understanding of how LLMs process and integrate different types of information. By introducing and validating the persuasion and susceptibility scores, it provides a nuanced framework for analyzing the decision-making processes of LMs, offering a pathway towards more interpretable and controllable AI systems. The implications of this study are far-reaching, not only in advancing theoretical knowledge but also in practical applications for enhancing model reliability and mitigating bias. Future work, as suggested by the authors, could extend these metrics to broader contexts, further refining our understanding of AI decision-making and its implications in real-world applications.
