Quantifying the Persona Effect in LLM Simulations

Published 16 Feb 2024 in cs.CL and cs.CY | (2402.10811v2)

Abstract: LLMs have shown remarkable promise in simulating human language and behavior. This study investigates how integrating persona variables (demographic, social, and behavioral factors) impacts LLMs' ability to simulate diverse perspectives. We find that persona variables account for <10% variance in annotations in existing subjective NLP datasets. Nonetheless, incorporating persona variables via prompting in LLMs provides modest but statistically significant improvements. Persona prompting is most effective in samples where many annotators disagree, but their disagreements are relatively minor. Notably, we find a linear relationship in our setting: the stronger the correlation between persona variables and human annotations, the more accurate the LLM predictions are using persona prompting. In a zero-shot setting, a powerful 70b model with persona prompting captures 81% of the annotation variance achievable by linear regression trained on ground truth annotations. However, for most subjective NLP datasets, where persona variables have limited explanatory power, the benefits of persona prompting are limited.

Citations (28)

Summary

The paper uses mixed-effect regression models to show that persona variables account for less than 10% of annotation variance in existing subjective NLP datasets.
The study shows that persona prompting yields modest but statistically significant improvements in LLM predictions, especially on samples where many annotators disagree but their disagreements are minor.
The authors recommend cautious use of persona prompting and advocate for improved dataset designs to better capture diverse perspectives.

Introduction

The paper "Quantifying the Persona Effect in LLM Simulations" examines how persona variables (demographic, social, and behavioral factors) affect the ability of LLMs to simulate human language use and behavior. The question matters most for subjective NLP tasks, where inter-annotator agreement is often low because the judgments themselves are subjective.

Methodology

The authors use a mixed-effect linear regression model to assess how much of the annotation variance persona variables explain across various datasets. The model separates the variance attributable to persona variables from the inherent variability of the annotated text samples, using a random intercept per text to control for text effects.
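As a sketch of what such a model can look like, one standard random-intercept formulation is written out below. The notation is an assumption made for illustration and is not taken from the paper: i indexes text samples, j indexes annotators, x_j is the vector of persona variables for annotator j, and u_i is the random intercept for text i.

```latex
y_{ij} = \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_j + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

Under this reading, the fixed-effect term carries the persona variables, the random intercept absorbs variation that belongs to the text itself, and the residual covers whatever neither explains.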

Findings on Variance Explanation

The study finds that persona variables explain less than 10% of the variance in human annotations across a variety of NLP datasets. The evidence comes from comparing marginal R-squared (variance explained by the persona variables alone) with conditional R-squared (variance explained by the persona variables and the text random effect together). In contrast, R-squared is higher on the ANES survey dataset, where responses follow more predictable behavioral patterns.
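For reference, a commonly used definition of the two R-squared measures for mixed models is sketched below. The symbols are assumptions for illustration: sigma_f^2 is the variance of the fixed-effect (persona) predictions, sigma_u^2 is the variance of the text random intercept, and sigma_epsilon^2 is the residual variance. Whether the paper uses exactly this estimator is not stated in this summary.

```latex
R^2_{\text{marginal}} = \frac{\sigma_f^2}{\sigma_f^2 + \sigma_u^2 + \sigma_\varepsilon^2},
\qquad
R^2_{\text{conditional}} = \frac{\sigma_f^2 + \sigma_u^2}{\sigma_f^2 + \sigma_u^2 + \sigma_\varepsilon^2}
```

A small marginal value alongside a much larger conditional value indicates that the text, not the persona, accounts for most of the explainable variance.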

Impact of Persona Prompting

Persona prompting, as explored in this study, prepends details about an annotator's persona variables to each text sample in a zero-shot LLM setting. It yields modest but statistically significant improvements, though the gains vary across models and tasks and remain small. Persona prompting helps most where annotators disagree often but only within a narrow range.
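A minimal sketch of the prepending step follows. The template wording, the function name build_persona_prompt, and the example persona fields are all hypothetical; the paper's actual prompt format is not reproduced in this summary.

```python
def build_persona_prompt(persona: dict, text: str, question: str) -> str:
    """Prepend a persona description to a text sample (hypothetical template)."""
    # One "- key: value" line per persona variable, in insertion order.
    persona_lines = "\n".join(f"- {key}: {value}" for key, value in persona.items())
    return (
        "You are answering as a person with the following profile:\n"
        f"{persona_lines}\n\n"
        f"Text: {text}\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Illustrative persona and sample; none of these values come from the paper.
prompt = build_persona_prompt(
    {"age": "35-44", "gender": "woman", "education": "college degree"},
    "That movie was a complete waste of time.",
    "How offensive is this text on a scale of 1 to 5?",
)
print(prompt)
```

The resulting string would be sent to the model as-is; comparing its answers against a prompt without the profile lines isolates the effect of the persona.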

Figure 1: Illustration of persona prompting used to investigate LLMs' capacity to simulate diverse perspectives.

Sample Types and Persona Prompting Utility

Persona prompting was most effective on samples with high entropy yet low standard deviation in their annotations, that is, samples where many annotators disagree but only within a narrow range. Here, LLMs could adjust predictions to align more closely with individual personas.
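To make the two quantities concrete, the sketch below computes entropy and standard deviation for the annotations of a single sample. The function names, the use of base-2 entropy, and the use of population standard deviation are assumptions; the toy ratings are invented for illustration.

```python
import math
from collections import Counter
from statistics import pstdev


def label_entropy(labels):
    """Shannon entropy (in bits) of the label distribution for one sample."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def describe_sample(labels):
    """Return (entropy, population standard deviation) of the annotations."""
    return label_entropy(labels), pstdev(labels)


# Many annotators disagree, but only by one point: high entropy, low spread.
frequent_narrow = [3, 4, 3, 4, 3, 4]
# Only one annotator disagrees, but by a lot: lower entropy, high spread.
rare_wide = [1, 1, 1, 1, 1, 5]

for name, labels in [("frequent_narrow", frequent_narrow), ("rare_wide", rare_wide)]:
    entropy, spread = describe_sample(labels)
    print(f"{name}: entropy={entropy:.3f} bits, std={spread:.3f}")
```

The first toy sample is the kind of case the summary describes as benefiting most: disagreement is common, yet every rating sits close to the others.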

Controlled Text Randomness and Persona Utility

Using the ANES dataset to control for text randomness, the study investigates how well LLMs simulate personas as the utility of the persona variables varies. The results indicate a linear relationship: the stronger the correlation between persona variables and human responses, the more accurate the LLM predictions with persona prompting. In a zero-shot setting, a 70b model with persona prompting captured 81% of the annotation variance achievable by a linear regression trained on ground-truth annotations.
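One plausible way to read a "share of achievable variance" figure is as a ratio of two R-squared values: that of the LLM's predictions over that of a regression fitted on the ground-truth annotations. The sketch below illustrates this reading on invented numbers. The function names and the toy data are hypothetical, and the paper's exact computation may differ.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def captured_fraction(y_true, llm_pred, regression_pred):
    """Share of the regression's explained variance that the LLM also explains."""
    return r_squared(y_true, llm_pred) / r_squared(y_true, regression_pred)


# Toy responses and predictions, invented for illustration only.
human = [1, 2, 3, 4, 5]
regression_pred = [1, 2, 3, 4, 4]   # supervised baseline fitted on ground truth
llm_pred = [2, 2, 3, 4, 4]          # zero-shot predictions with persona prompting

print(f"regression R^2: {r_squared(human, regression_pred):.2f}")
print(f"LLM R^2:        {r_squared(human, llm_pred):.2f}")
print(f"captured share: {captured_fraction(human, llm_pred, regression_pred):.2f}")
```

Framing the result as a ratio treats the supervised regression as a ceiling, so the figure describes how close the zero-shot model comes to it rather than how much total variance it explains.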

Figure 2: Mean improvement in MAE with persona prompting across models and datasets with varying annotation entropy and standard deviation.

Implications and Recommendations

The findings suggest that while persona prompting can marginally enhance LLM performance, it is constrained by the limited variance that persona variables explain in most existing datasets. Consequently, the authors recommend caution in using LLMs to simulate diverse perspectives without validation. They also advocate for dataset designs that capture persona variables with more explanatory power.

Conclusion

This paper clarifies both the limitations and the potential of persona prompting in LLM simulations. It emphasizes the need for more complex and comprehensive persona information if LLMs are to capture diverse perspectives across NLP tasks, and it points future research toward better dataset designs and more sophisticated prompting methods.