LLM Generated Persona is a Promise with a Catch
Abstract: The use of LLMs to simulate human behavior has gained significant attention, particularly through personas that approximate individual characteristics. Persona-based simulations hold promise for transforming disciplines that rely on population-level feedback, including social science, economic analysis, marketing research, and business operations. Traditional methods to collect realistic persona data face significant challenges. They are prohibitively expensive and logistically challenging due to privacy constraints, and often fail to capture multi-dimensional attributes, particularly subjective qualities. Consequently, synthetic persona generation with LLMs offers a scalable, cost-effective alternative. However, current approaches rely on ad hoc and heuristic generation techniques that do not guarantee methodological rigor or simulation precision, resulting in systematic biases in downstream tasks. Through extensive large-scale experiments including presidential election forecasts and general opinion surveys of the U.S. population, we reveal that these biases can lead to significant deviations from real-world outcomes. Our findings underscore the need to develop a rigorous science of persona generation and outline the methodological innovations, organizational and institutional support, and empirical foundations required to enhance the reliability and scalability of LLM-driven persona simulations. To support further research and development in this area, we have open-sourced approximately one million generated personas, available for public access and analysis at https://huggingface.co/datasets/Tianyi-Lab/Personas.
Explain it Like I'm 14
What this paper is about
This paper looks at using large language models (LLMs) to make “personas” — detailed profiles that act like pretend people — and then using those personas to simulate how real groups of people might think, vote, or shop. The big idea is exciting: if AI can stand in for lots of people, we could test opinions and decisions quickly and cheaply. But the authors show there’s a catch: the way these AI personas are made can introduce strong biases that push the simulated results away from what real people actually do.
The key questions the paper asks
- Can AI-made personas accurately reflect what real populations think?
- How do different ways of creating personas change the results?
- Do AI personas drift toward certain opinions (like political leanings) as we add more AI-generated details?
- What steps are needed to make persona generation reliable and trustworthy?
How the researchers did it (in plain terms)
To study this, the authors created and tested different types of personas and asked LLMs to answer questions as if they were those personas. Think of personas like character sheets in a game: age, job, beliefs, hobbies, and more. Then they asked the AI, “Given this character, which option would they pick for this question?”
How personas were built
The team compared three persona styles that add more AI-generated content step by step:
- Meta Personas: Built from real U.S. Census-style data (age, sex, race, state). No AI creativity here. Think of it as the basic stats of a person, sampled to match real population patterns.
- Tabular Personas:
- Objective Tabular: Start with Meta Personas, then use an AI to fill in more “hard facts” like education level or occupation, but only from fixed categories (to reduce bias).
- Subjective Tabular: Add more personal, subjective details (like political views or hobbies) with open-ended AI generation.
- Descriptive Personas: The AI writes a freeform, story-like description of a person. This is the most creative and flexible option — and the most AI-generated.
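The first step of the pipeline above — drawing Meta Personas from census-style statistics — can be sketched in a few lines. This is a minimal illustration, not the paper's actual code: the category lists and weights below are made up and truncated, and real census tables would replace them.

```python
import random

# Hypothetical, truncated marginal distributions; a real pipeline would load
# actual census tables instead of these illustrative weights.
STATE_WEIGHTS = {"CA": 0.12, "TX": 0.09, "NY": 0.06, "OH": 0.035}
AGE_BRACKETS = {"18-29": 0.21, "30-44": 0.25, "45-64": 0.33, "65+": 0.21}
SEX = {"male": 0.49, "female": 0.51}
RACE = {"White": 0.60, "Black": 0.13, "Hispanic": 0.19, "Asian": 0.06, "Other": 0.02}

def weighted_choice(dist: dict) -> str:
    """Draw one category with probability proportional to its weight."""
    keys = list(dist)
    return random.choices(keys, weights=[dist[k] for k in keys], k=1)[0]

def sample_meta_persona() -> dict:
    """Sample one Meta Persona (basic demographic stats, no AI creativity)."""
    return {
        "state": weighted_choice(STATE_WEIGHTS),
        "age": weighted_choice(AGE_BRACKETS),
        "sex": weighted_choice(SEX),
        "race": weighted_choice(RACE),
    }
```

The Tabular and Descriptive stages would then hand a persona like this to an LLM to fill in further fields or write a freeform biography — which is exactly where the paper finds bias creeping in.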
How opinions were simulated
- The AI was given a persona and a multiple-choice question (like a survey).
- The AI had to pick the answer that best fits that persona.
- They did this at large scale: around one million personas, six different open-source LLMs, and 500+ questions across many topics.
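The “give the AI a persona and a multiple-choice question” step amounts to building a prompt like the one below. The template is a plausible sketch (the paper's exact wording is not shown here), but it captures the mechanic: persona attributes plus lettered options, with the model asked to pick one letter.

```python
def build_simulation_prompt(persona: dict, question: str, options: list) -> str:
    """Format a persona-conditioned multiple-choice prompt (hypothetical template)."""
    persona_lines = "\n".join(f"- {k}: {v}" for k, v in persona.items())
    option_lines = "\n".join(f"{chr(65 + i)}. {opt}" for i, opt in enumerate(options))
    return (
        "You are role-playing the following person:\n"
        f"{persona_lines}\n\n"
        f"Question: {question}\n{option_lines}\n"
        "Answer with the single letter of the option this person would most likely choose."
    )
```

At the paper's scale this prompt would be sent, persona by persona, to each of the six open-source LLMs, and the chosen letters tallied into answer distributions.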
How results were checked
- For elections (2016, 2020, 2024), they compared the simulated results with actual election outcomes by measuring how close the AI’s state-by-state predictions were to reality.
- For general opinions (like on climate, entertainment, education, and tech), they compared simulated answers to real survey distributions (from datasets like OpinionQA), and also looked for patterns when no ground truth was available.
What they found and why it matters
Here are the main takeaways:
- More AI-generated detail led to more bias. As they moved from Meta → Tabular → Descriptive personas, the simulated answers drifted further from real-world results.
- Political drift toward one side. In some setups, the simulations swung so far left that they predicted a Democratic win in every U.S. state. The overall trend: the more the AI “invented” about people, the more it leaned toward progressive opinions. (They also found one model that instead leaned right, showing that different AIs can have different slants — but the general problem still stands.)
- Bias showed up beyond politics. Across many topics, AI-generated personas increasingly preferred:
- “Greener but pricier” products over cheaper non-eco options,
- Liberal arts and humanities over STEM majors,
- Certain “prestige” or artsy movies over mainstream action films.
- The personas themselves looked unusually sunny and subjective. When the AI wrote longer descriptions, the language became more positive and more emotional (“love,” “proud,” “community”), and it rarely included hardships or negative experiences. That upbeat tone can push simulated answers in specific directions on social issues.
- This is not just a single-model problem. Testing different LLMs and mixing persona generators with different simulators showed similar issues. The bias comes from how personas are generated, not only from how the AI answers questions.
Why this matters: If companies, researchers, or policymakers use these biased “silicon samples” to guide decisions, they could be misled. For example, a car company might overestimate the market for expensive eco-friendly models, or a studio might misjudge movie preferences.
What this means going forward
The authors argue that we need a careful, scientific approach to AI persona generation so results can be trusted. They suggest:
- Figure out which persona details truly matter. Not every trait is equally useful; we need to identify the key attributes (demographic, beliefs, behavior, context) and how to present them to the AI.
- Calibrate personas to real populations. Because real data often shows only single-trait percentages (like “X% have a college degree”) rather than how traits combine (like “college degree + age + job + state together”), new methods are needed to rebuild realistic combinations and correct the AI’s output to match target populations.
- Build shared benchmarks and datasets. The field needs open, high-quality test sets (like “ImageNet for personas”) so everyone can measure progress fairly and improve methods over time.
- Collaborate across fields. AI researchers, social scientists, economists, and industry should work together to design safe, fair, and useful simulations.
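The calibration idea above — matching a persona pool to known single-trait percentages when the joint combinations are unknown — is closely related to a classic survey-weighting technique called raking (iterative proportional fitting). The sketch below is one possible approach under that framing, not the authors' method: it reweights personas until the weighted share of each trait value matches its target marginal.

```python
def rake_weights(personas: list, targets: dict, n_iter: int = 50) -> list:
    """Iterative proportional fitting (raking) over a persona pool.

    personas: list of dicts mapping trait -> value (e.g. {"edu": "college"})
    targets:  dict mapping trait -> {value: target population share}
    Returns one weight per persona, normalized to sum to 1, such that the
    weighted single-trait shares approach the target marginals.
    """
    n = len(personas)
    w = [1.0 / n] * n
    for _ in range(n_iter):
        for trait, target in targets.items():
            # Current weighted share of each value for this trait.
            cur = {v: 0.0 for v in target}
            for wi, p in zip(w, personas):
                cur[p[trait]] += wi
            # Rescale each persona's weight toward the target share.
            w = [
                wi * target[p[trait]] / cur[p[trait]] if cur[p[trait]] > 0 else wi
                for wi, p in zip(w, personas)
            ]
        total = sum(w)
        w = [wi / total for wi in w]
    return w
```

Raking only fixes the marginals it is given; recovering realistic *combinations* of traits (the joint distribution) is the harder open problem the authors highlight.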
To help others study this problem, the authors released about one million generated personas for public use.
Overall message: AI-made personas are powerful but risky. They can massively speed up research and testing, but without careful design and checks, they can give answers that look realistic yet quietly drift away from the real world. The promise is big — but it comes with a catch we need to fix.