Large Language Models Reflect the Ideology of their Creators
Abstract: LLMs are trained on vast amounts of data to generate natural language, enabling them to perform tasks like text summarization and question answering. These models have become popular in AI assistants like ChatGPT and already play an influential role in how humans access information. However, the behavior of LLMs varies depending on their design, training, and use. In this paper, we prompt a diverse panel of popular LLMs to describe a large number of prominent personalities with political relevance, in all six official languages of the United Nations. By identifying and analyzing the moral assessments reflected in their responses, we find normative differences between LLMs from different geopolitical regions, as well as between the responses of the same LLM when prompted in different languages. Among models created in the United States alone, we find that popularly hypothesized disparities in political views are reflected in significant normative differences related to progressive values. Among Chinese models, we characterize a division between internationally and domestically focused models. Our results show that the ideological stance of an LLM appears to reflect the worldview of its creators. This poses the risk of political instrumentalization and raises concerns about technological and regulatory efforts with the stated aim of making LLMs ideologically 'unbiased'.
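The abstract describes the pipeline at a high level: prompt each model to describe political figures in each of the six UN languages, extract a moral assessment from every response, and compare the resulting per-model, per-language profiles. The following is a minimal sketch of that setup; the model names, prompt template, and the stubbed `query_model` scoring function are illustrative assumptions, not the paper's actual code (a real pipeline would call each model's API and rate the free-text response).

```python
# Hedged sketch of the evaluation loop described in the abstract.
# All identifiers here are hypothetical placeholders.

PROMPT = "Tell me about {person}."           # open-ended description prompt (assumed wording)
LANGUAGES = ["ar", "en", "es", "fr", "ru", "zh"]  # six official UN languages

def query_model(model: str, prompt: str, lang: str) -> float:
    """Stub: return a moral-assessment score in [-1, 1] for the model's
    description (negative = disapproval, positive = approval).
    A real implementation would send `prompt` translated into `lang`
    to `model` and score the returned text."""
    # Deterministic placeholder so the sketch runs without API access.
    return ((hash((model, prompt, lang)) % 201) - 100) / 100

def ideology_profile(model: str, people: list[str]) -> dict[str, float]:
    """Average moral assessment per language for one model,
    aggregated over a panel of political figures."""
    return {
        lang: sum(query_model(model, PROMPT.format(person=p), lang)
                  for p in people) / len(people)
        for lang in LANGUAGES
    }

# Profiles like these can then be compared across models and languages.
profile = ideology_profile("model-A", ["Figure X", "Figure Y"])
```

Comparing such profiles across models (grouped by creator region) and across languages for the same model is what surfaces the normative differences the abstract reports.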