Large Language Models are Biased Reinforcement Learners
Abstract: In-context learning enables LLMs to perform a variety of tasks, including learning to make reward-maximizing choices in simple bandit tasks. Given their potential use as (autonomous) decision-making agents, it is important to understand how these models perform such reinforcement learning (RL) tasks and the extent to which they are susceptible to biases. Motivated by the well-documented finding that, in humans, the value of an outcome depends on how it compares with other local outcomes, the present study examines whether similar value-encoding biases shape how LLMs encode rewarding outcomes. Results from experiments with multiple bandit tasks and models show that LLMs exhibit behavioral signatures of a relative value bias. Adding explicit outcome comparisons to the prompt produces opposing effects on performance: it enhances maximization within trained choice sets but impairs generalization to new choice sets. Computational cognitive modeling reveals that LLM behavior is well described by a simple RL algorithm that incorporates relative values at the outcome-encoding stage. Lastly, we present preliminary evidence that the observed biases are not limited to fine-tuned LLMs, and that relative value processing is detectable in the final hidden-layer activations of a raw, pretrained model. These findings have important implications for the use of LLMs in decision-making applications.
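The abstract does not specify the model's functional form, so the following is only a rough illustration of what "incorporating relative values at the outcome-encoding stage" can mean: a minimal Rescorla-Wagner-style learner in which the encoded outcome mixes the absolute reward with its deviation from the context mean. The class name, the mixing weight `omega`, the softmax temperature `beta`, and the mean-centered comparison (which assumes full feedback, i.e., the outcomes of all options in the choice context are observed) are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

class RelativeValueQLearner:
    """Sketch of a Q-learner with (partially) relative outcome encoding.

    omega = 0 recovers a standard absolute-value learner;
    omega = 1 encodes outcomes purely relative to the context mean.
    """

    def __init__(self, n_options, alpha=0.1, omega=0.5, beta=5.0):
        self.q = np.zeros(n_options)  # learned option values
        self.alpha = alpha            # learning rate
        self.omega = omega            # weight on the relative component
        self.beta = beta              # softmax inverse temperature

    def choose(self):
        # Softmax choice over current values (max-subtracted for stability)
        logits = self.beta * self.q
        p = np.exp(logits - logits.max())
        p /= p.sum()
        return np.random.choice(len(self.q), p=p)

    def update(self, chosen, outcomes):
        # `outcomes` holds the rewards of all options in this choice context
        # (full-feedback assumption); the chosen outcome is encoded as a
        # mixture of its absolute value and its value relative to the
        # context mean, then used in a standard delta-rule update.
        r_abs = outcomes[chosen]
        r_rel = r_abs - np.mean(outcomes)
        r_encoded = (1 - self.omega) * r_abs + self.omega * r_rel
        self.q[chosen] += self.alpha * (r_encoded - self.q[chosen])

# Hypothetical usage: a two-armed bandit with mean payoffs 0.7 and 0.3.
agent = RelativeValueQLearner(n_options=2)
for _ in range(200):
    c = agent.choose()
    outcomes = np.random.normal([0.7, 0.3], 0.1)
    agent.update(c, outcomes)
print(agent.q)  # values reflect each option's context-relative standing
```

Under this kind of encoding, an option's learned value depends on the other outcomes it was paired with during training, which is one way a learner can maximize well within a trained choice set yet mis-rank options when they are recombined into new sets.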