Strategic Interactions between Large Language Models-based Agents in Beauty Contests
Abstract: The growing adoption of LLMs offers the potential for a deeper understanding of human behaviour within game-theoretic frameworks. Addressing the research gap in multi-player competitive games, this paper examines strategic interactions among multiple types of LLM-based agents in a classical beauty contest game. LLM-based agents demonstrate varying depths of reasoning that fall within the level-0 to level-1 range, lower than the levels observed in experiments with human subjects, but they display a similar pattern of convergence towards the Nash Equilibrium (NE) choice in repeated settings. Further, by varying the group composition of agent types, I find that an environment with lower strategic uncertainty enhances convergence for LLM-based agents, and that a mixed environment comprising LLM-based agents of differing strategic levels accelerates convergence for all. The more intelligent agents usually earn higher average payoffs, albeit at the expense of the less intelligent agents. The results from game play with simulated agents not only convey insights into potential human behaviour under the specified experimental set-ups, but also offer a valuable understanding of strategic interactions among algorithms.
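The level-k reasoning the abstract refers to can be illustrated with a minimal sketch of the p-beauty contest: players pick a number in [0, 100] and the winner is whoever is closest to p times the group mean. Assuming the standard parameterisation from the experimental literature (p = 2/3, with a level-0 player anchoring on the midpoint 50), a level-k player best-responds to a population of level-(k-1) players, so its guess is p^k × 50:

```python
# Level-k guesses in a p-beauty contest.
# Assumptions (standard in the experimental literature, not specific
# to this paper): p = 2/3, and a level-0 player guesses the midpoint 50.
# A level-k player best-responds to level-(k-1) opponents, so guesses
# shrink geometrically towards the Nash Equilibrium of 0.

def level_k_guess(k: int, p: float = 2 / 3, anchor: float = 50.0) -> float:
    """Guess of a level-k player facing a population of level-(k-1) players."""
    return (p ** k) * anchor

if __name__ == "__main__":
    for k in range(5):
        print(f"level-{k}: {level_k_guess(k):.2f}")
```

Under these assumptions, level-0 guesses 50, level-1 guesses about 33.33, and level-2 about 22.22, which is why the paper's finding that LLM-based agents sit between level-0 and level-1 corresponds to guesses between roughly 33 and 50.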