
Strategic Interactions between Large Language Models-based Agents in Beauty Contests

Published 12 Apr 2024 in econ.GN, physics.soc-ph, and q-fin.EC | arXiv:2404.08492v2

Abstract: The growing adoption of LLMs presents potential for a deeper understanding of human behaviours within game theory frameworks. Addressing the research gap on multi-player competitive games, this paper examines the strategic interactions among multiple types of LLM-based agents in a classical beauty contest game. LLM-based agents demonstrate varying depths of reasoning that fall within a range of level-0 to level-1, lower than experimental results with human subjects, but they display a similar convergence pattern towards the Nash Equilibrium (NE) choice in a repeated setting. Further, by varying the group composition of agent types, I found that an environment with lower strategic uncertainty enhances convergence for LLM-based agents, and that a mixed environment comprising LLM-based agents of differing strategic levels accelerates convergence for all. Higher average payoffs for the more intelligent agents are usually observed, albeit at the expense of less intelligent agents. The results from game play with simulated agents not only convey insights on potential human behaviours under specified experimental set-ups but also offer a valuable understanding of strategic interactions among algorithms.
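The game mechanics behind the abstract can be made concrete with a minimal sketch of a p-beauty contest played by level-k reasoners. The parameter values here (p = 2/3, choices in [0, 100], a level-0 anchor of 50) are standard in the experimental literature, not taken from this paper; the paper's agents and prompts are more elaborate.

```python
# Minimal sketch of a p-beauty contest with level-k agents.
# Assumptions (not from the paper): p = 2/3, choices in [0, 100],
# level-0 anchors on the midpoint 50; a level-k agent best responds
# to a population of level-(k-1) agents, so its choice is 50 * p**k.

P = 2 / 3

def level_k_choice(k: int, anchor: float = 50.0, p: float = P) -> float:
    """Choice of a level-k reasoner: p^k times the level-0 anchor."""
    return anchor * p ** k

def winner(choices: list[float], p: float = P) -> int:
    """Index of the choice closest to p times the group mean."""
    target = p * sum(choices) / len(choices)
    return min(range(len(choices)), key=lambda i: abs(choices[i] - target))

# A mixed group of level-0, level-1, and level-2 agents:
choices = [level_k_choice(k) for k in (0, 1, 2)]
# choices ≈ [50.0, 33.3, 22.2]; the deepest reasoner wins this one-shot game,
# and as k grows the choice shrinks toward the NE of 0 — the convergence
# pattern the abstract describes for repeated play.
```

Iterating `level_k_choice` for larger k shows why repeated play drives choices to zero: each extra level of reasoning multiplies the anchor by p, so the unique NE is everyone choosing 0.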
