Do Large Language Models Learn Human-Like Strategic Preferences?

Published 11 Apr 2024 in cs.GT and cs.AI (arXiv:2404.08710v2)

Abstract: In this paper, we evaluate whether LLMs learn to make human-like preference judgments in strategic scenarios, as compared with known empirical results. Solar and Mistral are shown to exhibit stable value-based preferences consistent with humans, including human-like preferences for cooperation in the prisoner's dilemma (with its stake-size effect) and the traveler's dilemma (with its penalty-size effect). We establish a relationship between model size, value-based preference, and superficiality. Our results further show that the less brittle models tend to rely on sliding-window attention, suggesting a potential link. Additionally, we contribute a novel method for constructing preference relations from arbitrary LLMs, as well as support for a hypothesis regarding human behavior in the traveler's dilemma.
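As background for the penalty-size effect mentioned above, the standard payoff rule of the traveler's dilemma (Basu, 1994) can be sketched as follows. This is a minimal illustration of the game, not the authors' experimental code; the parameter names are hypothetical:

```python
def travelers_dilemma_payoff(my_claim, other_claim, penalty=2):
    """Payoff for one traveler in Basu's (1994) traveler's dilemma.

    Both players claim an amount (classically between 2 and 100).
    Each is paid the lower of the two claims; the lower claimant
    additionally receives a bonus of `penalty`, while the higher
    claimant pays a deduction of `penalty`.
    """
    low = min(my_claim, other_claim)
    if my_claim < other_claim:
        return low + penalty   # reward for the lower claim
    if my_claim > other_claim:
        return low - penalty   # deduction for the higher claim
    return low                 # equal claims: both are paid the claim

# Undercutting by one always pays slightly more, which drives the
# Nash equilibrium down to the minimum claim; varying `penalty` is
# what the penalty-size effect refers to.
print(travelers_dilemma_payoff(100, 100))  # 100
print(travelers_dilemma_payoff(99, 100))   # 101
print(travelers_dilemma_payoff(100, 99))   # 97
```

Empirically, humans (and, per the paper, some LLMs) claim well above the equilibrium when the penalty is small, and move toward it as the penalty grows.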
