- The paper presents a novel Game Intelligence (GI) mechanism that assigns intelligence scores to players based on game outcomes and machine evaluations.
- It leverages an extensive dataset of over a billion chess moves to empirically validate the intelligence measurement framework.
- The research introduces gamingproofness, a property that protects intelligence scores from manipulation and paves the way for improved AI agent design in complex competitive settings.
An Overview of Intelligence Measurement in n-Person Games with Partial Knowledge
The paper "Human and Machine Intelligence in n-Person Games with Partial Knowledge: Theory and Computation" presents a meticulous formalization of intelligence measurement within the context of games, introducing novel mechanisms to evaluate strategic ability. The author's focus is on assigning a real number, an intelligence score, to each participant in a game, drawing insights from empirically observable data such as player actions, game outcomes, and player strength. This measurement framework further references the performance of oracle machines, such as chess engines, to benchmark player intelligence.
Core Concepts
The paper introduces two central constructs for intelligence evaluation: the Game Intelligence (GI) mechanism and gamingproofness. The GI mechanism quantitatively evaluates a player's intelligence by considering both the game's outcome and the player's mistakes, measured with respect to a reference machine. Using an extensive dataset of over a billion chess moves, including games by world-renowned grandmasters, the author illustrates the GI mechanism's efficacy. Remarkably, Magnus Carlsen achieves the highest GI score in the world championship games dataset, while the chess engine Stockfish tops the machine-vs-machine category.
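The paper's precise GI formula is not reproduced in this overview, but its general shape, outcome points combined with a penalty for machine-measured mistakes, can be sketched. The following is a minimal Python illustration; the function name, the weights, and the use of average centipawn loss are assumptions made for exposition, not the paper's definition.

```python
# Illustrative GI-style scorer, NOT the paper's exact formula.
# Assumption: the score combines game points with a penalty for the
# average evaluation loss per move, as judged by a reference engine.

def game_intelligence(outcome: float, move_losses: list[float],
                      outcome_weight: float = 1.0,
                      mistake_weight: float = 0.01) -> float:
    """outcome: game points (1 win, 0.5 draw, 0 loss).
    move_losses: per-move centipawn loss versus the engine's best move."""
    if not move_losses:
        return outcome_weight * outcome
    avg_loss = sum(move_losses) / len(move_losses)
    return outcome_weight * outcome - mistake_weight * avg_loss

# Example: a drawn game played with an average loss of 15 centipawns.
print(game_intelligence(0.5, [10, 20, 15]))  # 0.35
```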
Theoretical Foundations and Computational Results
From a theoretical standpoint, the paper proves the existence of maximal intelligence plays within n-person games with partial knowledge, ensuring that players can attain the highest intelligence score when guided by a particular mechanism. In practice, however, the result has limited applicability, since players lack real-time access to machines during gameplay.
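In our own notation (the paper's symbols are not reproduced here), the existence claim amounts to saying that the intelligence score attains its maximum over the plays available to a player:

```latex
% Hedged rendering of the existence result; the notation is ours, not the paper's.
% S_i: the plays available to player i; GI_i: the score the mechanism assigns.
\exists\, s_i^{*} \in S_i \quad \text{such that} \quad
GI_i\bigl(s_i^{*}, s_{-i}\bigr) = \max_{s_i \in S_i} GI_i\bigl(s_i, s_{-i}\bigr).
```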
The research explores the notion of machine dynamic consistency, which asks whether a machine's evaluation of a position should remain unchanged before and after a move is played. While theoretically appealing, practical limitations mean this consistency is not always achievable. Furthermore, consistency within mechanisms, meaning the alignment of attributed intelligence scores with both game outcomes and machine evaluations, is established as a necessary property for assessing player intelligence.
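Dynamic consistency is easy to probe empirically. The sketch below, using the python-chess library with a locally installed UCI engine such as Stockfish, compares an engine's evaluation of a position with its evaluation after its own recommended move is played; the depth and tolerance are our own illustrative choices, not values from the paper.

```python
import chess
import chess.engine

def check_dynamic_consistency(fen: str, engine_path: str = "stockfish",
                              depth: int = 18, tolerance_cp: int = 20) -> bool:
    """Return True if the engine's evaluation before its best move and
    after playing that move agree to within tolerance_cp centipawns.
    Exact agreement is what dynamic consistency would require."""
    board = chess.Board(fen)
    with chess.engine.SimpleEngine.popen_uci(engine_path) as engine:
        before = engine.analyse(board, chess.engine.Limit(depth=depth))
        best_move = before["pv"][0]
        score_before = before["score"].pov(board.turn).score(mate_score=100000)
        board.push(best_move)
        after = engine.analyse(board, chess.engine.Limit(depth=depth))
        # Evaluate from the point of view of the side that just moved.
        score_after = after["score"].pov(not board.turn).score(mate_score=100000)
    return abs(score_before - score_after) <= tolerance_cp

# Example: probe the starting position.
print(check_dynamic_consistency(chess.STARTING_FEN))
```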
Additionally, the paper introduces gamingproofness, a concept paralleling strategyproofness, which ensures that intelligence scores cannot be inflated through deliberate suboptimal play. This guarantees the reliability of the assessment mechanism, contingent on the player's evaluation surpassing their estimate of the machine's intelligence.
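As a toy illustration, reusing the hypothetical game_intelligence scorer sketched earlier (the numbers are made up), deliberately weaker play both worsens the outcome and raises the measured losses, so it cannot inflate the score:

```python
# Toy gamingproofness check; numbers are hypothetical, and the scorer is
# the illustrative game_intelligence function from the earlier sketch.
honest = game_intelligence(outcome=1.0, move_losses=[5, 10, 5])
# Deliberate suboptimal play: a worse outcome and larger per-move losses.
manipulated = game_intelligence(outcome=0.5, move_losses=[80, 120, 60])
assert manipulated <= honest  # sandbagging does not raise the score
print(honest, manipulated)    # ~0.93 vs ~-0.37
```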
Practical Implications and Future Directions
The implications of this research span both theoretical and practical domains. The GI mechanism offers a refined lens for understanding intelligence within games, applicable not only to humans but also to machines themselves. This dual applicability could inform future AI development and training, as AI systems could leverage GI scores as feedback richer than binary game outcomes.
As the research suggests, caution is warranted when applying the GI mechanism to games or sports where AI has not yet reached a competitive level, so that intelligence scores remain valid and reflective of actual player capabilities. Moreover, the GI framework could pave the way for novel AI agent architectures, using intelligence scores as metrics to optimize performance in complex environments.
The comprehensive nature of this paper—with its rigorous theoretical exploration and robust empirical analysis—lays a foundation for future research in AI and game theory, offering pathways toward more nuanced understandings of intelligence within competitive settings.