Sequential Hidden-Variable Game Theory

Updated 21 January 2026

Sequential Hidden-Variable Games are strategic decision frameworks defined by hidden types, sequential actions, and dynamic belief updates.
The methodology employs extensive-form game models, backward-forward induction, and policy gradient learning to derive Bayesian and Perfect Bayesian equilibria.
Applications range from social deduction and adversarial games to quantum contextuality tests, offering versatile insights into uncertainty in sequential decision-making.

A sequential hidden-variable game is a strategic, temporally extended decision problem in which private random variables (hidden variables or types), assigned at the outset, determine each agent's private information, and subsequent play unfolds through sequential actions embedded in a rich information structure. These models lie at the core of modern Bayesian game theory, social deduction and adversarial sequential learning problems, and quantum contextuality analyses. Their defining features are latent parameters (“hidden variables”) that influence play, sequential action in discrete or continuous time, imperfect and potentially asymmetric information, and a rigorous operational structure for belief updating and equilibrium reasoning.

1. Formal Definitions and Structural Framework

A canonical sequential hidden-variable game is specified by a tuple

$G = \left(N,\,\Omega,\,H,\,P,\,f_c,\,\{I_i\},\,\{A(h)\},\,\{u_i\}\right)$

where $N$ is the player set; $\Omega$ the finite set of possible private type profiles ("hidden variables") $\omega=(\omega_1,\dots,\omega_N)$ assigned by Nature at the root; $f_c$ the probability law over $\Omega$; $H$ the set of histories (including sequential player actions); $P$ the player function specifying who moves at each history; $I_i$ player $i$'s partition of decision histories into information sets (structured by private type and public state); $A(h)$ the action set at history $h$; and $u_i$ the utility map at terminal histories. Nature's initial move $\omega\sim f_c$ generates all the informational heterogeneity. After this move, actions are sequential, and each player observes only the public state and their own private type component $\omega_i$ (Kovařík et al., 2021).

This structure encompasses:

Extensive-form games with chance only at the root (so-called "sequential Bayesian games").
Rich sequential social deduction models, as in Hidden Agenda (Kopparapu et al., 2022), where roles are sampled privately and long-run play involves spatial, temporal, and procedural information constraints.
Sequential quantum or contextuality games, in which preparation, measurement, and adaptive protocols lead to temporal hidden-variable dependencies (Vallée et al., 17 Sep 2025, Nomura et al., 14 Jan 2026).
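The tuple above can be sketched as a small data structure. This is only an illustrative sketch: the class name `HiddenVariableGame`, its field names, and the toy betting game are hypothetical and do not come from the cited works. It assumes that histories record public actions only, so that an information set is identified by a player's own type component together with the public history.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

TypeProfile = Tuple[str, ...]  # omega = (omega_1, ..., omega_N)
History = Tuple[str, ...]      # public actions taken after Nature's move


@dataclass(frozen=True)
class HiddenVariableGame:
    """Hypothetical container mirroring (N, Omega, H, P, f_c, {I_i}, {A(h)}, {u_i})."""
    players: Tuple[int, ...]                       # N
    prior: Dict[TypeProfile, float]                # f_c over Omega
    player_to_move: Callable[[History], int]       # P (non-terminal histories)
    actions: Callable[[History], Tuple[str, ...]]  # A(h); empty tuple means terminal
    utility: Callable[[TypeProfile, History], Tuple[float, ...]]  # (u_i)_i

    def information_set(self, i: int, omega: TypeProfile, h: History):
        # Player i cannot distinguish type profiles that agree on omega_i
        # and share the same public history.
        return (i, omega[i], h)


def toy_betting_game() -> HiddenVariableGame:
    # Assumed toy example: player 0 is privately "strong" or "weak",
    # player 1 has a single uninformative type "blank".
    prior = {("strong", "blank"): 0.5, ("weak", "blank"): 0.5}

    def player_to_move(h: History) -> int:
        return len(h)  # player 0 moves first, then player 1

    def actions(h: History) -> Tuple[str, ...]:
        if len(h) == 0:
            return ("bet", "check")
        if h == ("bet",):
            return ("call", "fold")
        return ()  # terminal history

    def utility(omega: TypeProfile, h: History) -> Tuple[float, ...]:
        sign = 1.0 if omega[0] == "strong" else -1.0
        if h == ("check",):
            u0 = sign          # small showdown
        elif h == ("bet", "fold"):
            u0 = 1.0           # player 1 concedes
        else:                  # ("bet", "call"): large showdown
            u0 = 2.0 * sign
        return (u0, -u0)       # zero-sum payoffs

    return HiddenVariableGame((0, 1), prior, player_to_move, actions, utility)


game = toy_betting_game()
```

In this sketch player 1's information set after a bet is the same whichever type Nature assigned to player 0, which is exactly the asymmetry created by the root chance move.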

2. Information and Belief Updating

The sequential nature of hidden-variable games imposes a belief-dependent, history-sensitive structure on players' strategic reasoning. Each player's information at stage $t$ typically comprises:

A (partially observed) public history, e.g., action records, vote tallies, observed signals.
Their own private type $\omega_i$ (or a more general sequence of private observations).
Sometimes, derived or compressed statistics (such as sufficient statistics $S^i_t$, or local posteriors over states).

Belief state evolution is central. In repeated Bayesian games (with static or dynamic hidden variables), players update common and private beliefs by recursive filtering:

$\pi_{t+1}(x) = \frac{\pi_t(x)\prod_{i=1}^N\gamma_t^i(a_t^i \mid x^i)}{\sum_{x'}\pi_t(x')\prod_{i=1}^N\gamma_t^i(a_t^i \mid x'^i)}$

where the prescription profile $\gamma_t = (\gamma_t^1,\dots,\gamma_t^N)$ specifies type-contingent strategies (Vasal, 2018, Ouyang et al., 2023). In deep RL settings, recurrent policy networks (e.g., LSTM-based) are trained to align internal representations with Bayesian posterior inference on hidden roles, as in Hidden Agenda (Kopparapu et al., 2022).
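The recursive filtering update for $\pi_{t+1}$ can be sketched in a few lines of Python. This is a minimal sketch assuming finite type sets and static types; the function name `update_common_belief`, the dictionary layout, and the two-player bidding example are illustrative assumptions, not taken from the cited papers.

```python
from math import prod


def update_common_belief(pi, gammas, actions):
    """One step of the common-belief recursion for static types.

    pi:      dict mapping a type profile x = (x^1, ..., x^N) to pi_t(x)
    gammas:  gammas[i][x_i][a] = gamma_t^i(a | x^i), player i's prescription
    actions: observed action profile (a_t^1, ..., a_t^N)
    """
    # Numerator: prior weight times the likelihood of each observed action.
    unnormalized = {
        x: p * prod(gammas[i][x[i]][a_i] for i, a_i in enumerate(actions))
        for x, p in pi.items()
    }
    total = sum(unnormalized.values())  # denominator of the update
    if total == 0.0:
        raise ValueError("observed actions have zero probability under pi and gamma")
    return {x: w / total for x, w in unnormalized.items()}


# Assumed example: two players, each with type "H" or "L", uniform prior.
prior = {(x0, x1): 0.25 for x0 in "HL" for x1 in "HL"}
gammas = [
    {"H": {"bid": 0.9, "pass": 0.1}, "L": {"bid": 0.2, "pass": 0.8}},  # informative
    {"H": {"bid": 0.5, "pass": 0.5}, "L": {"bid": 0.5, "pass": 0.5}},  # uninformative
]
posterior = update_common_belief(prior, gammas, ("bid", "pass"))
prob_player0_high = posterior[("H", "H")] + posterior[("H", "L")]
```

Here player 0's bid shifts the common belief that player 0 has type "H" from 1/2 to 9/11, while player 1's type-independent prescription leaves the belief about player 1 unchanged.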

In contextuality-testing scenarios, hidden-variable models incorporate explicit state-update kernels reflecting how sequential measurements change the underlying ontic (hidden) state, and belief updating must account for this induced memory across rounds (Vallée et al., 17 Sep 2025).

3. Representative Example Classes

Several paradigmatic instances clarify the structural diversity of sequential hidden-variable games:

Social deduction games: Hidden Agenda formalizes a two-team environment (Crewmates vs. Impostor) with roles $\theta_i\in\{C,I\}$, complex state $s_t$ (spatial, procedural, voting), and transitions determined by sequential action and observation, under private role uncertainty (Kopparapu et al., 2022).
Adversarial last-success games: In the adversarial Last-Success-Problem, two players act in turns over a sequence of Bernoulli trials $I_k$, with victory depending on the outcome of the final trial and on who holds the turn; dynamic programming yields an optimal recursive threshold policy (Ribas, 2018).
Quantum/contextuality scenarios: Vallée & Markham provide a sequential scenario composed of sequences of instruments ($A_x, B_y$), each with a probabilistic response and a state update, producing empirical data $P(a_1, \dots, a_n \mid S)$ that deterministic, non-disturbing hidden-variable models can explain only if no non-contextuality inequality is violated (Vallée et al., 17 Sep 2025, Nomura et al., 14 Jan 2026).
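To illustrate how dynamic programming produces a threshold policy in last-success problems, the sketch below solves the classical single-player version by backward induction: a decision maker observes independent Bernoulli trials one at a time and wins by stopping on the last success. This is an assumption-laden stand-in, not the two-player adversarial game of Ribas (2018), whose turn-holding rules are not specified here; the function name `last_success_policy` is hypothetical.

```python
def last_success_policy(p):
    """Backward induction for the classical last-success problem.

    p[k] is the success probability of trial k (0-based, independent trials).
    Returns (win probability under optimal play, list of stop decisions),
    where stop[k] says whether to stop when trial k is a success.
    """
    n = len(p)
    # tail_fail[k]: probability that every trial after k fails, i.e. the
    # win probability from stopping on a success at trial k.
    tail_fail = [1.0] * n
    for k in range(n - 2, -1, -1):
        tail_fail[k] = tail_fail[k + 1] * (1.0 - p[k + 1])

    # value[k]: optimal win probability just before trial k is observed.
    value = [0.0] * (n + 1)
    stop = [False] * n
    for k in range(n - 1, -1, -1):
        # On a success, compare stopping now with continuing.
        stop[k] = tail_fail[k] >= value[k + 1]
        on_success = tail_fail[k] if stop[k] else value[k + 1]
        value[k] = p[k] * on_success + (1.0 - p[k]) * value[k + 1]
    return value[0], stop


# Assumed example: p_k = 1/k for k = 1, 2, 3 (the three-candidate secretary problem).
win_prob, stop = last_success_policy([1.0, 1.0 / 2.0, 1.0 / 3.0])
```

In this example the computed rule skips a success at the first trial and stops at the first success from the second trial onward, which is a threshold policy; the adversarial variant would replace the single value recursion with one that tracks which player holds the turn.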

4. Equilibrium Concepts and Computational Methodologies

The primary solution concepts are Bayesian Nash equilibrium (BNE), Perfect Bayesian Equilibrium (PBE), and their refinements. In dynamic settings, backward–forward induction is deployed to compute equilibrium profiles, leveraging common-belief recursions:

Structured PBE (SPBE): Policy profiles depend only on the common belief $\pi_t$ and the private type $x^i$, so that the computational complexity is linear in the time horizon $T$ (Vasal, 2018).
Sufficient-Information-Based BNE: Strategies and beliefs are compressible and sequentially computable in terms of sufficient private statistics and common beliefs; this restriction preserves closedness under best responses and supports equilibrium construction via dynamic programming (Ouyang et al., 2023).

In reinforcement learning approaches to games like Hidden Agenda, gradient-based joint training yields empirically stable learned equilibria: distinct basins of attraction in policy space where tactical specialization (e.g., vote precision, group partnering, aggressive freezing) emerges and persists (Kopparapu et al., 2022).

Mean-field methods have been introduced in continuous-agent, sequential testing of hidden states, entailing coupled HJB and Fokker–Planck systems and fixed-point computation for population stopping time distributions (Campbell et al., 2024).

5. Signaling, Information Transmission, and Emergent Behavior

A hallmark of sequential hidden-variable games is strategic signaling: players' choices are shaped not only to maximize immediate reward, but also to manipulate the common belief or other agents’ inferences about hidden types. In SPBE, an agent will select actions that differentially reveal type information, thereby shaping future beliefs and continuation values (Vasal, 2018). In social deduction environments, agents may employ dynamic vote splitting and spatial “buddying” precisely to enhance collective identification of the adversarial type (Kopparapu et al., 2022).

Non-classical information transmission arises in sequential contextuality games, where state-update kernels $\Gamma_A$ allow an agent to encode memory into the ontic state, and empirical violation of non-disturbance or non-contextuality inequalities certifies the impossibility of classical hidden-variable explanations (Vallée et al., 17 Sep 2025).

6. Applications and Algorithmic Implications

Algorithms for solving extensive-form games, such as public-state Counterfactual Regret Minimization (PS-CFR), exploit the sequential hidden-variable structure, in particular the efficient factorization of information sets and the compressibility of belief states (Kovařík et al., 2021). Statistical experiments testing local realism (e.g., Bell-CHSH protocols) can also be recast as sequential hidden-variable games: game-theoretic probability constructs yield explicit operational betting strategies that jointly test empirical reproduction of quantum statistics and independence-from-setting, and local hidden-variable models are refuted if and only if both capital processes cannot be stabilized (Nomura et al., 14 Jan 2026).
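As background for the Bell-CHSH protocols mentioned above, the classical bound that such tests probe can be checked by brute force. The sketch below enumerates every deterministic local hidden-variable assignment for two parties with two binary-outcome settings each and confirms that the CHSH expression never exceeds 2 in absolute value. It illustrates only the local bound and is not the betting-strategy construction of the cited work; the function name `chsh_value` is an assumption.

```python
from itertools import product


def chsh_value(a, b):
    """CHSH expression for one deterministic local assignment.

    a[x] and b[y] are the predetermined outcomes (+1 or -1) that the hidden
    variable fixes for settings x, y in {0, 1} on the two sides.
    """
    return a[0] * b[0] + a[0] * b[1] + a[1] * b[0] - a[1] * b[1]


# All 4 x 4 = 16 deterministic assignments of outcomes to settings.
assignments = [
    (a, b)
    for a in product((-1, 1), repeat=2)
    for b in product((-1, 1), repeat=2)
]
local_bound = max(abs(chsh_value(a, b)) for a, b in assignments)
```

Because any local hidden-variable model is a probabilistic mixture of these deterministic assignments, its CHSH value is bounded by the same constant, whereas quantum correlations can reach $2\sqrt{2}$; observing a value above the local bound is what a sequential test accumulates evidence for.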

The sequential hidden-variable framework underpins a broad spectrum of settings: social deduction, adversarial sequential learning, quantum measurement scenarios, and large-scale imperfect-information games. Its core is a rigorous synthesis of sequential action, belief-dependent information flow, and strategic adaptation to uncertainty in hidden variables.