Multi-Agent Epistemic Planning & Strategic Reasoning

Updated 14 February 2026

Multi-Agent Epistemic Planning and Strategic Reasoning is a field that integrates formal epistemic logics, dynamic action models, and strategic decision-making for multi-agent environments.
It employs frameworks such as DEL-based planning, bounded modal depth, and learned heuristics to manage complex nested beliefs and ensure efficient state updates.
The approach enables practical applications in security, robotics, economics, and human–AI systems by supporting coordination, deception, and coalition-based strategies.

Multi-agent epistemic planning and strategic reasoning jointly address the problem of synthesizing action sequences or policies in complex, multi-agent environments, where agents must reason not only about the physical world but also about their own and others’ knowledge, beliefs, observation capabilities, biases, and strategic objectives. This topic is at the intersection of automated planning, dynamic epistemic logic (DEL), knowledge representation, and game theory. The resulting frameworks underpin coordination, deception, secrecy, coalition formation, coalition-proof planning, and robust strategic behavior across diverse domains such as security, communication, robotics, economics, and multi-agent human–AI systems.

1. Formal Foundations: Epistemic Logics and Planning Problem Specification

The formal landscape of multi-agent epistemic planning is defined primarily by modal logics for representing knowledge and belief, most often leveraging Kripke structures with agent-indexed accessibility relations (Bolander, 2017, Fabiano, 2021). Propositional atoms represent the world state, while the modal operators $K_i$ (knowledge) and $B_i$ (belief), together with their group versions $E_G$ (everyone knows) and $C_G$ (common knowledge), encode higher-order informational states.

A Kripke structure $M=(S, \{R_i\}_{i\in \mathcal{A}}, V)$ consists of a (finite) set of worlds $S$, agent-indexed relations $R_i$, and a valuation $V:S\to 2^P$ for propositional atoms. Satisfaction for modal formulae is recursively defined, e.g., $M,s \models K_i \varphi$ iff $\forall t: s R_i t \implies M,t \models \varphi$.
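The satisfaction clause above can be sketched as a small recursive model checker. This is a minimal illustration, not any planner's actual implementation: the tuple encoding of formulas, the `Kripke` class, and the two-world example are all assumptions made for this sketch. Common knowledge is evaluated in the standard way, as truth at every world reachable in one or more steps along the union of the group's relations.

```python
from dataclasses import dataclass


@dataclass
class Kripke:
    worlds: set   # S
    rel: dict     # agent -> set of (s, t) pairs, i.e. R_i
    val: dict     # world -> set of atoms true there, i.e. V

    def successors(self, agent, s):
        """Worlds t with s R_agent t."""
        return {t for (u, t) in self.rel.get(agent, set()) if u == s}


def holds(m, s, f):
    """Return True iff M, s |= f for a tuple-encoded formula f (hypothetical encoding)."""
    op = f[0]
    if op == "atom":
        return f[1] in m.val[s]
    if op == "not":
        return not holds(m, s, f[1])
    if op == "and":
        return holds(m, s, f[1]) and holds(m, s, f[2])
    if op == "K":  # ("K", agent, phi): phi holds in every R_agent-successor of s
        return all(holds(m, t, f[2]) for t in m.successors(f[1], s))
    if op == "C":  # ("C", group, phi): phi holds at every world the group can reach
        frontier, reached = {s}, set()
        while frontier:
            u = frontier.pop()
            for agent in f[1]:
                for t in m.successors(agent, u):
                    if t not in reached:
                        reached.add(t)
                        frontier.add(t)
        return all(holds(m, t, f[2]) for t in reached)
    raise ValueError(f"unknown operator: {op!r}")


# Two worlds: p is true in w1 only. Agent a can tell them apart, agent b cannot.
everything = {(s, t) for s in ("w1", "w2") for t in ("w1", "w2")}
m = Kripke(
    worlds={"w1", "w2"},
    rel={"a": {("w1", "w1"), ("w2", "w2")}, "b": everything},
    val={"w1": {"p"}, "w2": set()},
)
p = ("atom", "p")
a_knows_p = holds(m, "w1", ("K", "a", p))
b_knows_p = holds(m, "w1", ("K", "b", p))
b_knows_a_knows_p = holds(m, "w1", ("K", "b", ("K", "a", p)))
common_p = holds(m, "w1", ("C", ("a", "b"), p))
```

In this example only `a_knows_p` is true: agent `b` considers `w2` possible, where `p` fails, so neither the nested formula nor common knowledge of `p` holds at `w1`.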

Planning problems are specified as tuples $(S_0, G, \mathit{Acts})$, comprising an initial epistemic state $S_0$, a goal formula $G$ (often involving nested epistemic operators), and a set $\mathit{Acts}$ of epistemic actions (Fabiano, 2021, Fabiano, 2019). Actions are standardly represented via event models in DEL, detailed below.

Recent frameworks (e.g., the Predictive Justified Perspective Model) extend state signatures to incorporate processual or dynamic variables with potentially infinite domains and predictive components for handling non-static, evolving environments (Li et al., 2024).

2. Action Models and State Evolution in Dynamic Epistemic Logic

Action models in Dynamic Epistemic Logic (DEL) generalize classical planning operators to encompass both ontic (physical) and epistemic (information-changing) effects. An event model $E=(E, \{S_i\}_{i\in\mathcal{A}}, \mathit{pre}, \mathit{post})$ comprises a set of events (also written $E$), the agents’ indistinguishability relations $S_i$ over those events, and precondition and postcondition maps. Actions specify which agents observe (and thus update from) which events, enabling the modeling of public, private, and semi-private announcements, as well as misinformation and disinformation (Bolander, 2017, Fabiano, 2021, Fabiano, 2019).

The product update operation combines an epistemic state $(M, W_d)$, a Kripke structure with designated worlds $W_d$, and an event model $(E, E_d)$ with designated events $E_d$ to yield a new epistemic state that reflects both the evolution of the world and the informational impact on agents’ knowledge and belief relations. This operation is fundamental to all DEL-based epistemic planning algorithms (Bolander, 2017, Fabiano, 2021).
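As a rough illustration of the product update, the sketch below follows the textbook DEL construction: a new world is a pair (world, event) whose precondition is satisfied, two pairs are related for an agent when both components are, and postconditions overwrite atom values. The function name `product_update`, the restriction of preconditions to propositional tests on the atoms of a world, and the coin example are simplifying assumptions of this sketch, not part of any cited planner.

```python
def product_update(worlds, rel, val, designated,
                   events, erel, pre, post, edesignated):
    """Sketch of (M, W_d) x (E, E_d); preconditions are propositional tests (assumption)."""
    # Keep a pair (s, e) only if world s satisfies the precondition of event e.
    new_worlds = {(s, e) for s in worlds for e in events if pre[e](val[s])}
    # (s, e) R_i (t, f) iff s R_i t in the model and i cannot distinguish e from f.
    new_rel = {
        agent: {
            ((s, e), (t, f))
            for (s, e) in new_worlds
            for (t, f) in new_worlds
            if (s, t) in rel[agent] and (e, f) in erel[agent]
        }
        for agent in rel
    }
    # Postconditions map atoms to truth values; unmentioned atoms are inherited.
    new_val = {}
    for (s, e) in new_worlds:
        atoms = set(val[s])
        for atom, value in post[e].items():
            if value:
                atoms.add(atom)
            else:
                atoms.discard(atom)
        new_val[(s, e)] = atoms
    new_designated = {(s, e) for (s, e) in new_worlds
                      if s in designated and e in edesignated}
    return new_worlds, new_rel, new_val, new_designated


# Coin example: neither agent knows the coin's side; then a looks while b watches a look.
sides = {"h", "t"}
full = {(s, t) for s in sides for t in sides}
ev = {"see_h", "see_t"}
worlds2, rel2, val2, des2 = product_update(
    worlds=sides,
    rel={"a": full, "b": full},
    val={"h": {"heads"}, "t": set()},
    designated={"h"},
    events=ev,
    erel={"a": {("see_h", "see_h"), ("see_t", "see_t")},   # a distinguishes the events
          "b": {(e, f) for e in ev for f in ev}},           # b does not
    pre={"see_h": lambda atoms: "heads" in atoms,
         "see_t": lambda atoms: "heads" not in atoms},
    post={"see_h": {}, "see_t": {}},                        # purely epistemic action
    edesignated={"see_h"},
)
```

After the update, agent `a` has only reflexive edges and therefore knows the coin's side, while agent `b` still connects both surviving worlds. This is the usual shape of a semi-private announcement.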

Alternative frameworks (e.g., Functional STRIPS, possibilities-based representations in ASP, and Planning with Perspectives models) offer further abstractions: they move epistemic reasoning to black-box external procedures or memory-based justified perspectives, supporting lazy evaluation, rich observation semantics, and polynomial-time term evaluation for large classes of formulas (Hu et al., 2019, Burigana et al., 2020, Li et al., 2024, Hu et al., 2024).

The Predictive Justified Perspective (PJP) model introduces predictive retrieval functions that allow agents to synthesize beliefs about both current and future values by combining all past observations with processual projections, overcoming the classic assumption that the environment is static between observations (Li et al., 2024).

3. Expressiveness, Complexity, and Decidability

Epistemic planning with unrestricted DEL models and arbitrary modal nesting is, in general, undecidable. Restricting to bounded modal depth or actionable syntactic fragments (e.g., PEKB representations) recovers decidability and manageable complexity (typically PSPACE- to EXPTIME-complete, depending on nesting and the types of actions) (Muise et al., 2021, Burigana et al., 2023). The commutativity axiom ($K_i K_j \varphi \rightarrow K_j K_i \varphi$) semantically collapses higher-order compositions, yielding finitary characterizations of common knowledge and enabling fully decidable fragments for multi-agent planning (Burigana et al., 2023).
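To make "bounded modal depth" concrete, the following sketch computes the nesting depth of a formula and checks it against a bound. The tuple encoding of formulas and the helper names `modal_depth` and `within_bound` are assumptions made for illustration; actual planners apply such a bound inside their own formula representations.

```python
def modal_depth(f):
    """Maximum nesting of K/B operators in a tuple-encoded formula (hypothetical encoding)."""
    op = f[0]
    if op == "atom":
        return 0
    if op == "not":
        return modal_depth(f[1])
    if op in ("and", "or"):
        return max(modal_depth(f[1]), modal_depth(f[2]))
    if op in ("K", "B"):  # ("K", agent, phi) or ("B", agent, phi)
        return 1 + modal_depth(f[2])
    raise ValueError(f"unknown operator: {op!r}")


def within_bound(goal, max_depth):
    """True if the goal fits a depth-bounded fragment."""
    return modal_depth(goal) <= max_depth


p = ("atom", "p")
# B_a B_b p  and  not K_c p
goal = ("and", ("B", "a", ("B", "b", p)), ("not", ("K", "c", p)))
depth = modal_depth(goal)
```

Here the goal has depth 2, so it is admissible under a bound of 2 but would be rejected by a depth-1 fragment.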

Alternative approaches to complexity control include:

Compilation to classical planning: translating bounded-depth epistemic problems into propositional classical planning using auxiliary fluents and conditional effects (Muise et al., 2021).
Polynomial-time evaluation in perspectives-based models (PWP, JP, GJP): evaluating formulas via state-sequence–based, memory-aware perspectives (Li et al., 2024, Hu et al., 2024).
On-the-fly, black-box epistemic reasoning in functional STRIPS or ASP: external calls implement all epistemic checks, keeping world-state representations compact and epistemic evaluation independent of agent count or modal depth (Hu et al., 2019, Burigana et al., 2020).
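The idea of delegating epistemic checks to an external procedure can be sketched with a "seeing is knowing" reading, in the spirit of the title of Hu et al. (2019): an agent knows a fact if the fact holds in the part of the state the agent observes, and nested knowledge applies the observation function repeatedly. Everything concrete below is hypothetical: the variable naming scheme (`at_<agent>`, `coin_at`, `coin`), the same-room visibility rule, and the helpers `perspective` and `knows` are assumptions for this sketch, not the interface of any cited planner.

```python
def perspective(agent, state):
    """Hypothetical observation function: the sub-state that `agent` can see."""
    here = state.get(f"at_{agent}")
    if here is None:          # the agent is not visible in this (partial) state
        return {}
    # An agent sees every agent located in its own room (including itself).
    local = {k: v for k, v in state.items() if k.startswith("at_") and v == here}
    # It sees the coin, and where the coin is, only if the coin is in the same room.
    if state.get("coin_at") == here:
        local["coin_at"] = here
        local["coin"] = state["coin"]
    return local


def knows(chain, state, variables, test):
    """Evaluate K_{chain[0]} K_{chain[1]} ... test by nesting perspectives.

    `variables` lists the state variables the test reads; all must be visible.
    """
    local = state
    for agent in chain:       # outermost agent first
        local = perspective(agent, local)
    return all(v in local for v in variables) and test(local)


state = {"at_a": 1, "at_b": 1, "at_c": 2, "coin_at": 1, "coin": "heads"}


def is_heads(s):
    return s["coin"] == "heads"


a_knows = knows(["a"], state, ["coin"], is_heads)
c_knows = knows(["c"], state, ["coin"], is_heads)
b_knows_a_knows = knows(["b", "a"], state, ["coin"], is_heads)
c_knows_a_knows = knows(["c", "a"], state, ["coin"], is_heads)
```

Each query costs a number of passes over the state proportional to the nesting depth, and no Kripke structure is ever built. The price is that the observation function must be hand-written per domain.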

Reported empirical results suggest that representations which avoid explicit Kripke structures, combined with lazy evaluation or externalized reasoning, scale significantly better than approaches that explicitly enumerate the epistemic state space on realistic, strategically relevant multi-agent domains.

4. Strategic Reasoning and Implicit/Explicit Coordination

Epistemic planning directly supports sophisticated strategic reasoning: agents can achieve goals involving the manipulation of others’ beliefs, misinformation, secrecy, or coalition-based coordination (Engesser et al., 2017, Fabiano, 2019, Fabiano, 2021, Trencsenyi et al., 11 Feb 2025). Higher-order beliefs (nested modalities) and common knowledge operators enable explicit modeling and synthesis of plans requiring knowledge manipulation, deception, or coordinated action.

Notable frameworks and results include:

Implicit coordination via perspective shift: Sequential or conditional plans guarantee that each agent, acting from her own perspective, can verify at execution time that subsequent plan steps remain achievable and strategically valid (i.e., backward induction in epistemic game trees) (Engesser et al., 2017).
Winning strategies involving belief manipulation: Real-world case studies are modeled with event models that capture lies, private communications, partial observability, and goal formulas that demand knowledge or belief not only about the world, but about others’ knowledge (e.g., $B_i B_j p$ ) (Fabiano, 2019, Muise et al., 2021).
Hypergame and multi-agent strategic logics: Higher-level frameworks (EMASL) treat group strategies as first-class citizens, enabling reasoning about coalitional plans, epistemic-social choice, and deep interactions between knowledge and strategic ability (Eijck, 2013, Trencsenyi et al., 11 Feb 2025).
LLM-driven and simulation-based approaches: These test and evaluate artificial agents’ ability to recursively reason about others’ beliefs within complex strategic interactions, comparing to human data and introducing semantic measures of reasoning depth (Trencsenyi et al., 11 Feb 2025).

Planning with group beliefs via Group Justified Perspectives (GJP) enables strategic coalition formation, false-belief scenarios, and common belief as distinct from common knowledge, supporting robust coordination and deception even in cases where Kripke-model-based approaches cannot feasibly represent the requisite informational state (Hu et al., 2024).

5. Implementations, Heuristic Guidance, and Scalability

Early epistemic planners encountered scalability challenges due to the exponential blow-up inherent in explicit Kripke-structure updates and nested relations. Recent developments address these key bottlenecks:

Portfolio and meta-cognitive modules: Adaptive solvers select among multiple state representations (Kripke, “possibilities”, perspectives) and heuristics (trivial sub-goal count, refined epistemic planning graphs, domain-specific classical planning relaxations), often via portfolio-based configurations (Fabiano, 2021, Muise et al., 2021).
Learned GNN heuristics: Graph Neural Networks trained on state-product graphs (Kripke structures as graphs with world-ID and agent-labeled edges) learn to predict distances-to-goal and dramatically reduce search node expansions in sequential and cross-domain settings (Briglia et al., 18 Aug 2025).
Declarative and externalized strategies: ASP-based planners and functional STRIPS frameworks allow for rapid extension with new modalities, support custom external epistemic evaluators, and keep the explicit world state minimal (Hu et al., 2019, Burigana et al., 2020).
Polynomial-time perspective-based updates: Justified perspective models support the efficient composition of nested beliefs, justification, and common belief over structured state sequences rather than full model expansions (Li et al., 2024, Hu et al., 2024).
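For the learned-heuristic item above, the first step is turning an epistemic state into something a graph neural network can consume. The sketch below shows one plausible encoding: worlds become integer node IDs, each accessibility pair becomes a directed edge, and the agent becomes an integer edge label. The function name `kripke_to_graph`, the use of atom indicators as node features, and the two-row edge-list layout are assumptions for illustration, and the encoding used by Briglia et al. may differ.

```python
def kripke_to_graph(worlds, rel, val, atoms, agents):
    """Encode a Kripke structure as node features plus a labelled edge list (assumed layout)."""
    ordered = sorted(worlds)
    index = {w: i for i, w in enumerate(ordered)}          # world -> node ID
    # One row per world: 1.0 where the atom is true, 0.0 otherwise.
    node_features = [[1.0 if p in val[w] else 0.0 for p in atoms] for w in ordered]
    edge_index = [[], []]                                   # [sources, targets]
    edge_type = []                                          # agent ID per edge
    for agent_id, agent in enumerate(agents):
        for (s, t) in sorted(rel[agent]):
            edge_index[0].append(index[s])
            edge_index[1].append(index[t])
            edge_type.append(agent_id)
    return node_features, edge_index, edge_type


pairs = {(s, t) for s in ("w1", "w2") for t in ("w1", "w2")}
features, edges, types = kripke_to_graph(
    worlds={"w1", "w2"},
    rel={"a": {("w1", "w1"), ("w2", "w2")}, "b": pairs},
    val={"w1": {"p"}, "w2": set()},
    atoms=["p"],
    agents=["a", "b"],
)
```

The three returned lists can be converted to tensors for a relational GNN library of choice. The regression target (distance to goal) would come from solved training instances, which this sketch does not cover.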

Empirical evaluations in benchmark domains, ranging from security games and “grapevine” gossip scenarios to economics (trust games), collaborative communication, vision-based planning (BBL), and card-hand privacy puzzles, indicate that these methods can generate intricate, deeply nested strategic plans, with reported orders-of-magnitude improvements in practical runtime and scalability (Fabiano, 2019, Muise et al., 2021, Briglia et al., 18 Aug 2025, Li et al., 2024).

6. Extensions, Open Challenges, and Future Directions

Open challenges include developing symbolic and parallel state-update mechanisms for Kripke models, integrating uncertainty quantification for epistemic predictions, incorporating dynamically learned or coupled processual variables, realizing heuristics informed by learned patterns or human strategic behavior, and achieving robust integration with classical planning toolchains for limited-nesting domains (Fabiano, 2021, Li et al., 2024, Briglia et al., 18 Aug 2025).

Approaches that combine symbolic epistemic-model-based planning with LLM-enhanced or simulation-based recursive reasoners promise a bridge between formal multi-agent strategy logic and cognitive modeling of human or artificial agents’ reasoning in social, economic, and adversarial games (Trencsenyi et al., 11 Feb 2025). The introduction of semantic measures of articulated reasoning depth (such as explicit belief-kappa versus k-level) provides new frameworks for measuring, evaluating, and potentially guiding artificial and human-like strategic reasoning depth in multi-agent systems.
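For readers unfamiliar with the k-level baseline mentioned above, the sketch below works through it on a guessing game in which the target is a fraction `p` of the average guess. The game, the level-0 anchor of 50, the simplification that a guesser ignores its own effect on the average, and the helper names are all assumptions chosen for illustration.

```python
import math


def level_k_guess(k, p=2 / 3, level0_guess=50.0):
    """Guess of a level-k reasoner who assumes everyone else reasons at level k-1."""
    guess = level0_guess
    for _ in range(k):
        guess = p * guess     # best response to others guessing `guess`
    return guess


def implied_level(guess, p=2 / 3, level0_guess=50.0):
    """Nearest k whose level-k guess matches an observed guess (0 < guess <= level0_guess)."""
    k = math.log(guess / level0_guess) / math.log(p)
    return max(0, round(k))


guesses = [level_k_guess(k) for k in range(4)]
observed_level = implied_level(22.0)
```

Each extra level multiplies the guess by `p`, so the guesses shrink geometrically toward zero. An observed guess of 22 is closest to the level-2 prediction of about 22.2. A measure based on articulated beliefs, as described above, would instead inspect the reasoning an agent states rather than only its final number.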

Continued advances in polynomial or sub-exponential algorithms for group belief and perspective models (e.g., GJP, PJP) offer the prospect of handling indefinitely nested, high-cardinality agent scenarios where foundational DEL approaches are infeasible (Hu et al., 2024, Li et al., 2024). Robust handling of noisy or partial observations, adversarial and incentivized behavior, and integration with epistemic-temporal or hybrid logics remain active research directions.

7. Summary Table: Methodologies in Multi-Agent Epistemic Planning

Framework / Approach	Key Features	Strategic Reasoning Support
DEL-based planning (Bolander, 2017, Fabiano, 2021)	Kripke/event models, product update, explicit nested knowledge	Rich; supports deception, coordination, common knowledge
PEKB/Compilation (Muise et al., 2021)	Bounded-depth, propositional compilation, classical planner	Nested belief, fast planning
Functional STRIPS (Hu et al., 2019)	Lazy, external epistemic checks, see-based knowledge	Domain-flexible, rapid adaptation
Perspective/JP/GJP (Li et al., 2024, Hu et al., 2024)	Memory-based justified beliefs, group/common/distributed belief, ternary semantics	False-belief, group strategies, polynomial update
ASP/“Possibilities” (Burigana et al., 2020)	Declarative representation, multi-shot solve, generic entailment	Nested and group beliefs, declarative extension
GNN Heuristics (Briglia et al., 18 Aug 2025)	Learned, graph-based heuristics for Kripke models	Scalable, cross-domain heuristic learning
Hypergame/LLM-based (Trencsenyi et al., 11 Feb 2025, Eijck, 2013)	Recursive agent models, LLM recursive reasoning, semantic depth	Human-level and superhuman recursive reasoning

All approaches target the synthesis of epistemic plans and strategies in environments requiring non-trivial reasoning about the knowledge and beliefs of multiple agents, with varying emphasis on scalability, expressivity, and theoretical guarantees.

References:

(Bolander, 2017): A Gentle Introduction to Epistemic Planning: The DEL Approach
(Fabiano, 2021): Comprehensive Multi-Agent Epistemic Planning
(Hu et al., 2019): What you get is what you see: Decomposing Epistemic Planning using Functional STRIPS
(Fabiano, 2019): Design of a Solver for Multi-Agent Epistemic Planning
(Burigana et al., 2020): Modelling Multi-Agent Epistemic Planning in ASP
(Muise et al., 2021): Efficient Multi-agent Epistemic Planning: Teaching Planners About Nested Belief
(Burigana et al., 2023): A Semantic Approach to Decidability in Epistemic Planning (Extended Version)
(Li et al., 2024): Beyond Static Assumptions: the Predictive Justified Perspective Model for Epistemic Planning
(Hu et al., 2024): Where Common Knowledge Cannot Be Formed, Common Belief Can -- Planning with Multi-Agent Belief Using Group Justified Perspectives
(Eijck, 2013): PDL as a Multi-Agent Strategy Logic
(Trencsenyi et al., 11 Feb 2025): Approximating Human Strategic Reasoning with LLM-Enhanced Recursive Reasoners Leveraging Multi-agent Hypergames
(Briglia et al., 18 Aug 2025): Scaling Multi-Agent Epistemic Planning through GNN-Derived Heuristics
(Engesser et al., 2017): Cooperative Epistemic Multi-Agent Planning for Implicit Coordination