What Planning Problems Can A Relational Neural Network Solve?

Published 6 Dec 2023 in cs.LG, cs.AI, cs.NE, and stat.ML | (2312.03682v2)

Abstract: Goal-conditioned policies are generally understood to be "feed-forward" circuits, in the form of neural networks that map from the current state and the goal specification to the next action to take. However, under what circumstances such a policy can be learned and how efficient the policy will be are not well understood. In this paper, we present a circuit complexity analysis for relational neural networks (such as graph neural networks and transformers) representing policies for planning problems, by drawing connections with serialized goal regression search (S-GRS). We show that there are three general classes of planning problems, in terms of the growth of circuit width and depth as a function of the number of objects and planning horizon, providing constructive proofs. We also illustrate the utility of this analysis for designing neural networks for policy learning.

Citations (6)

Summary

  • The paper introduces a formal framework linking goal-conditioned policies to bounded regression width for efficient planning.
  • It demonstrates that RelNNs, including Graph Neural Networks and Transformers, can encode policies using finite-breadth circuits.
  • Empirical results confirm that Serialized Goal Regression Search enables polynomial-time solutions in structured planning scenarios.

Introduction

Goal-conditioned policies direct an agent's actions toward a specific objective within a given state space. While neural networks in the form of policy circuits have shown promise in learning such policies, their effectiveness across different planning problems poses a theoretical challenge: when can a polynomial-sized circuit represent such a policy, and what determines its complexity?

Planning and Learning

The paper focuses on classical planning problems characterized by an object-centric representation and sparse transition models. In these problems, one seeks an action sequence bringing the world from an initial state to a state that satisfies certain goals, with machine learning approaches aimed at finding policies that correctly identify the next action from any given state and goal. The paper introduces a formal definition of planning problems under consideration, grounding the subsequent analysis in a structured framework.
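The object-centric, sparse-transition setting described above can be illustrated with a tiny STRIPS-style model. The predicates, the `move` operator, and the two-block instance below are illustrative choices, not the paper's own encoding:

```python
# A toy object-centric planning model in the STRIPS style the paper builds
# on: states are sets of ground atoms, and transitions are sparse (each
# action touches only a few atoms). All names here are illustrative.

def clear(x): return ("clear", x)
def on(x, y): return ("on", x, y)

def move(x, y, z):
    """Move block x from y onto z: (preconditions, add list, delete list)."""
    pre = {on(x, y), clear(x), clear(z)}
    add = {on(x, z), clear(y)}
    dele = {on(x, y), clear(z)}
    return pre, add, dele

def applicable(state, action):
    pre, _, _ = action
    return pre <= state

def apply_action(state, action):
    _, add, dele = action
    return (state - dele) | add

# Two blocks on a table; goal: stack A on B.
init = frozenset({on("A", "table"), on("B", "table"), clear("A"), clear("B")})
act = move("A", "table", "B")
assert applicable(init, act)
succ = apply_action(init, act)
assert on("A", "B") in succ
```

A learned policy, in these terms, is a function from `(state, goal)` pairs to the next ground action; the paper asks when such a function fits in a polynomial-sized relational circuit.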

Search Complexity and Regression Width

The concept of Serialized Goal Regression Search (S-GRS) is introduced: an approach that gains efficiency by achieving goal atoms one at a time (serially) rather than jointly. The key measure, regression width, captures how many constraints must be tracked during this search. Under the notion of optimally serializable rules, it is shown that problems with bounded regression width can be solved in polynomial time using S-GRS.
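As a concrete (and much simplified) illustration of the serialization idea, the sketch below achieves one open goal atom at a time, regressing through a rule's preconditions and holding already-achieved atoms as constraints that later rules must not delete. The rule set and the `regress` function are hypothetical; the paper's S-GRS additionally bounds the tracked constraint set by the regression width:

```python
# Toy serialized goal regression: pick one open goal atom, find a rule that
# adds it, recursively achieve that rule's preconditions, apply the rule,
# then continue with the achieved atom held as a constraint. Illustrative
# only -- not the paper's S-GRS algorithm.

# Rules: name -> (preconditions, add effects, delete effects)
RULES = {
    "unlock": (frozenset({"have_key"}), frozenset({"door_open"}), frozenset()),
    "pickup": (frozenset({"at_key"}),   frozenset({"have_key"}),  frozenset()),
    "walk":   (frozenset(),             frozenset({"at_key"}),    frozenset()),
}

def regress(state, goal, constraints=frozenset(), depth=8):
    """Return (plan, final_state) achieving `goal`, or None on failure."""
    if goal <= state:
        return [], state
    if depth == 0:
        return None
    g = sorted(goal - state)[0]          # serialize: one open goal atom
    for name, (pre, add, dele) in RULES.items():
        if g not in add or dele & constraints:
            continue                     # rule must not clobber constraints
        sub = regress(state, pre, constraints, depth - 1)
        if sub is None:
            continue
        plan, s = sub
        s = (s - dele) | add             # apply the rule after its subplan
        rest = regress(s, goal, constraints | {g}, depth - 1)
        if rest is None:
            continue
        rplan, final = rest
        return plan + [name] + rplan, final
    return None

plan, final = regress(frozenset(), frozenset({"door_open"}))
```

Running this yields the serialized plan `["walk", "pickup", "unlock"]`; the `constraints` set plays the role of the tracked constraints whose size the regression width bounds.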

Policy Realization Through Neural Networks

Relational Neural Networks (RelNNs), such as Graph Neural Networks and Transformers, can generalize to handle variable-sized inputs. The paper outlines methods to construct RelNNs that represent goal-conditioned policies, revealing that for problems with bounded regression width, finite-breadth networks can encode policies. In scenarios where a problem's regression rule selector can be efficiently approximated, compilation into even more compact RelNN circuits is possible.
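A minimal relational message-passing layer, written here in plain NumPy for illustration, makes the measured quantities concrete: the feature dimension plays the role of circuit breadth and the number of message-passing rounds the role of depth. The layer form and all shapes below are assumptions, not the paper's architecture:

```python
# One relational message-passing round over n objects: each object combines
# its own features with an aggregate of its relational neighbors' features.
# Stacking D rounds yields a depth-D relational circuit whose breadth is
# set by the feature dimension d. Illustrative sketch, not the paper's model.
import numpy as np

def relnn_layer(h, adj, W_self, W_msg):
    """h' [i] = relu(W_self h[i] + W_msg * sum_j adj[i, j] h[j])."""
    msgs = adj @ h                       # aggregate neighbor features
    return np.maximum(0.0, h @ W_self.T + msgs @ W_msg.T)

rng = np.random.default_rng(0)
n, d = 5, 8                              # 5 objects, 8-dim features
h = rng.normal(size=(n, d))
adj = (rng.random((n, n)) < 0.4).astype(float)
W_self = rng.normal(size=(d, d)) * 0.1
W_msg = rng.normal(size=(d, d)) * 0.1

for _ in range(3):                       # depth = 3 rounds
    h = relnn_layer(h, adj, W_self, W_msg)
```

Because the same weights are applied at every object, the network handles any number of objects; the paper's result is that bounded regression width lets such a network realize the policy with bounded breadth.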

Practical Implications and Results

Empirical observations corroborate the theoretical findings: RelNNs generalize well and run efficiently when their breadth and depth are tailored to the problem's regression width. The analysis explains why RelNNs and similar policy circuits perform well in particular domains, and why unbounded network depth may be required for certain planning tasks, such as Sokoban or Task and Motion Planning (TAMP) problems.

Conclusion

In summary, the paper innovatively ties the complexity of goal-conditioned policies to the structure of planning problems, offering a fresh understanding of policy circuit complexity. The insights provided could guide the design of neural networks for policy learning while predicting their capabilities and limitations across different planning domains.
