Generalized Planning for the Abstraction and Reasoning Corpus

Published 15 Jan 2024 in cs.AI | (2401.07426v1)

Abstract: The Abstraction and Reasoning Corpus (ARC) is a general artificial intelligence benchmark that poses difficulties for pure machine learning methods due to its requirement for fluid intelligence with a focus on reasoning and abstraction. In this work, we introduce an ARC solver, Generalized Planning for Abstract Reasoning (GPAR). It casts an ARC problem as a generalized planning (GP) problem, where a solution is formalized as a planning program with pointers. We express each ARC problem using the standard Planning Domain Definition Language (PDDL) coupled with external functions representing object-centric abstractions. We show how to scale up GP solvers via domain knowledge specific to ARC in the form of restrictions over the actions model, predicates, arguments and valid structure of planning programs. Our experiments demonstrate that GPAR outperforms the state-of-the-art solvers on the object-centric tasks of the ARC, showing the effectiveness of GP and the expressiveness of PDDL to model ARC problems. The challenges provided by the ARC benchmark motivate research to advance existing GP solvers and understand new relations with other planning computational models. Code is available at github.com/you68681/GPAR.

Summary

The paper introduces GPAR, demonstrating a novel approach for encoding ARC tasks as generalized planning problems using PDDL.
It employs object-centric abstractions, a specialized DSL, and advanced pruning techniques to efficiently model and solve abstract reasoning challenges.
Experimental results reveal that GPAR outperforms state-of-the-art solvers in ARC tasks by reducing solution complexity and enhancing generalization.

Introduction

The paper "Generalized Planning for the Abstraction and Reasoning Corpus" (2401.07426) addresses the challenges posed by the Abstraction and Reasoning Corpus (ARC), a benchmark designed to evaluate machine intelligence in areas requiring reasoning and abstraction. Traditional machine learning approaches face significant hurdles with ARC due to its reliance on fluid intelligence. The authors introduce Generalized Planning for Abstract Reasoning (GPAR), a novel ARC solver that leverages generalized planning (GP), employing the Planning Domain Definition Language (PDDL) to express ARC tasks. The study showcases GPAR’s ability to outperform current state-of-the-art solvers by utilizing domain knowledge and object-centric abstractions specific to ARC tasks.

Figure 1: Three example tasks from the ARC, illustrating the input-output nature of tasks and the challenge of generating outputs for new instances.

Methodology

Central to this research is the encoding of each ARC task as a generalized planning problem, formalized as a planning program with pointers. GPAR employs PDDL to represent the problem domain and uses external functions to model object-centric abstractions. This blend of declarative and imperative modeling enhances expressivity and supports concise representation of transition functions.
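To make the idea of a planning program with pointers concrete, the following is a toy interpreter written for this summary, not GPAR's implementation. The instruction names (`test_color`, `update_color`, `inc`, `goto_if`, `goto_if_not`, `end`), the single pointer over a list of nodes, and the `Node` fields are all hypothetical simplifications. It shows how a short program with a loop can apply the same transformation to any number of objects.

```python
from dataclasses import dataclass


@dataclass
class Node:
    color: str
    size: int


def run_program(program, nodes, max_steps=1000):
    """Run a toy planning program with one pointer over `nodes`.

    Hypothetical instruction set, for illustration only.
    """
    nodes = [Node(nd.color, nd.size) for nd in nodes]  # work on a copy
    if not nodes:
        return nodes
    pc, ptr, flag = 0, 0, False  # program counter, pointer, condition flag
    for _ in range(max_steps):
        if pc >= len(program):
            break
        op, *args = program[pc]
        if op == "end":
            break
        if op == "test_color":      # flag := pointed node has this color
            flag = nodes[ptr].color == args[0]
            pc += 1
        elif op == "update_color":  # recolor the pointed node
            nodes[ptr].color = args[0]
            pc += 1
        elif op == "inc":           # advance pointer; flag := it could advance
            flag = ptr + 1 < len(nodes)
            if flag:
                ptr += 1
            pc += 1
        elif op == "goto_if":       # jump when the flag is set
            pc = args[0] if flag else pc + 1
        elif op == "goto_if_not":   # jump when the flag is not set
            pc = args[0] if not flag else pc + 1
        else:
            raise ValueError(f"unknown instruction: {op}")
    return nodes


# Recolor every red node to blue, whatever the number of nodes.
RECOLOR_RED = [
    ("test_color", "red"),
    ("goto_if_not", 3),
    ("update_color", "blue"),
    ("inc",),
    ("goto_if", 0),
    ("end",),
]

result = run_program(RECOLOR_RED, [Node("red", 1), Node("green", 4), Node("red", 2)])
print([nd.color for nd in result])
```

The same six-line program handles inputs with one node or fifty, which is the sense in which a single program generalizes across the instances of a task.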

The paper outlines a set of abstractions for defining the objects in an ARC task, such as 4-connected and 8-connected components. It also illustrates that the choice of abstraction, for example multi-color nodes versus shape-based nodes, can significantly change how a task is interpreted and solved.
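The difference between these abstractions can be sketched with a standard flood-fill over the grid. The code below is an illustration written for this summary, not taken from GPAR. The function name `components`, the `same_color` switch (used here to mimic a multi-color node abstraction), and the convention that color 0 is background are assumptions.

```python
from collections import deque

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def components(grid, neighbors=N4, background=0, same_color=True):
    """Group non-background cells into objects using breadth-first flood fill.

    With same_color=False, adjacent cells of different colors are merged
    into one multi-color object.
    """
    rows, cols = len(grid), len(grid[0])
    seen = set()
    objects = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == background or (r, c) in seen:
                continue
            color = grid[r][c]
            queue = deque([(r, c)])
            seen.add((r, c))
            cells = []
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for dy, dx in neighbors:
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < rows and 0 <= nx < cols):
                        continue
                    if (ny, nx) in seen or grid[ny][nx] == background:
                        continue
                    if same_color and grid[ny][nx] != color:
                        continue
                    seen.add((ny, nx))
                    queue.append((ny, nx))
            objects.append(sorted(cells))
    return objects


grid = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 2],
]
print(len(components(grid, N4)))                    # diagonal cells stay separate
print(len(components(grid, N8)))                    # same-color diagonals merge
print(len(components(grid, N8, same_color=False)))  # all touching cells merge
```

On this grid the three settings yield three, two, and one object respectively, so the same pixels can describe quite different planning problems depending on the abstraction chosen.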

Figure 2: A PDDL example illustrating the representation of a fragment of an ARC task.

GPAR further prunes irrelevant actions to reduce the search space. Three constraints (position, color, and size stability) are used to identify unnecessary actions and remove them from the action model before the search begins.
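One plausible reading of these stability constraints is sketched below: if an attribute never changes between input and output objects in the training examples, actions that modify it are dropped. This is an assumption made for illustration and may not match GPAR's exact rules. The action names, the `modifies` sets, and the matching of objects by list position are all hypothetical.

```python
ATTRIBUTES = ("position", "color", "size")

# Hypothetical action model: each action lists the attributes it can change.
ACTIONS = {
    "update_color": {"color"},
    "move_node": {"position"},
    "extend_node": {"size"},
}


def stable_attributes(training_pairs):
    """Return the attributes unchanged in every (input, output) object pair.

    Objects are matched by list index, which is a simplifying assumption.
    """
    stable = set(ATTRIBUTES)
    for input_nodes, output_nodes in training_pairs:
        if len(input_nodes) != len(output_nodes):
            return set()  # cannot match objects, so prune nothing
        for before, after in zip(input_nodes, output_nodes):
            stable = {a for a in stable if before[a] == after[a]}
    return stable


def prune_actions(actions, training_pairs):
    """Keep only actions that modify no stable attribute."""
    stable = stable_attributes(training_pairs)
    return sorted(name for name, modifies in actions.items()
                  if not (modifies & stable))


# A recoloring task: position and size never change, only color does.
pairs = [
    (
        [{"position": (0, 0), "color": "red", "size": 2}],
        [{"position": (0, 0), "color": "blue", "size": 2}],
    ),
]
print(prune_actions(ACTIONS, pairs))
```

For the recoloring example only `update_color` survives, so the search never considers programs that move or resize objects.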

Domain-Specific Language

The authors developed a domain-specific language (DSL) tailored for ARC, capable of representing actions and predicates relevant to the visual reasoning tasks. The DSL supports low-level actions, like color updates, and high-level operations, including node movements and spatial transformations. This DSL ensures that the generated programs remain tractable and focused on the critical elements of each ARC task.

The integration of PDDL with external functions enables effective modeling of complex logical conditions and effects necessary for solving ARC tasks. The paper emphasizes the importance of this integration in achieving state-of-the-art results.
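As a rough illustration of a low-level action, a high-level action, and a precondition evaluated by an external function, consider the sketch below. It is not GPAR's DSL. The names `GridNode`, `update_color`, `can_move`, `move_node`, and `render` are hypothetical, and the one-cell-per-step movement is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridNode:
    cells: frozenset  # set of (row, col) positions
    color: int


DIRECTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def update_color(node, color):
    """Low-level action: change the color of a node."""
    return GridNode(node.cells, color)


def can_move(node, direction, shape):
    """External-function style precondition: the move must stay on the grid."""
    dy, dx = DIRECTIONS[direction]
    rows, cols = shape
    return all(0 <= r + dy < rows and 0 <= c + dx < cols for r, c in node.cells)


def move_node(node, direction, shape):
    """High-level action: shift the node one cell if the precondition holds."""
    if not can_move(node, direction, shape):
        return node  # action not applicable, state unchanged
    dy, dx = DIRECTIONS[direction]
    moved = frozenset((r + dy, c + dx) for r, c in node.cells)
    return GridNode(moved, node.color)


def render(nodes, shape, background=0):
    """Paint nodes back onto a grid to compare against the expected output."""
    rows, cols = shape
    grid = [[background] * cols for _ in range(rows)]
    for node in nodes:
        for r, c in node.cells:
            grid[r][c] = node.color
    return grid


shape = (2, 3)
node = GridNode(frozenset({(0, 0), (0, 1)}), 1)
node = update_color(move_node(node, "down", shape), 5)
print(render([node], shape))
```

Keeping the geometric check in ordinary code rather than in declarative preconditions is the kind of division of labor that external functions make possible.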

Experimental Results

Experiments on 160 object-centric ARC tasks show that GPAR outperforms both the Kaggle competition’s first-place model and ARGA. The advantage is most notable in the recoloring tasks, where it is attributed to GPAR's expressive DSL and robust predicate representation. GPAR also generalizes well, as evidenced by the minimal gap between its performance on training and test instances.

Figure 3: The Venn diagram of the number of solved tasks by GPAR, Kaggle First Place, and ARGA in testing.

Further analysis shows that GPAR frequently requires fewer program lines and lower novelty thresholds to solve tasks, indicating its efficiency and suitability for generalized planning scenarios. The application of novelty pruning and heuristic guidance within GPAR contributes to its ability to explore vast search spaces effectively.
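For readers unfamiliar with novelty pruning, the sketch below implements the classic width-based notion from planning: a state is kept only if it contains some tuple of at most `threshold` atoms that has not appeared in any earlier state. GPAR may define novelty differently (for instance over programs rather than states), so this is an assumption-laden illustration. The class name `NoveltyTable` and the atom strings are hypothetical.

```python
from itertools import combinations


class NoveltyTable:
    """Track atom tuples seen so far and rate new states by novelty."""

    def __init__(self, threshold=1):
        self.threshold = threshold
        self.seen = set()

    def novelty(self, atoms):
        """Return the size of the smallest unseen atom tuple in `atoms`.

        Returns threshold + 1 when every tuple up to the threshold was
        already seen. All tuples of the state are recorded as a side effect.
        """
        ordered = sorted(atoms)
        result = self.threshold + 1
        for k in range(1, self.threshold + 1):
            for combo in combinations(ordered, k):
                if combo not in self.seen:
                    self.seen.add(combo)
                    result = min(result, k)
        return result

    def is_novel(self, atoms):
        """A state is pruned by the search when this returns False."""
        return self.novelty(atoms) <= self.threshold


table = NoveltyTable(threshold=1)
print(table.is_novel({"color(n1,red)", "size(n1,2)"}))  # first state
print(table.is_novel({"color(n1,red)"}))                # nothing new
print(table.is_novel({"color(n1,blue)"}))               # new atom
```

A lower threshold prunes more aggressively, so solving a task at a low threshold suggests that the search needed to track only simple combinations of facts.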

Implications and Future Work

GPAR's approach to solving ARC tasks has several implications:

Theoretical Contributions: The integration of GP with PDDL and external functions highlights a robust framework for tackling abstract reasoning problems. It introduces a scalable method for complex task modeling, likely to influence future research in AI reasoning tasks.
Practical Applications: The efficacy demonstrated by GPAR paves the way for its application in domains requiring high-level abstraction and reasoning, potentially impacting fields like robotics and cognitive task modeling.

The research opens several avenues for future exploration, including refining abstraction selection techniques and developing new heuristics to further scale the search process. Exploring connections with alternative planning models could enhance the utility of GPAR in broader visual and logical reasoning contexts.

Conclusion

The paper demonstrates that generalized planning, supported by an expressive DSL, can address the complexities of the ARC benchmark. GPAR's performance illustrates the potential of combining planning formalisms with domain-specific insights on reasoning and abstraction tasks. Continued work on this framework may lead to further progress in artificial intelligence reasoning and cognitive modeling.
