
Generalized Planning for the Abstraction and Reasoning Corpus

Published 15 Jan 2024 in cs.AI | (2401.07426v1)

Abstract: The Abstraction and Reasoning Corpus (ARC) is a general artificial intelligence benchmark that poses difficulties for pure machine learning methods due to its requirement for fluid intelligence with a focus on reasoning and abstraction. In this work, we introduce an ARC solver, Generalized Planning for Abstract Reasoning (GPAR). It casts an ARC problem as a generalized planning (GP) problem, where a solution is formalized as a planning program with pointers. We express each ARC problem using the standard Planning Domain Definition Language (PDDL) coupled with external functions representing object-centric abstractions. We show how to scale up GP solvers via domain knowledge specific to ARC in the form of restrictions over the actions model, predicates, arguments and valid structure of planning programs. Our experiments demonstrate that GPAR outperforms the state-of-the-art solvers on the object-centric tasks of the ARC, showing the effectiveness of GP and the expressiveness of PDDL to model ARC problems. The challenges provided by the ARC benchmark motivate research to advance existing GP solvers and understand new relations with other planning computational models. Code is available at github.com/you68681/GPAR.

Summary

  • The paper introduces GPAR, demonstrating a novel approach for encoding ARC tasks as generalized planning problems using PDDL.
  • It employs object-centric abstractions, a specialized DSL, and advanced pruning techniques to efficiently model and solve abstract reasoning challenges.
  • Experimental results show that GPAR outperforms state-of-the-art solvers on the object-centric ARC tasks, producing compact programs with a small gap between training and test performance.

Introduction

The paper "Generalized Planning for the Abstraction and Reasoning Corpus" (2401.07426) addresses the Abstraction and Reasoning Corpus (ARC), a benchmark designed to evaluate machine intelligence on tasks that require reasoning and abstraction. Pure machine learning approaches struggle with ARC because it demands fluid intelligence. The authors introduce Generalized Planning for Abstract Reasoning (GPAR), an ARC solver that casts each task as a generalized planning (GP) problem and expresses it in the Planning Domain Definition Language (PDDL). The paper reports that GPAR outperforms state-of-the-art solvers on the object-centric ARC tasks by exploiting ARC-specific domain knowledge and object-centric abstractions.

Figure 1: Three example tasks from the ARC, illustrating the input-output nature of tasks and the challenge of generating outputs for new instances.

Methodology

Central to this research is the encoding of each ARC task as a generalized planning problem, formalized as a planning program with pointers. GPAR employs PDDL to represent the problem domain and uses external functions to model object-centric abstractions. This blend of declarative and imperative modeling enhances expressivity and supports concise representation of transition functions.
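To make the notion of a planning program with pointers more concrete, the following is a minimal sketch of an interpreter for such programs. It loosely follows the usual GP formulation (actions, pointer increments, tests that set a flag, and conditional gotos); the instruction names, the single pointer "o", and the state encoding as a list of object colors are all assumptions made for illustration and are not taken from GPAR.

```python
from dataclasses import dataclass
from typing import Callable, List

State = List[int]  # hypothetical encoding: one color per object


@dataclass
class Act:   # apply an action to the object under pointer `ptr`
    fn: Callable[[State, int], None]
    ptr: str


@dataclass
class Test:  # set the flag to the value of a condition on the object under `ptr`
    cond: Callable[[State, int], bool]
    ptr: str


@dataclass
class Inc:   # advance `ptr`; the flag records whether the pointer actually moved
    ptr: str


@dataclass
class Goto:  # jump to `target` when the flag equals `when`
    target: int
    when: bool


@dataclass
class End:
    pass


def run(program, state, max_steps=10_000):
    """Execute a planning program on a copy of `state` and return the result."""
    state, pointers, flag, pc = list(state), {}, False, 0
    for _ in range(max_steps):
        ins = program[pc]
        if isinstance(ins, End):
            return state
        if isinstance(ins, Goto):
            pc = ins.target if flag == ins.when else pc + 1
            continue
        if isinstance(ins, Act):
            ins.fn(state, pointers.setdefault(ins.ptr, 0))
        elif isinstance(ins, Test):
            flag = ins.cond(state, pointers.setdefault(ins.ptr, 0))
        elif isinstance(ins, Inc):
            i = pointers.setdefault(ins.ptr, 0)
            flag = i + 1 < len(state)
            if flag:
                pointers[ins.ptr] = i + 1
        pc += 1
    raise RuntimeError("step budget exhausted")


def recolor_to_2(state, i):
    state[i] = 2


# "For every object: if its color is 1, recolor it to 2."
RECOLOR_ALL = [
    Test(lambda s, i: s[i] == 1, "o"),  # 0: flag := (color of object o is 1)
    Goto(3, when=False),                # 1: skip the recoloring if the test failed
    Act(recolor_to_2, "o"),             # 2: recolor the object under o
    Inc("o"),                           # 3: move to the next object, if any
    Goto(0, when=True),                 # 4: loop while the pointer moved
    End(),                              # 5
]
```

The same six-line program solves every non-empty instance regardless of how many objects it contains, which is the sense in which one planning program generalizes across the training and test instances of a task.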

The paper outlines a set of abstractions, such as 4-connected and 8-connected components, that determine how objects are defined in an ARC task. It shows that the choice of abstraction, for example multi-color nodes versus shape-based nodes, can significantly change how a task is interpreted and solved.

Figure 2: A PDDL example illustrating the representation of a fragment of an ARC task.
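The difference between the 4-connected and 8-connected abstractions mentioned above can be illustrated with a short sketch that groups same-colored cells of a grid into objects. This is a generic breadth-first component extraction, not GPAR's code; the function name, the dictionary layout of an object, and the choice of color 0 as background are assumptions.

```python
from collections import deque

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def components(grid, connectivity=4, background=0):
    """Group same-colored, non-background cells into objects."""
    offsets = N4 if connectivity == 4 else N8
    rows, cols = len(grid), len(grid[0])
    seen, objects = set(), []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == background or (r, c) in seen:
                continue
            color, cells = grid[r][c], []
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and (ny, nx) not in seen and grid[ny][nx] == color:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            objects.append({"color": color, "cells": sorted(cells)})
    return objects


grid = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 2],
]
print(len(components(grid, connectivity=4)))  # the two diagonal 1-cells are separate objects
print(len(components(grid, connectivity=8)))  # the two diagonal 1-cells merge into one object
```

On this grid the 4-connected abstraction yields three objects and the 8-connected abstraction yields two, so the same task can present a different set of objects to the planner depending on the abstraction chosen.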

GPAR further incorporates pruning techniques that eliminate irrelevant actions and so shrink the search space. Three main constraints (position, color, and size stability) are used to identify and remove unnecessary actions from the action model.
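One plausible reading of these stability constraints is sketched below: if an attribute never changes between input and output objects in any training pair, actions that modify that attribute can be dropped. This is an illustration under stated assumptions rather than GPAR's procedure: the action names, the mapping from actions to attributes, and the assumption that input and output objects are already matched by index are all hypothetical.

```python
# Hypothetical action model: action name -> the object attribute it modifies.
ACTIONS = {
    "update-color": "color",
    "move-node": "position",
    "extend-node": "size",
}


def stable_attributes(pairs):
    """Attributes whose value is unchanged for every object in every training pair."""
    stable = {"color", "position", "size"}
    for before, after in pairs:
        # Assumes objects are matched by index between input and output.
        for obj_in, obj_out in zip(before, after):
            stable = {a for a in stable if obj_in[a] == obj_out[a]}
    return stable


def prune_actions(actions, pairs):
    """Keep only the actions that modify an attribute observed to change."""
    frozen = stable_attributes(pairs)
    return {name: attr for name, attr in actions.items() if attr not in frozen}


# A recoloring task: only the color differs between input and output.
training_pairs = [
    (
        [{"color": 1, "position": (0, 0), "size": 3}],
        [{"color": 2, "position": (0, 0), "size": 3}],
    ),
]
print(prune_actions(ACTIONS, training_pairs))
```

For the recoloring example only the color-changing action survives, so the search never considers moving or resizing objects.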

Domain-Specific Language

The authors developed a domain-specific language (DSL) tailored for ARC, capable of representing actions and predicates relevant to the visual reasoning tasks. The DSL supports low-level actions, like color updates, and high-level operations, including node movements and spatial transformations. This DSL ensures that the generated programs remain tractable and focused on the critical elements of each ARC task.

The integration of PDDL with external functions enables effective modeling of complex logical conditions and effects necessary for solving ARC tasks. The paper emphasizes the importance of this integration in achieving state-of-the-art results.
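The split between a declarative action schema and externally computed conditions and effects can be sketched as follows. This is only an analogy in Python for how an action can delegate its precondition and effect to external functions; the `Action` class, the node layout, and the two example actions are assumptions and do not reproduce GPAR's DSL or its PDDL encoding.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Node = Dict[str, object]  # hypothetical: {"color": int, "cells": [(row, col), ...]}


@dataclass(frozen=True)
class Action:
    name: str
    precondition: Callable[[Node], bool]  # external function evaluated on the node
    effect: Callable[[Node], Node]        # external function returning the successor


def update_color(new_color):
    """Low-level action: change the color of a node."""
    return Action(
        name=f"update-color-{new_color}",
        precondition=lambda n: n["color"] != new_color,
        effect=lambda n: {**n, "color": new_color},
    )


def move_node(dy, dx, rows, cols):
    """High-level action: translate a node, applicable only if it stays on the grid."""
    def shifted(n):
        return [(r + dy, c + dx) for r, c in n["cells"]]

    def in_bounds(n):
        return all(0 <= r < rows and 0 <= c < cols for r, c in shifted(n))

    return Action(f"move-node-{dy}-{dx}", in_bounds, lambda n: {**n, "cells": shifted(n)})


def apply(action, node):
    """Apply `action` if its precondition holds; otherwise return the node unchanged."""
    return action.effect(node) if action.precondition(node) else node


node = {"color": 1, "cells": [(0, 0), (0, 1)]}
print(apply(update_color(2), node))
print(apply(move_node(1, 0, rows=2, cols=2), node))
```

Keeping the geometric checks in ordinary functions avoids spelling them out as logical formulas, which is the kind of conciseness the paragraph above attributes to external functions.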

Experimental Results

Experiments on 160 object-centric ARC tasks show that GPAR outperforms both the Kaggle competition's first-place model and ARGA. The advantage is most pronounced on the recoloring tasks, which the paper attributes to GPAR's expressive DSL and predicate representation. GPAR also generalizes well, as shown by the small performance gap between training and test instances.

Figure 3: The Venn diagram of the number of solved tasks by GPAR, Kaggle First Place, and ARGA in testing.

Further analysis shows that GPAR frequently requires fewer program lines and lower novelty thresholds to solve tasks, indicating its efficiency and suitability for generalized planning scenarios. The application of novelty pruning and heuristic guidance within GPAR contributes to its ability to explore vast search spaces effectively.
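The idea behind a novelty threshold can be illustrated with the classical state-based definition from width-based search: a search node is kept only if it makes true some tuple of at most k atoms that no earlier node has made true. The sketch below implements that generic check; it is not GPAR's pruning rule, which operates on planning programs, and the class name and the string encoding of atoms are assumptions.

```python
from itertools import combinations


class NoveltyFilter:
    """Keep a node only if it contains an unseen tuple of at most k atoms."""

    def __init__(self, k):
        self.k = k
        self.seen = set()

    def is_novel(self, atoms):
        tuples = {
            t
            for size in range(1, self.k + 1)
            for t in combinations(sorted(atoms), size)
        }
        new = tuples - self.seen
        self.seen |= new  # remember every tuple this node introduced
        return bool(new)


f1 = NoveltyFilter(k=1)
print(f1.is_novel({"a", "b"}))  # both atoms are new
print(f1.is_novel({"a"}))       # nothing new at threshold 1, so the node is pruned
```

A lower threshold prunes more aggressively, so solving a task at a low novelty threshold means the solution was found while exploring only a small part of the search space.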

Implications and Future Work

GPAR's approach to solving ARC tasks has several implications:

  • Theoretical Contributions: The integration of GP with PDDL and external functions highlights a robust framework for tackling abstract reasoning problems. It introduces a scalable method for complex task modeling, likely to influence future research in AI reasoning tasks.
  • Practical Applications: The efficacy demonstrated by GPAR paves the way for its application in domains requiring high-level abstraction and reasoning, potentially impacting fields like robotics and cognitive task modeling.

The research opens several avenues for future exploration, including refining abstraction selection techniques and developing new heuristics to further scale the search process. Exploring connections with alternative planning models could enhance the utility of GPAR in broader visual and logical reasoning contexts.

Conclusion

The paper demonstrates that generalized planning, supported by an expressive DSL, can address the complexities of the ARC benchmark. GPAR's performance on the object-centric tasks shows the potential of combining planning formalisms with domain-specific knowledge for reasoning and abstraction tasks. Further work on this framework may extend these results to a broader range of ARC tasks.
