Learning Extrinsic Dexterity with Parameterized Manipulation Primitives

Published 26 Oct 2023 in cs.RO and cs.LG | (2310.17785v3)

Abstract: Many practically relevant robot grasping problems feature a target object for which all grasps are occluded, e.g., by the environment. Single-shot grasp planning invariably fails in such scenarios. Instead, it is necessary to first manipulate the object into a configuration that affords a grasp. We solve this problem by learning a sequence of actions that utilize the environment to change the object's pose. Concretely, we employ hierarchical reinforcement learning to combine a sequence of learned parameterized manipulation primitives. By learning the low-level manipulation policies, our approach can control the object's state through exploiting interactions between the object, the gripper, and the environment. Designing such a complex behavior analytically would be infeasible under uncontrolled conditions, as an analytic approach requires accurate physical modeling of the interaction and contact dynamics. In contrast, we learn a hierarchical policy model that operates directly on depth perception data, without the need for object detection, pose estimation, or manual design of controllers. We evaluate our approach on picking box-shaped objects of various weight, shape, and friction properties from a constrained table-top workspace. Our method transfers to a real robot and is able to successfully complete the object picking task in 98% of experimental trials. Supplementary information and videos can be found at https://shihminyang.github.io/ED-PMP/.

Summary

The paper proposes a hierarchical reinforcement learning approach that sequences learned, parameterized manipulation primitives to move an object into a graspable pose, operating directly on depth data without object detection or pose estimation.
It leverages a Fully Convolutional Network for depth-to-height map conversion coupled with a Deep Q-Network for low-level control of primitives.
In real-world tests on a Franka Emika Panda robot, the method completes the picking task in 98% of trials; in simulation, it outperforms SAC and Rainbow DQN, which struggle with the high-dimensional action space.

Learning Extrinsic Dexterity with Parameterized Manipulation Primitives

The paper "Learning Extrinsic Dexterity with Parameterized Manipulation Primitives" by Shih-Min Yang, Martin Magnusson, Johannes A. Stork, and Todor Stoyanov addresses a critical challenge in robotic manipulation: single-shot grasp planning fails when every feasible grasp on the target object is occluded, for example by the environment. The authors propose a hierarchical reinforcement learning (HRL) approach that changes the object's pose through sequences of parameterized manipulation primitives. The method operates directly on depth perception data and does not rely on object detection or pose estimation, which makes it viable under uncontrolled conditions.

Methodology

The core of the methodology is the hierarchical decomposition of tasks into high-level policies selecting parameterized primitives and low-level policies executing the selected primitive actions. This approach leverages HRL to efficiently explore the state-action space without requiring manually designed primitive controllers. The primary primitives include:

Push Primitive: Achieves in-plane object movement.
Flip Primitive: Utilizes environmental interactions to pivot an object.
Grasp Primitive: Executes the grasping action on objects in favorable configurations.
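One way to picture how a high-level policy could choose among these three primitives is sketched below. This is a minimal illustration, not the authors' implementation: it assumes the high-level policy produces one score map per primitive over a grid of candidate locations, and that the chosen action is the highest-scoring (primitive, location) pair. The names `PrimitiveAction`, `select_primitive`, and `score_maps` are hypothetical; only the primitive names come from the text.

```python
from dataclasses import dataclass
from typing import Dict, List

# Primitive names are from the text; everything else here is an assumption.
PRIMITIVES = ("push", "flip", "grasp")


@dataclass(frozen=True)
class PrimitiveAction:
    """A primitive plus the grid cell that parameterizes where it is applied."""
    primitive: str
    row: int
    col: int
    score: float


def select_primitive(score_maps: Dict[str, List[List[float]]]) -> PrimitiveAction:
    """Return the highest-scoring (primitive, cell) pair across all score maps."""
    best = None
    for name in PRIMITIVES:
        for r, row_vals in enumerate(score_maps[name]):
            for c, score in enumerate(row_vals):
                if best is None or score > best.score:
                    best = PrimitiveAction(name, r, c, score)
    if best is None:
        raise ValueError("score_maps must contain at least one value")
    return best


# Toy 2x2 score maps, made up purely for illustration.
score_maps = {
    "push": [[0.1, 0.2], [0.0, 0.3]],
    "flip": [[0.5, 0.9], [0.4, 0.1]],
    "grasp": [[0.2, 0.2], [0.6, 0.0]],
}
action = select_primitive(score_maps)
print(action.primitive, action.row, action.col)
```

In this toy example the flip primitive at cell (0, 1) has the highest score, so it is the one handed to the low-level policy.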

A Fully Convolutional Network (FCN) is employed at the high level, converting the depth images into height maps for policy decision-making. The low-level policy, particularly for the flip primitive, is trained via a Deep Q-Network (DQN) that considers the end-effector pose and contact forces.
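The low-level DQN component can be illustrated with the two generic building blocks of deep Q-learning: the temporal-difference target and epsilon-greedy action selection. The sketch below is standard DQN machinery under stated assumptions, not the paper's code: the state layout (a 6-value end-effector pose followed by a 3-value contact force), the discount factor, and the function names `build_state`, `td_target`, and `epsilon_greedy` are all hypothetical.

```python
import random
from typing import List, Sequence


def build_state(ee_pose: Sequence[float], contact_force: Sequence[float]) -> List[float]:
    """Concatenate end-effector pose and contact force into one flat state vector.

    The dimensions (6 for pose, 3 for force) are an assumption for illustration.
    """
    return list(ee_pose) + list(contact_force)


def td_target(reward: float, next_q_values: Sequence[float],
              done: bool, gamma: float = 0.99) -> float:
    """Standard DQN target: r if the episode ended, else r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def epsilon_greedy(q_values: Sequence[float], epsilon: float, rng: random.Random) -> int:
    """Pick a random action index with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


state = build_state((0.4, 0.0, 0.1, 0.0, 0.0, 0.0), (0.0, 0.0, 2.5))
greedy_action = epsilon_greedy([0.1, 0.7, 0.3], epsilon=0.0, rng=random.Random(0))
target = td_target(reward=1.0, next_q_values=[0.5, 2.0], done=False, gamma=0.5)
```

A real implementation would replace the Q-value lists with the outputs of a neural network evaluated on `state`, and would add a replay buffer and a target network.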

Training and Evaluation

Training follows a curriculum: the low-level primitives are learned first, and the high-level decision-making policy afterwards. This staged approach avoids the complexity of learning intertwined high- and low-level tasks simultaneously. Domain randomization supports sim-to-real transfer, which the authors demonstrate in real-world experiments on a Franka Emika Panda robot.
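The staged curriculum and the domain randomization can be sketched as follows. This is a hedged illustration only: the abstract states that box-shaped objects vary in weight, shape, and friction, but the numeric ranges, the episode threshold, and the names `BoxParams`, `sample_box`, and `curriculum_stage` are placeholders invented for this example.

```python
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BoxParams:
    """Randomized physical properties of one simulated box (units are assumed)."""
    size_m: Tuple[float, ...]  # edge lengths in metres
    mass_kg: float
    friction: float


def sample_box(rng: random.Random) -> BoxParams:
    """Draw box properties uniformly from placeholder ranges (not from the paper)."""
    size = tuple(rng.uniform(0.05, 0.20) for _ in range(3))
    return BoxParams(size_m=size,
                     mass_kg=rng.uniform(0.1, 1.0),
                     friction=rng.uniform(0.3, 1.0))


def curriculum_stage(episode: int, low_level_episodes: int) -> str:
    """Train low-level primitives first, then switch to the high-level policy."""
    return "low_level" if episode < low_level_episodes else "high_level"


rng = random.Random(0)
for episode in (0, 99, 100):
    box = sample_box(rng)  # a fresh random box for every episode
    stage = curriculum_stage(episode, low_level_episodes=100)
    print(episode, stage, round(box.mass_kg, 3))
```

Resampling the object at every episode exposes the policy to a spread of dynamics during training, which is the usual motivation for domain randomization.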

Results

The paper reports the following results for the proposed method (ED-PMP):

Simulation Results: The paper reports a task completion rate reaching 80% within 800 training episodes. The method outperforms both SAC and Rainbow DQN, which struggle due to high-dimensional action spaces.
Real-World Results: ED-PMP attains a 98% success rate in varied scenarios, demonstrating effective object reconfiguration irrespective of initial placement. In comparison, the baseline method by Zhou and Held achieves substantially lower success rates, especially when the object is not positioned close to the wall.

Implications and Future Directions

This research introduces a robust HRL framework that can potentially generalize to broader manipulation tasks requiring complex sequences of actions. By demonstrating zero-shot transfer to real-world scenarios, it sets a benchmark for practical applications in autonomous robotic systems. Future research could explore automating the design of reward functions to further simplify the training of multiple parameterized primitives, expanding the applicability of the proposed method to more intricate manipulation tasks.

The presented work emphasizes the combination of learned primitives in hierarchical settings, enabling robots to solve tasks that are analytically infeasible due to complex physical interactions. This opens pathways to more adaptable and versatile robotic systems capable of performing a wide range of sophisticated tasks in real-world environments.
