Efficient learning of goal-oriented push-grasping synergy in clutter

Published 9 Mar 2021 in cs.RO | (2103.05405v3)

Abstract: We focus on the task of goal-oriented grasping, in which a robot is supposed to grasp a pre-assigned goal object in clutter and needs some pre-grasp actions such as pushes to enable stable grasps. However, in this task, the robot gets positive rewards from the environment only when successfully grasping the goal object. Besides, joint pushing and grasping elongates the action sequence, compounding the problem of reward delay. Thus, sample inefficiency remains a main challenge in this task. In this paper, a goal-conditioned hierarchical reinforcement learning formulation with high sample efficiency is proposed to learn a push-grasping policy for grasping a specific object in clutter. In our work, sample efficiency is improved by two means. First, we use a goal-conditioned mechanism by goal relabeling to enrich the replay buffer. Second, the pushing and grasping policies are respectively regarded as a generator and a discriminator and the pushing policy is trained under the supervision of the grasping discriminator, thus densifying pushing rewards. To deal with the problem of distribution mismatch caused by different training settings of the two policies, an alternating training stage is added to learn pushing and grasping in turn. A series of experiments carried out in simulation and the real world indicate that our method can quickly learn effective pushing and grasping policies and outperforms existing methods in task completion rate and goal grasp success rate with fewer motions. Furthermore, we validate that our system can also adapt to goal-agnostic conditions with better performance. Note that our system can be transferred to the real world without any fine-tuning. Our code is available at https://github.com/xukechun/Efficient_goal-oriented_push-grasping_synergy.

Summary

The paper presents a goal-conditioned hierarchical reinforcement learning approach that combines pushing and grasping so a robot can grasp a specified object in clutter.
It uses goal relabeling and a generator-discriminator training scheme to reach task completion rates up to 97.8% and grasp success rates between 83.7% and 90%.
The policy transfers from simulation to the real world without fine-tuning and needs fewer motions than baseline methods.

Overview of "Efficient Learning of Goal-Oriented Push-Grasping Synergy in Clutter"

This paper presents a method for improving robotic grasping in clutter by combining pushing and grasping. The authors focus on goal-oriented grasping, in which a robot must grasp a pre-assigned target object surrounded by clutter and may first need pre-grasp actions such as pushes to make a stable grasp possible. Because the robot is rewarded only when it grasps the goal object, and because joint pushing and grasping lengthens the action sequence, rewards are sparse and delayed. To address the resulting sample inefficiency, the authors propose a goal-conditioned hierarchical reinforcement learning (HRL) formulation.

The framework improves sample efficiency in two ways. First, a goal-conditioned mechanism with goal relabeling enriches the replay buffer. Second, the pushing and grasping policies are treated as a generator and a discriminator, respectively: the pushing policy is trained under the supervision of the grasping discriminator, which yields denser pushing rewards. Because the two policies are trained under different settings, their data distributions do not match, so an alternating training stage is added in which pushing and grasping are learned in turn.
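The two sample-efficiency ideas can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `GraspTransition` fields, the rule "relabel the goal as whatever object was actually grasped", and the threshold-based `push_reward` are all assumptions chosen to show the general pattern of goal relabeling and discriminator-derived push rewards.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class GraspTransition:
    """Hypothetical replay-buffer entry for one grasp attempt."""
    state_id: int            # stand-in for the observation
    action_id: int           # stand-in for the grasp action
    goal: str                # object the robot was asked to grasp
    grasped: Optional[str]   # object actually grasped, None if the grasp failed
    reward: float


def relabel(t: GraspTransition) -> List[GraspTransition]:
    """Return the original transition plus a relabeled copy when one applies.

    Assumed rule: if the robot grasped some object other than the goal,
    store a second copy in which that object is the goal, so the same
    experience also counts as a success for a different goal.
    """
    out = [t]
    if t.grasped is not None and t.grasped != t.goal:
        out.append(replace(t, goal=t.grasped, reward=1.0))
    return out


def push_reward(goal_grasp_score_after: float, threshold: float = 0.5) -> float:
    """Dense push reward derived from a grasp discriminator (assumed form).

    `goal_grasp_score_after` stands for the grasp policy's score for the goal
    object after the push; the threshold value is illustrative only.
    """
    return 1.0 if goal_grasp_score_after >= threshold else 0.0
```

In this sketch a failed goal grasp that picks up a different object still produces one positive training example, and a push is rewarded immediately when it leaves the goal in a state the grasp policy rates as graspable, rather than only after a later successful grasp.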

The experiments, carried out in simulation and in the real world, show higher task completion and goal grasp success rates than existing methods, achieved with fewer motions. The policy transfers from simulation to the real world without any fine-tuning. It can also adapt to goal-agnostic conditions, in which no target object is specified, with better performance.

Numerical Performance Metrics

The system trained with the proposed HRL framework reaches a task completion rate of approximately 97.8% in randomly cluttered scenes, with a grasp success rate between 83.7% and 90% depending on the alternating training configuration. The policy also needs fewer motions per completed task than the baseline methods, which suggests the approach is practical for manipulation tasks beyond laboratory settings.
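To make the three reported quantities concrete, the following sketch computes them from a list of test episodes. The `Episode` record and the exact definitions (completion as the fraction of episodes in which the goal was grasped, grasp success as successful goal grasps over goal grasp attempts, motions averaged over completed episodes) are assumptions for illustration; the paper's own definitions may differ in detail.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Episode:
    """Hypothetical log of one test run."""
    completed: bool       # goal object was grasped before the run ended
    grasp_attempts: int   # grasps aimed at the goal object
    grasp_successes: int  # of those, how many succeeded
    motions: int          # total pushes and grasps executed


def summarize(episodes: List[Episode]) -> Dict[str, float]:
    """Aggregate completion rate, grasp success rate and motion count."""
    if not episodes:
        raise ValueError("need at least one episode")
    done = [e for e in episodes if e.completed]
    attempts = sum(e.grasp_attempts for e in episodes)
    successes = sum(e.grasp_successes for e in episodes)
    return {
        "completion_rate": len(done) / len(episodes),
        "grasp_success_rate": successes / attempts if attempts else 0.0,
        # Averaged over completed runs only; NaN if nothing was completed.
        "motions_per_completion": (
            sum(e.motions for e in done) / len(done) if done else float("nan")
        ),
    }
```

For example, two completed runs taking 4 and 2 motions plus one failed run give a completion rate of 2/3 and 3.0 motions per completion.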

Future Directions and Implications

The method suggests several directions for future research in robotic manipulation and reinforcement learning. More sophisticated perceptual models could improve online adaptability. Transfer learning could help narrow the sim-to-real gap further and broaden the system's applicability to dynamically changing environments. Extending the goal-conditioning mechanism to multi-objective scenarios could benefit complex manipulation tasks that involve several objects at once.

The work uses HRL to improve push-grasp synergy and reports gains in both action economy and success rate over alternatives. Its value is twofold: it offers a clear methodological contribution to practical robotic grasping in cluttered environments, and it provides a foundation for further research on hierarchical reinforcement learning in robotics.
