Zero-shot sim-to-real transfer of tactile control policies for aggressive swing-up manipulation

Published 7 Jan 2021 in cs.RO, cs.AI, cs.SY, and eess.SY | (2101.02680v3)

Abstract: This paper aims to show that robots equipped with a vision-based tactile sensor can perform dynamic manipulation tasks without prior knowledge of all the physical attributes of the objects to be manipulated. For this purpose, a robotic system is presented that is able to swing up poles of different masses, radii and lengths, to an angle of 180 degrees, while relying solely on the feedback provided by the tactile sensor. This is achieved by developing a novel simulator that accurately models the interaction of a pole with the soft sensor. A feedback policy that is conditioned on a sensory observation history, and which has no prior knowledge of the physical features of the pole, is then learned in the aforementioned simulation. When evaluated on the physical system, the policy is able to swing up a wide range of poles that differ significantly in their physical attributes without further adaptation. To the authors' knowledge, this is the first work where a feedback policy from high-dimensional tactile observations is used to control the swing-up manipulation of poles in closed-loop.


Citations (32)


Summary

The paper presents a zero-shot sim-to-real transfer method for tactile control policies enabling robots to perform dynamic manipulation using only tactile feedback.
Researchers developed a novel tactile simulator and used deep reinforcement learning with dynamics randomization to train high-dimensional tactile policies in simulation.
The learned policy was successfully deployed on a real robot for aggressive pole swing-up without any real-world data collection or tuning, demonstrating practical applicability.

Zero-shot Sim-to-Real Transfer of Tactile Control Policies for Dynamic Manipulation

This paper addresses the transfer of control policies learned in simulation to real robots, with tactile feedback as the only sensing modality. The core objective is to enable robots equipped with vision-based tactile sensors to perform dynamic manipulation tasks without prior knowledge of the physical attributes of the objects involved. The task studied is the swing-up manipulation of poles that differ in mass, radius, and length.

Key Contributions

The study trains a feedback control policy, conditioned solely on tactile information, in a simulated environment and then deploys it directly on a physical robot. The main contributions are:

Novel Tactile Simulator Development: The researchers developed a soft sensor simulator that combines the finite element method with semi-implicit time-stepping schemes. The simulator models the interaction of a cylindrical pole with the sensor, including the contact dynamics, and is efficient enough to run at high frequencies.
High-Dimensional Feedback Control: A convolutional neural network, based on ShuffleNet V2, infers the high-dimensional force distribution from sensor images. This step supplies the high-dimensional tactile observations used for feedback control.
Reinforcement Learning for Control Policies: The paper employs deep reinforcement learning techniques to learn control policies in simulated environments. The approach utilizes dynamics randomization and privileged learning to address the sim-to-real transfer problem, effectively reducing the reality gap without requiring additional real-world data gathering.
Zero-shot Policy Deployment: The feedback control policy was deployed on a physical robotic system, where it swung poles with varying physical parameters to an upright position. This was achieved without any adaptation or exploration on the physical system, showing that zero-shot sim-to-real policy transfer is practically viable.
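
To make the training setup above more concrete, the following is a minimal Python sketch of two of the ideas: dynamics randomization (a new pole is sampled for every episode and hidden from the policy) and a policy that sees only a fixed-length history of observations. It is not the authors' simulator. The finite element sensor model is replaced by a toy rigid pendulum integrated with a plain semi-implicit Euler step, the scalar observation is a stand-in for the tactile force distribution, and all names (PoleParams, sample_pole, step, rollout) and parameter ranges are assumptions chosen for illustration.

```python
import math
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class PoleParams:
    mass: float      # kg
    length: float    # m
    radius: float    # m (sampled for completeness; unused by the toy dynamics)
    friction: float  # viscous friction at the pivot, N*m*s/rad


def sample_pole(rng: random.Random) -> PoleParams:
    """Draw one pole per episode. Ranges are illustrative, not from the paper."""
    return PoleParams(
        mass=rng.uniform(0.01, 0.05),
        length=rng.uniform(0.15, 0.40),
        radius=rng.uniform(0.002, 0.006),
        friction=rng.uniform(1e-4, 1e-3),
    )


def step(theta, omega, torque, p, dt=1e-3, g=9.81):
    """One semi-implicit Euler step for a uniform rod pivoting about its end.

    theta is measured from the hanging position, so theta = pi is upright.
    """
    inertia = p.mass * p.length ** 2 / 3.0
    gravity = -p.mass * g * (p.length / 2.0) * math.sin(theta)
    alpha = (torque + gravity - p.friction * omega) / inertia
    omega_next = omega + dt * alpha        # update the velocity first
    theta_next = theta + dt * omega_next   # position uses the new velocity
    return theta_next, omega_next


def rollout(policy, rng, history_len=8, steps=2000):
    """Run one episode with freshly randomized, unobserved pole parameters."""
    p = sample_pole(rng)                   # never passed to the policy
    theta, omega = 0.0, 0.0
    history = deque([0.0] * history_len, maxlen=history_len)
    for _ in range(steps):
        torque = policy(tuple(history))    # policy sees only past observations
        theta, omega = step(theta, omega, torque, p)
        history.append(math.sin(theta))    # stand-in for a tactile observation
    return theta, p
```

A call such as rollout(lambda h: 0.0, random.Random(0)) runs one episode with a zero-torque placeholder policy. In a learning setup, the policy would be a trained network, and the sampled parameters could additionally be given to a critic or teacher during training only, which is one common form of privileged learning.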

Implications and Future Directions

The work has both theoretical and practical implications. Theoretically, it shows that reinforcement learning from high-dimensional sensory input can solve complex manipulation tasks that require dynamic adaptability. Practically, zero-shot transfer removes much of the time and resource cost typically involved in collecting physical data and performing trial-and-error learning on hardware.

Future work could extend these methods to non-planar tasks or incorporate more sophisticated forms of sensor feedback. Additionally, while the paper demonstrates the learned policy on a pivoting manipulation task, scaling to more diverse manipulation scenarios remains open. Possible extensions include end-to-end learning architectures that derive features automatically and policies for multi-fingered robotic hands operating in more complex environments.

The study opens new avenues for integrating tactile sensation as a fundamental feedback modality in robotic systems, broadening the operational horizon within which robots can perform adaptable, human-like manipulation tasks. As the field progresses, the methodologies and findings of this research could be instrumental in pushing the boundaries of what is achievable with sim-to-real transfer techniques in robotic manipulation.