Learning Agile Robotic Locomotion Skills by Imitating Animals

Published 2 Apr 2020 in cs.RO and cs.LG (arXiv:2004.00784v3)

Abstract: Reproducing the diverse and agile locomotion skills of animals has been a longstanding challenge in robotics. While manually-designed controllers have been able to emulate many complex behaviors, building such controllers involves a time-consuming and difficult development process, often requiring substantial expertise of the nuances of each skill. Reinforcement learning provides an appealing alternative for automating the manual effort involved in the development of controllers. However, designing learning objectives that elicit the desired behaviors from an agent can also require a great deal of skill-specific expertise. In this work, we present an imitation learning system that enables legged robots to learn agile locomotion skills by imitating real-world animals. We show that by leveraging reference motion data, a single learning-based approach is able to automatically synthesize controllers for a diverse repertoire of behaviors for legged robots. By incorporating sample efficient domain adaptation techniques into the training process, our system is able to learn adaptive policies in simulation that can then be quickly adapted for real-world deployment. To demonstrate the effectiveness of our system, we train an 18-DoF quadruped robot to perform a variety of agile behaviors ranging from different locomotion gaits to dynamic hops and turns.

Citations (447)

Summary

  • The paper presents a novel imitation learning framework that reduces the need for handcrafted controllers by mapping animal motions to robot kinematics.
  • It employs a three-stage approach of motion retargeting, motion imitation with reinforcement learning, and domain adaptation to achieve agile locomotion.
  • Empirical results demonstrate that adaptive policies significantly outperform traditional methods in replicating complex maneuvers on quadruped robots.

This paper addresses the complexities of replicating agile animal locomotion in legged robots. It introduces an imitation learning framework designed to utilize reinforcement learning (RL) for training robots to mimic animal movements, thereby reducing the need for handcrafted controllers. Unlike traditional methods, which require extensive manual tuning and expertise, the proposed system automates this process using animal motion data.

Framework and Methodology

The framework operates in three stages: motion retargeting, motion imitation, and domain adaptation.

  1. Motion Retargeting: The process begins with mapping recorded animal motions to a robot's morphology using inverse kinematics. This ensures that motions are compatible with the robot's physical constraints.
  2. Motion Imitation: Employing RL, the system trains a policy in a simulated environment to reproduce the retargeted motions. The state of the robot is captured and fed to a neural network policy, which outputs commands that guide the robot to replicate the motion as closely as possible.
  3. Domain Adaptation: To bridge the gap between simulation and the real world, the simulator's dynamics parameters are randomized during training and encoded into a learned latent representation on which the policy is conditioned. On the physical robot, this latent code is then adapted using a small amount of real-world data, tuning the policy to the true dynamics of the hardware.
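The motion-imitation stage typically rewards the policy for tracking the retargeted reference at each timestep. The sketch below shows one common form of such a reward, an exponentiated tracking error; the function name, weights, and error scales are illustrative assumptions, not the paper's exact coefficients:

```python
import numpy as np

def tracking_reward(robot_pose, ref_pose, robot_vel, ref_vel,
                    w_pose=0.5, w_vel=0.05, k_pose=5.0, k_vel=0.1):
    """Exponentiated tracking reward, a common form for motion imitation.

    The reward decays smoothly as the robot's joint pose and velocity
    deviate from the retargeted reference frame at the current timestep.
    Weights and scales here are placeholder values.
    """
    pose_err = np.sum((robot_pose - ref_pose) ** 2)
    vel_err = np.sum((robot_vel - ref_vel) ** 2)
    r_pose = np.exp(-k_pose * pose_err)   # 1.0 at perfect pose tracking
    r_vel = np.exp(-k_vel * vel_err)      # 1.0 at perfect velocity tracking
    return w_pose * r_pose + w_vel * r_vel
```

Because each term is bounded in (0, 1], the reward is maximized only when the robot matches the reference exactly, which pushes the RL policy toward faithful reproduction of the retargeted motion.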

Key Contributions

  • Versatility in Skills: The framework successfully trains robots in various locomotion skills, such as pacing, trotting, and even executing complex dynamic maneuvers like hop-turns, which are demonstrated using the Laikago quadruped robot.
  • Efficient Adaptation: The domain adaptation significantly reduces the sample complexity of transferring policies to real-world robots, using advantage-weighted regression (AWR) over a learned latent space of dynamics parameters to fine-tune the policy with only a small number of real-world trials.
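The latent-space adaptation can be sketched as an advantage-weighted search: sample candidate latent codes, evaluate each by running the conditioned policy, and shift the sampling distribution toward codes with higher return. This is a simplified CEM-style sketch of the idea, not the paper's exact AWR procedure; `evaluate_return` is an assumed interface that would roll out the policy on the real robot and report the episode return:

```python
import numpy as np

def adapt_latent(evaluate_return, dim=8, iters=10, pop=20,
                 temperature=1.0, seed=0):
    """Adapt a latent dynamics code z by advantage-weighted updates.

    `evaluate_return(z)` is a hypothetical hook that runs the policy
    conditioned on z and returns the resulting episode return.
    """
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.ones(dim)
    for _ in range(iters):
        # Sample a population of candidate latent codes.
        zs = mean + std * rng.standard_normal((pop, dim))
        returns = np.array([evaluate_return(z) for z in zs])
        # Exponentiate centered returns so better codes get larger weights.
        adv = returns - returns.mean()
        w = np.exp(adv / temperature)
        w /= w.sum()
        # Move the search distribution toward high-return codes.
        mean = (w[:, None] * zs).sum(axis=0)
        std = np.sqrt((w[:, None] * (zs - mean) ** 2).sum(axis=0)) + 1e-3
    return mean
```

Because only a low-dimensional code is searched rather than the full policy parameters, each adaptation step needs far fewer real-world rollouts than retraining would.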

Results

Empirical evaluations reveal that the adaptive policies significantly outperform robust non-adaptive and baseline policies in the real world, particularly for complex and dynamic skills. The framework demonstrates improved stability and agility, as the adaptive policies can maintain balance for longer durations compared to their counterparts.

Implications and Future Directions

The ability of this system to train robots to mimic animal agility suggests a leap forward in autonomous robotics, particularly in tasks requiring naturalistic movement patterns in unstructured environments. While the current scope focuses on quadruped robots and a finite set of behaviors, future research could extend to more diverse morphologies or integrate learning from non-mocap data sources, such as videos.

The findings suggest a path toward more generalized and efficient robotic systems that can learn complex tasks with minimal human intervention, and point toward scalable deployment of multi-purpose robots.

This work provides a substantial contribution to the field of robotic locomotion, offering insights and methodologies that may inspire and guide subsequent developments in autonomous behavior learning for robotic systems.
