
Robust Model Based Reinforcement Learning Using $\mathcal{L}_1$ Adaptive Control

Published 21 Mar 2024 in eess.SY, cs.LG, and cs.SY | (2403.14860v1)

Abstract: We introduce $\mathcal{L}_1$-MBRL, a control-theoretic augmentation scheme for Model-Based Reinforcement Learning (MBRL) algorithms. Unlike model-free approaches, MBRL algorithms learn a model of the transition function from data and use it to design a control input. Our approach generates a series of approximate control-affine models of the learned transition function according to a proposed switching law. Using the approximate model, the control input produced by the underlying MBRL algorithm is perturbed by an $\mathcal{L}_1$ adaptive controller, which is designed to enhance the robustness of the system against uncertainties. Importantly, this approach is agnostic to the choice of MBRL algorithm, enabling the scheme to be used with various MBRL algorithms. MBRL algorithms with $\mathcal{L}_1$ augmentation exhibit enhanced performance and sample efficiency across multiple MuJoCo environments, outperforming the original MBRL algorithms both with and without system noise.
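The core mechanism the abstract describes, adding an $\mathcal{L}_1$ adaptive correction on top of a baseline control input so that mismatch between the learned model and the true dynamics is estimated and cancelled, can be illustrated on a scalar toy system. The sketch below is an assumption-laden simplification, not the paper's method: the dynamics, gains (`omega`, the disturbance `d`), and the zero baseline policy are all invented for illustration, and the paper's control-affine approximation and switching law are omitted. It uses the standard $\mathcal{L}_1$ ingredients: a state predictor driven by the (disturbance-free) model, a piecewise-constant adaptation law, and a low-pass-filtered cancellation term added to the baseline input.

```python
import math

def simulate(use_l1=True, T=5.0, dt=0.001):
    """Regulate x -> 0 under an unknown constant disturbance d.

    True plant:    dx = -x + u + d   (d is unknown to the learner)
    Learned model: dx = -x + u       (perfect except for d)
    """
    x, xhat, u_l1 = 0.0, 0.0, 0.0
    omega = 20.0   # bandwidth of the L1 low-pass filter (illustrative)
    d = 0.5        # unknown matched disturbance (illustrative)
    # Piecewise-constant adaptation gain, exact for the scalar model a_m = -1:
    # sigma_hat = -(e^{a_m dt} / Phi) * xtilde, with Phi = (e^{a_m dt} - 1)/a_m
    c = math.exp(-dt) / (1.0 - math.exp(-dt))
    for _ in range(int(T / dt)):
        u_rl = 0.0                          # stand-in baseline (MBRL) policy
        u = u_rl + (u_l1 if use_l1 else 0.0)
        x += dt * (-x + u + d)              # true plant step
        if use_l1:
            xtilde = xhat - x               # prediction error
            sigma = -c * xtilde             # uncertainty estimate
            # state predictor uses the learned model plus the estimate
            xhat += dt * (-xhat + u + sigma)
            # low-pass-filtered cancellation of the estimated uncertainty
            u_l1 += dt * omega * (-sigma - u_l1)
    return x
```

Running `simulate(use_l1=False)` converges near the uncompensated offset `d = 0.5`, while `simulate(use_l1=True)` drives the state close to zero because the adaptation law recovers `d` and the filtered input cancels it. The low-pass filter is what separates $\mathcal{L}_1$ control from plain high-gain adaptation: estimation can be fast while the control channel stays band-limited.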

