Ignorance is Bliss: Robust Control via Information Gating

Published 10 Mar 2023 in cs.LG and cs.AI | arXiv:2303.06121v2

Abstract: Informational parsimony provides a useful inductive bias for learning representations that achieve better generalization by being robust to noise and spurious correlations. We propose information gating as a way to learn parsimonious representations that identify the minimal information required for a task. When gating information, we can learn to reveal as little information as possible so that a task remains solvable, or hide as little information as possible so that a task becomes unsolvable. We gate information using a differentiable parameterization of the signal-to-noise ratio, which can be applied to arbitrary values in a network, e.g., erasing pixels at the input layer or activations in some intermediate layer. When gating at the input layer, our models learn which visual cues matter for a given task. When gating intermediate layers, our models learn which activations are needed for subsequent stages of computation. We call our approach InfoGating. We apply InfoGating to various objectives such as multi-step forward and inverse dynamics models, Q-learning, and behavior cloning, highlighting how InfoGating can naturally help in discarding information not relevant for control. Results show that learning to identify and use minimal information can improve generalization in downstream tasks. Policies based on InfoGating are considerably more robust to irrelevant visual features, leading to improved pretraining and finetuning of RL models.
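
The gating mechanism described in the abstract lends itself to a compact illustration. Below is a minimal sketch of input-level gating in PyTorch; the `InputGate` module, the Gaussian noise source, and the `sparsity_weight` penalty are illustrative assumptions, not the authors' exact parameterization of the signal-to-noise ratio.

```python
# Minimal sketch of input-level information gating (hypothetical module names;
# the paper's exact signal-to-noise parameterization may differ).
import torch
import torch.nn as nn

class InputGate(nn.Module):
    """Predicts a per-pixel gate in [0, 1]; ungated pixels are replaced by noise."""
    def __init__(self, in_channels: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=3, padding=1),
        )

    def forward(self, x):
        gate = torch.sigmoid(self.net(x))          # (B, 1, H, W): a per-pixel SNR knob
        noise = torch.randn_like(x)                # standard Gaussian corruption
        x_gated = gate * x + (1.0 - gate) * noise  # gate -> 0 erases the pixel into pure noise
        return x_gated, gate

def infogating_loss(task_loss: torch.Tensor, gate: torch.Tensor,
                    sparsity_weight: float = 0.01) -> torch.Tensor:
    """'Reveal as little as possible': keep the task solvable (task_loss) while
    penalizing how much information the gate lets through (gate.mean())."""
    return task_loss + sparsity_weight * gate.mean()
```

For the complementary objective (hide as little information as possible so that the task becomes unsolvable), one would instead maximize the task loss while penalizing `(1 - gate).mean()`, so the gate learns to erase only the task-relevant cues.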
