
AD3: Implicit Action is the Key for World Models to Distinguish the Diverse Visual Distractors

Published 15 Mar 2024 in cs.LG and cs.CV | arXiv:2403.09976v2

Abstract: Model-based methods have significantly contributed to distinguishing task-irrelevant distractors in visual control. However, prior research has primarily focused on heterogeneous distractors such as noisy background videos, leaving largely unexplored the homogeneous distractors that closely resemble the controllable agent, which pose significant challenges to existing methods. To tackle this problem, we propose the Implicit Action Generator (IAG) to learn the implicit actions of visual distractors, and present a new algorithm named implicit Action-informed Diverse visual Distractors Distinguisher (AD3), which leverages the actions inferred by IAG to train separated world models. Implicit actions effectively capture the behavior of background distractors, helping to distinguish the task-irrelevant components so that the agent can optimize its policy within the task-relevant state space. Our method achieves superior performance on various visual control tasks featuring both heterogeneous and homogeneous distractors. The indispensable role of the implicit actions learned by IAG is also empirically validated.
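The separated-world-models idea in the abstract can be illustrated with a toy sketch: one dynamics model is conditioned on the agent's real action, while a second model for the distractor is conditioned on a pseudo-action inferred from consecutive states. Everything below is invented for illustration: the linear dynamics, the dimensions, and the `infer_implicit_action` stand-in are assumptions, and the paper's IAG is a learned generator, not the fixed projection used here.

```python
import numpy as np

rng = np.random.default_rng(0)

class WorldModel:
    """Minimal linear latent dynamics model s' ~ A s + B a, fit by SGD."""
    def __init__(self, state_dim, action_dim):
        self.A = np.zeros((state_dim, state_dim))
        self.B = np.zeros((state_dim, action_dim))

    def predict(self, s, a):
        return self.A @ s + self.B @ a

    def update(self, s, a, s_next, lr=0.05):
        # One SGD step on the squared one-step prediction error.
        err = self.predict(s, a) - s_next
        self.A -= lr * np.outer(err, s)
        self.B -= lr * np.outer(err, a)
        return float(err @ err)

def infer_implicit_action(prev_state, state, dim=2):
    """Stand-in for IAG: summarize the distractor's state change as a
    low-dimensional pseudo-action (a fixed projection, not learned)."""
    return (state - prev_state)[:dim]

# Hypothetical ground-truth dynamics of a toy environment.
B_task = 0.3 * rng.normal(size=(4, 2))  # agent action moves the task part
B_dist = 0.3 * rng.normal(size=(4, 2))  # hidden action moves the distractor

task_model = WorldModel(4, 2)        # conditioned on the agent's real action
distractor_model = WorldModel(4, 2)  # conditioned on the implicit action

s_task = rng.normal(size=4)
s_dist = rng.normal(size=4)
losses = []
for _ in range(500):
    a = rng.normal(size=2)         # agent action
    a_hidden = rng.normal(size=2)  # unobserved "action" driving the distractor
    s_task_next = 0.9 * s_task + B_task @ a
    s_dist_next = 0.8 * s_dist + B_dist @ a_hidden
    a_implicit = infer_implicit_action(s_dist, s_dist_next)
    loss = task_model.update(s_task, a, s_task_next)
    loss += distractor_model.update(s_dist, a_implicit, s_dist_next)
    losses.append(loss)
    s_task, s_dist = s_task_next, s_dist_next
```

The point of the sketch is the division of labor: the task model only ever sees the agent's action, while the distractor model is given a surrogate action, so neither model is forced to explain dynamics it has no causal handle on. In the paper this separation is what lets policy optimization proceed in the task-relevant state space alone.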

