
Action Anticipation with Goal Consistency

Published 26 Jun 2023 in cs.CV (arXiv:2306.15045v1)

Abstract: In this paper, we address the problem of short-term action anticipation, i.e., we want to predict an upcoming action one second before it happens. We propose to harness high-level intent information to anticipate actions that will take place in the future. To this end, we incorporate an additional goal prediction branch into our model and propose a consistency loss function that encourages the anticipated actions to conform to the high-level goal pursued in the video. In our experiments, we show the effectiveness of the proposed approach and demonstrate that our method achieves state-of-the-art results on two large-scale datasets: Assembly101 and COIN.
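The abstract does not state the form of the consistency loss, so the following is only a minimal sketch of how an action branch and a goal branch might be tied together during training. It assumes a hypothetical binary table `compat[g][a]` marking which actions can occur under which goal, and penalizes probability mass that the action branch places on actions incompatible with the predicted goal. The function names, the table, and the `weight` parameter are illustrative assumptions, not the authors' formulation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target):
    """Negative log-likelihood of the target class index."""
    return -math.log(max(probs[target], 1e-12))


def consistency_loss(action_probs, goal_probs, compat):
    """Assumed consistency term (not taken from the paper).

    compat[g][a] is 1.0 if action a can occur under goal g, else 0.0.
    The loss is the negative log of the probability mass that the two
    branches jointly assign to compatible (goal, action) pairs.
    """
    mass = 0.0
    for g, pg in enumerate(goal_probs):
        for a, pa in enumerate(action_probs):
            mass += pg * compat[g][a] * pa
    return -math.log(max(mass, 1e-12))


def total_loss(action_logits, goal_logits, action_target, goal_target,
               compat, weight=1.0):
    """Action loss + goal loss + weighted consistency term (all assumed)."""
    pa = softmax(action_logits)
    pg = softmax(goal_logits)
    return (cross_entropy(pa, action_target)
            + cross_entropy(pg, goal_target)
            + weight * consistency_loss(pa, pg, compat))


if __name__ == "__main__":
    # Two goals, three actions; goal 0 allows actions 0 and 1, goal 1 allows action 2.
    compat = [[1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]]
    loss = total_loss([2.0, 0.5, -1.0], [1.5, -0.5], 0, 0, compat)
    print(f"total loss: {loss:.4f}")
```

In this sketch the consistency term is zero when both branches agree on a compatible pair and grows as the action branch drifts toward actions the predicted goal does not permit, which is one plausible reading of "encourages the anticipated actions to conform to the high-level goal".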



Authors (2)
