
Towards Gradient-based Time-Series Explanations through a SpatioTemporal Attention Network

Published 18 May 2024 in cs.CV and cs.LG | arXiv:2405.17444v1

Abstract: In this paper, we explore the feasibility of using a transformer-based spatiotemporal attention network (STAN) for gradient-based time-series explanations. First, we trained the STAN model for video classification using global and local views of the data and weakly supervised labels on time-series data (i.e., the type of an activity). We then leveraged a gradient-based XAI technique (i.e., saliency maps) to identify salient frames of time-series data. In experiments on datasets of four medically relevant activities, the STAN model demonstrated its potential to identify important frames of videos.
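The core idea of the saliency-map step described above is to take the gradient of the predicted class score with respect to the input pixels of a video clip, then aggregate the gradient magnitudes into one importance score per frame. The sketch below illustrates that aggregation only; it is not the paper's implementation, and it assumes the per-pixel gradients have already been obtained from a trained model (e.g., via backpropagation). The channel-wise max of the absolute gradient follows the convention of Simonyan et al.'s saliency maps.

```python
def frame_saliency(gradients):
    """Aggregate per-pixel gradients into one saliency score per frame.

    `gradients` is a nested list of shape [T][H][W][C]: the gradient of the
    predicted class score with respect to each input pixel of a T-frame clip.
    Returns a length-T list of non-negative frame importance scores.
    """
    scores = []
    for frame in gradients:
        # Saliency uses gradient magnitude: take the channel-wise max of the
        # absolute gradient for each pixel, then average over the spatial grid.
        pixel_vals = [max(abs(c) for c in pixel)
                      for row in frame for pixel in row]
        scores.append(sum(pixel_vals) / len(pixel_vals))
    return scores


def salient_frames(gradients, k=3):
    """Indices of the k most salient frames, most important first."""
    scores = frame_saliency(gradients)
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
```

In practice the per-frame scores would be computed from real model gradients and the top-k frames presented to the user as the explanation of the activity classification.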
