Interpreting Neural Networks through Mahalanobis Distance

Published 25 Oct 2024 in cs.LG, cs.AI, and stat.ML (arXiv:2410.19352v1)

Abstract: This paper introduces a theoretical framework that connects neural network linear layers with the Mahalanobis distance, offering a new perspective on neural network interpretability. While previous studies have explored activation functions primarily for performance optimization, our work interprets these functions through statistical distance measures, a less explored area in neural network research. By establishing this connection, we provide a foundation for developing more interpretable neural network models, which is crucial for applications requiring transparency. Although this work is theoretical and does not include empirical data, the proposed distance-based interpretation has the potential to enhance model robustness, improve generalization, and provide more intuitive explanations of neural network decisions.
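
The paper's framework is not spelled out on this page, so the following is only an illustrative sketch of one standard way a linear layer can be read through Mahalanobis distance: under Gaussian class-conditional models with a shared covariance (the classical LDA setting), comparing squared Mahalanobis distances to class means reduces to a linear layer, because the quadratic term in the input is the same for every class. The variable names and Gaussian assumptions below are hypothetical and are not taken from the paper.

```python
# Illustrative sketch (not the paper's derivation): with a shared covariance,
# ranking classes by squared Mahalanobis distance to their means is equivalent
# to ranking them by the output of a linear layer W x + b.
import numpy as np

rng = np.random.default_rng(0)
d, k = 5, 3                                   # feature dim, number of classes (assumed)
mus = rng.normal(size=(k, d))                 # hypothetical class means
A = rng.normal(size=(d, d))
Sigma = A @ A.T + d * np.eye(d)               # shared covariance (symmetric positive definite)
Sigma_inv = np.linalg.inv(Sigma)

x = rng.normal(size=d)                        # a single input point

# Squared Mahalanobis distance from x to each class mean.
diffs = x - mus                                               # shape (k, d)
maha_sq = np.einsum("kd,de,ke->k", diffs, Sigma_inv, diffs)   # shape (k,)

# Equivalent linear-layer parameters:
#   W = mus @ Sigma_inv,  b_k = -0.5 * mu_k^T Sigma_inv mu_k
W = mus @ Sigma_inv
b = -0.5 * np.einsum("kd,kd->k", W, mus)
scores = W @ x + b

# Identity: maha_sq_k = x^T Sigma_inv x - 2 * scores_k, and the first term is
# class-independent, so the smallest distance is the largest linear score.
assert np.allclose(maha_sq, x @ Sigma_inv @ x - 2 * scores)
print(maha_sq.argmin(), scores.argmax())      # same class index
```

Under these assumptions the linear layer is doing distance-based classification in disguise, which is the kind of statistical reading of network layers the abstract points toward.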
