- The paper argues that the choice between pre- and post-softmax gradients materially affects the reliability of saliency maps in attribution methods, with gradient vanishing in saturated outputs a key concern.
- It reviews multiple techniques, including Grad-CAM and Integrated Gradients, highlighting how each selects the score from which gradients are computed.
- The study stresses the need for a careful choice of gradient source and points to log-softmax scores as a potential alternative for robust interpretation.
Evaluating the Use of Pre or Post-Softmax Scores in Gradient-Based Attribution Methods
The paper "Pre or Post-Softmax Scores in Gradient-based Attribution Methods, What is Best?" examines the practical differences, advantages, and disadvantages of using pre-softmax versus post-softmax scores in gradient-based attribution methods. These methods are central to Explainable Artificial Intelligence (XAI), particularly for neural networks used as classifiers.
Key Insights and Methodological Variations
The authors begin by contextualizing the role of attribution algorithms in XAI, highlighting how they quantify the influence of each input feature on the network's outputs. Gradients, taken with respect to the network's scores, are essential to these methods; classifiers such as those in the VGG family end in a final layer whose outputs pass through a softmax.
Pre-softmax scores are the network's raw outputs (logits), which the softmax transforms into the probabilities used with loss functions during training. The paper asks whether it is more advantageous to compute gradients from pre-softmax or post-softmax scores and reviews various attribution methods:
- Grad-CAM processes pre-softmax scores, although variations exist using post-softmax scores. Issues such as gradient vanishing in saturated outputs are noted.
- Integrated Gradients (IG) inherently uses post-softmax scores to retain model-agnostic properties.
- RSI Grad-CAM, utilizing post-softmax scores, employs interpolation to mitigate vanishing gradients, differing from Grad-CAM.
- Grad-CAM++ and Grad-CAM+ are examined in implementations that apply exponential functions to pre-softmax scores.
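The pre- versus post-softmax distinction above can be illustrated on a toy linear classifier (a sketch with hypothetical weights and inputs, not code from the paper): the pre-softmax gradient of a class score with respect to the input is simply a row of the weight matrix, while the post-softmax gradient must pass through the softmax Jacobian.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical tiny "classifier": 2 input features, 3 classes.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))
x = np.array([1.0, -0.5])

z = W @ x        # pre-softmax scores (logits)
p = softmax(z)   # post-softmax scores (probabilities)

c = 0  # class whose score we attribute
# Gradient of the PRE-softmax score z_c with respect to the input x:
grad_pre = W[c]

# Gradient of the POST-softmax score p_c with respect to x, via the
# chain rule through the softmax: dp_c/dz_j = p_c * (delta_cj - p_j).
onehot = (np.arange(3) == c).astype(float)
dp_dz = p[c] * (onehot - p)
grad_post = dp_dz @ W
```

Note that `grad_post` is scaled by probabilities, so it shrinks when the output saturates, which is the vanishing-gradient issue the paper discusses.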
Analytical Framework and Theoretical Implications
The authors delve into the mathematical underpinnings, analyzing how gradients propagate through the softmax layer and exploring the relationship between pre- and post-softmax score gradients. The implications for real-world applications are evident:
- Post-softmax gradients correlate directly with the model's information gain and prediction impact.
- Saliency maps derived from these gradients can differ significantly in interpretation depending on whether pre- or post-softmax gradients are chosen.
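The relationship between the two gradient choices comes down to the softmax Jacobian, dp_i/dz_j = p_i(δ_ij − p_j). A minimal numerical check of that identity (with hypothetical logits, not the paper's data):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical logits for a 4-class classifier.
z = np.array([2.0, 0.5, -1.0, 0.3])
p = softmax(z)

# Analytic Jacobian of softmax: dp_i/dz_j = p_i * (delta_ij - p_j).
J = np.diag(p) - np.outer(p, p)

# Verify against central finite differences.
eps = 1e-6
J_num = np.zeros((4, 4))
for j in range(4):
    dz = np.zeros(4)
    dz[j] = eps
    J_num[:, j] = (softmax(z + dz) - softmax(z - dz)) / (2 * eps)
```

Each row of `J` sums to zero, reflecting that the probabilities always sum to one, which is one structural reason post-softmax gradients behave differently from raw-score gradients.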
The paper reveals that equivalent post-softmax outputs in different models can mask varying pre-softmax gradients, raising questions about the robustness and consistency of attribution results across different model instances.
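The masking effect described above can be reproduced in a few lines: two hypothetical models whose logits differ by an input-dependent constant produce identical post-softmax outputs but different pre-softmax gradients (a sketch, not the paper's experiment):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(1)
W = rng.normal(size=(3, 2))   # hypothetical weights of model A
v = rng.normal(size=2)        # arbitrary direction used by model B
x = np.array([0.7, -1.2])

z1 = W @ x             # model A's logits
z2 = W @ x + (v @ x)   # model B adds the same input-dependent offset
                       # to every class score

# Identical post-softmax outputs (softmax is shift-invariant)...
same_probs = np.allclose(softmax(z1), softmax(z2))

# ...but different pre-softmax gradients for class 0:
grad_pre_A = W[0]
grad_pre_B = W[0] + v
```

Any pre-softmax-based attribution would assign different saliency to the two models even though their predicted probabilities agree everywhere.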
Impact on Loss Functions
The paper examines how these gradients relate to the cross-entropy loss functions prevalent in training. It demonstrates that only the post-softmax gradients contribute to minimizing the loss, suggesting that they better capture the impact of features on classification outputs. This insight challenges assumptions about gradient utility and calls for closer consideration of gradient flow in model explainability.
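The link to cross-entropy follows from the standard identity dL/dz = p − y for L = −log p_t with one-hot target y: the loss gradient reaches the logits only through the softmax. A quick numerical check with hypothetical logits:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

z = np.array([1.5, -0.5, 0.2])  # hypothetical logits
t = 0                           # true class index
p = softmax(z)

# Cross-entropy loss L = -log p_t; its gradient with respect to the
# logits has the closed form p - y, where y is the one-hot target.
y = np.eye(3)[t]
grad_ce = p - y

# Verify with central finite differences of L = -log softmax(z)[t].
eps = 1e-6
num = np.array([
    (-np.log(softmax(z + eps * np.eye(3)[j])[t])
     + np.log(softmax(z - eps * np.eye(3)[j])[t])) / (2 * eps)
    for j in range(3)
])
```

Because this gradient is expressed entirely in post-softmax quantities, it is the post-softmax scores that drive loss minimization during training.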
Log-Softmax as an Emerging Alternative
Interestingly, log-softmax scores are presented as a promising yet underexplored alternative. Initial experiments suggest little difference from plain post-softmax scores, warranting further exploration within new attribution frameworks.
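For intuition on why log-softmax can behave much like post-softmax at saturated outputs, note that dp_t/dz_t = p_t(1 − p_t) while d(log p_t)/dz_t = 1 − p_t; near p_t ≈ 1 these differ only by the factor p_t ≈ 1. A small sketch with hypothetical saturated logits:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical saturated logits: very confident in class 0.
z = np.array([12.0, 0.0, 0.0])
p = softmax(z)
t = 0

# Gradient of the post-softmax score p_t w.r.t. its own logit:
g_post = p[t] * (1 - p[t])
# Gradient of the log-softmax score log(p_t) w.r.t. the same logit:
g_log = 1 - p[t]
# At saturation both are tiny and differ only by the factor p_t.
```

The two gradients diverge mainly when p_t is small, where log-softmax keeps an O(1) gradient while post-softmax shrinks with p_t; whether that matters for attribution quality is exactly the open question the paper flags.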
Conclusion and Forward-looking Perspectives
In conclusion, the paper recommends favoring post-softmax scores in gradient-based attribution, especially for methods susceptible to vanishing gradients. Methods that already mitigate this issue, such as RSI Grad-CAM, still benefit from the robustness that post-softmax scores provide. The authors propose directing additional attention to log-softmax scores, which may offer unexplored advantages.
This paper contributes meaningful insights that refine the understanding of score selection in neural network interpretation, urging further research in attribution methodology and in the application of log-softmax scores toward more nuanced explanations of AI behavior.