
SHAP Values for Model Interpretation

Updated 25 January 2026
  • SHAP values are local, model-agnostic feature-attribution scores based on Shapley values that quantify each feature's contribution to a model's prediction.
  • They leverage rigorous axioms and efficient algorithms like KernelSHAP and TreeSHAP to provide computationally tractable, interpretable insights.
  • Applied across domains from tabular data to deep learning, SHAP values support feature selection, global importance measures, and robustness in adversarial settings.

SHAP (SHapley Additive exPlanations) values are a family of local, model-agnostic feature-attribution scores that explain predictions of machine learning models by allocating the predicted value among input features according to cooperative game-theoretic Shapley values. They provide both rigorous axiomatic guarantees and practical algorithms for interpreting complex models in fields ranging from tabular learning to deep neural networks and sequential decision processes. The SHAP framework encompasses a growing range of computational, statistical, and application-specific advancements.

1. Formal Definition and Theoretical Foundations

Let $f:\mathbb{R}^p \to \mathbb{R}$ be a predictive model and $x = (x_1, \dots, x_p) \in \mathbb{R}^p$ an input instance. SHAP values attribute to each feature $i$ a value $\phi_i(f, x)$ representing its contribution to $f(x)$, using the classical Shapley value principle from cooperative game theory. The SHAP value for feature $i$ is

$$\phi_i(f, x) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(p - |S| - 1)!}{p!} \left[ f_{S \cup \{i\}}(x_{S \cup \{i\}}) - f_S(x_S) \right],$$

where $N = \{1, \dots, p\}$, $S$ ranges over subsets not containing $i$, $f_S(x_S)$ denotes the model's output when only the features in $S$ are "present" (the others marginalized out or set to a baseline), and the weight is proportional to the number of feature orderings in which $S$ precedes $i$.
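For small $p$, this formula can be evaluated directly. The sketch below (a toy model and hypothetical function names, not a library API) enumerates every coalition and uses the interventional value function, averaging "missing" features over a small background sample:

```python
import itertools
from math import factorial

def exact_shap(f, x, background, p):
    """Exact SHAP by enumerating every coalition S (feasible only for small p)."""
    def v(S):
        # Interventional value: features in S fixed to x, the rest drawn
        # from the background sample and averaged.
        return sum(f([x[j] if j in S else b[j] for j in range(p)])
                   for b in background) / len(background)

    phi = [0.0] * p
    for i in range(p):
        others = [j for j in range(p) if j != i]
        for r in range(p):
            for S in itertools.combinations(others, r):
                weight = factorial(r) * factorial(p - r - 1) / factorial(p)
                phi[i] += weight * (v(set(S) | {i}) - v(set(S)))
    return phi

# Toy model with one interaction term
f = lambda z: 2.0 * z[0] + z[1] * z[2]
x = [1.0, 1.0, 1.0]
background = [[0.0, 0.0, 0.0], [1.0, -1.0, 1.0]]
phi = exact_shap(f, x, background, p=3)

# Local accuracy: attributions sum to f(x) minus the background mean prediction
base = sum(f(b) for b in background) / len(background)
assert abs(sum(phi) - (f(x) - base)) < 1e-9
```

This brute-force version costs $O(2^p)$ coalition evaluations per feature, which is precisely the blow-up that the approximation algorithms of Section 2 are designed to avoid.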

Key properties inherited from Shapley values:

  • Local accuracy (efficiency): $\sum_{i=1}^p \phi_i(f, x) = f(x) - \mathbb{E}[f(x)]$.
  • Consistency (monotonicity): If a feature's marginal contribution increases in a model, its SHAP value does not decrease.
  • Symmetry: Identically contributing features receive equal SHAP values.
  • Dummy: Features never affecting ff have zero SHAP value.
  • Additivity: Attributions for $f+g$ are the sums of the attributions for $f$ and $g$.

The conditional expectation $f_S(x_S)$ is often operationalized via the "interventional" or "background" approach, marginalizing out missing features over the empirical distribution (Kraev et al., 2024, Zeng, 2024, 2207.14490).

2. Computational Methodologies and Algorithmic Advances

Computing SHAP values exactly requires exponential time in the number of features; tractable approximations and closed-form strategies have been devised for practical use.

KernelSHAP is a model-agnostic approximation: it samples random subsets $S$, evaluates the model on the corresponding masked inputs, and solves a weighted least-squares regression with Shapley kernel weights to produce estimates of $\phi_i$ (Hamilton et al., 2023).
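The regression view can be made concrete in a few lines. The sketch below (hypothetical names) enumerates every coalition instead of sampling, which is feasible for small $p$, and approximates the efficiency constraint with a large penalty weight on the empty and full coalitions:

```python
import itertools
import numpy as np
from math import comb

def kernel_shap(f, x, background, p):
    """KernelSHAP as weighted least squares over binary coalition vectors.
    For small p, all coalitions are enumerated instead of Monte Carlo sampled."""
    base = np.mean([f(b) for b in background])
    Z, y, w = [], [], []
    for r in range(p + 1):
        for S in itertools.combinations(range(p), r):
            z = np.zeros(p)
            z[list(S)] = 1.0
            # Coalition value: features in S fixed to x, rest averaged over background
            val = np.mean([f([x[j] if j in S else b[j] for j in range(p)])
                           for b in background])
            if 0 < r < p:
                kernel = (p - 1) / (comb(p, r) * r * (p - r))  # Shapley kernel weight
            else:
                kernel = 1e6  # large weight approximates the hard efficiency constraint
            Z.append(z); y.append(val - base); w.append(kernel)
    sw = np.sqrt(np.array(w))[:, None]
    phi, *_ = np.linalg.lstsq(np.array(Z) * sw,
                              np.array(y) * sw.ravel(), rcond=None)
    return phi

f = lambda z: 2.0 * z[0] + z[1] * z[2]
phi = kernel_shap(f, x=[1.0, 1.0, 1.0],
                  background=[[0.0, 0.0, 0.0], [1.0, -1.0, 1.0]], p=3)
```

With full enumeration, this regression recovers the exact Shapley values; the practical algorithm replaces the enumeration with a tractable number of sampled coalitions.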

TreeSHAP leverages the structure of decision trees and ensembles to compute all $\phi_i$ for one sample in $O(TL^2)$ time ($T$ = number of trees, $L$ = maximum leaves per tree) using dynamic programming and path-counting, and is exact for tree models (Mitchell et al., 2020, 2207.14490, Yang, 2021). Accelerations include:

  • Fast TreeSHAP v1 significantly reduces dynamic programming overhead via path subset pruning.
  • Fast TreeSHAP v2 precomputes combinatorial contributions per path, further improving throughput for batch explanations at the cost of increased memory (Yang, 2021).
  • GPUTreeShap achieves up to $10$–$100\times$ speedups via parallel GPU implementations (Mitchell et al., 2020).

WOODELF unifies path-dependent and background SHAP computations in tree ensembles through pseudo-Boolean encoding, enabling linear-time evaluation on both CPUs and GPUs, with support for interaction and Banzhaf values (Nadel et al., 12 Nov 2025).

Fourier- and Spectral SHAP: Models with underlying low-degree or sparse spectral structure allow closed-form or near-closed-form computation of SHAP values. Sparse Fourier expansions (Walsh–Hadamard or general tensor-product bases) enable linear-time SHAP evaluation given a compact surrogate, with rigorous error bounds backed by Lipschitz continuity and concentration inequalities (Gorji et al., 2024, Morales, 31 Oct 2025).

Choquet/Interaction-based Approximations: By fitting $k$-additive Choquet integrals, SHAP and Shapley interaction indices can be approximated at cost polynomial in the number of features for moderate $k$ (Pelegrina et al., 2022).

3. Practical Applications and Interpretability

SHAP values support a broad range of interpretability tasks:

  • Feature Selection: SHAP-based approaches (e.g., shap-select) perform feature selection in tabular machine learning by regressing the target on SHAP values and dropping features with non-significant or negative coefficients, yielding efficient and interpretable subsets with minimal retraining (Kraev et al., 2024). Feature reselection under secondary constraints (fairness, robustness) can be efficiently performed by SHAP plus correlation grouping (REFRESH), sidestepping full retraining (Sharma et al., 2024).
  • Global Importance and Aggregation: Averaging $|\phi_i|$ over data points provides global feature importances, but naïve averaging on the empirical support may not guarantee safe feature elimination in all models. Computing such aggregates on the product of marginals (via random column permutation) ensures theoretical soundness for feature pruning (Bhattacharjee et al., 29 Mar 2025).
  • Text, Vision, and Black-Box Models: SHAP values extend the local additive explanation paradigm to CNN-based text models by computing SHAP for max-pooled n-gram activations, offering local and global phrase-level explanations (Zhao et al., 2020). In computer vision, SHAP-guided adversarial attacks leverage attribution saliencies to construct robust, imperceptible image perturbations, outperforming gradient-based baselines in several settings (Mollard et al., 15 Jan 2026).
  • Clustering and Pathway Discovery: Clustering SHAP vectors over samples enables the discovery of distinct decision pathways—even when predictions are the same—revealing explanatory heterogeneity in both simulated and clinical (e.g., Alzheimer's) data (Lin et al., 9 Oct 2025).
  • Trustable and Axiomatically Compliant SHAP Scores: Pathological examples show that classical SHAP scores can sometimes misattribute importance to irrelevant features. Modifications using binarized response functions restore provable relevance compliance while retaining the Shapley axioms (Letoffe et al., 2024).
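The aggregation step in the second bullet above is straightforward to sketch. The toy below (hypothetical names; exact enumeration rather than a production explainer, and the data itself as the background) turns per-sample attributions into a global ranking, with the dummy axiom guaranteeing an exactly zero score for an unused feature:

```python
import itertools
from math import factorial

def shap_values(f, x, background, p):
    """Exact SHAP by coalition enumeration (small p only)."""
    def v(S):
        return sum(f([x[j] if j in S else b[j] for j in range(p)])
                   for b in background) / len(background)
    phi = [0.0] * p
    for i in range(p):
        for r in range(p):
            for S in itertools.combinations([j for j in range(p) if j != i], r):
                w = factorial(r) * factorial(p - r - 1) / factorial(p)
                phi[i] += w * (v(set(S) | {i}) - v(set(S)))
    return phi

# Feature 1 drives the model, feature 2 matters weakly, feature 0 is a dummy
f = lambda z: 4.0 * z[1] + 0.5 * z[2]
data = [[1.0, 2.0, -1.0], [0.0, -1.0, 1.0], [1.0, 0.5, 0.0], [-1.0, 1.0, 2.0]]
p = 3

# Global importance: mean |phi_i| across the dataset
per_sample = [shap_values(f, x, data, p) for x in data]
global_imp = [sum(abs(s[i]) for s in per_sample) / len(data) for i in range(p)]
ranking = sorted(range(p), key=lambda i: -global_imp[i])
```

As the section notes, such averages on the empirical support alone do not license safe feature elimination in general; the extended-support aggregation of Bhattacharjee et al. addresses that gap.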

4. Statistical, Spectral, and Causal Insights

SHAP and Partial Dependence: In fully additive models (e.g., tree stumps or ensembles with interaction constraints), SHAP dependence plots for a feature are vertically shifted versions of the corresponding partial dependence plots; their shapes coincide, confirming their interpretation as main effects when interaction is absent (2207.14490).
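This relationship is easy to verify numerically. The sketch below (a toy two-feature additive model with hypothetical names) computes exact interventional SHAP and checks that the SHAP value equals the partial dependence at $x_i$ minus its background mean, i.e., the two curves differ only by a vertical shift:

```python
# Additive toy model: no interactions, so SHAP main effects should
# match partial dependence up to a vertical shift.
g1 = lambda t: t ** 2
g2 = lambda t: 3.0 * t
f = lambda z: g1(z[0]) + g2(z[1])

background = [[-1.0, 0.0], [0.0, 1.0], [2.0, -1.0], [1.0, 2.0]]

def v(S, x):
    # Interventional value function over the background
    return sum(f([x[j] if j in S else b[j] for j in range(2)])
               for b in background) / len(background)

def shap2(x):
    # Exact Shapley for p = 2: average the two feature orderings
    phi0 = 0.5 * ((v({0}, x) - v(set(), x)) + (v({0, 1}, x) - v({1}, x)))
    phi1 = 0.5 * ((v({1}, x) - v(set(), x)) + (v({0, 1}, x) - v({0}, x)))
    return [phi0, phi1]

def pdp(i, t):
    # Partial dependence: fix feature i at t, average the other feature out
    return sum(f([t if j == i else b[j] for j in range(2)])
               for b in background) / len(background)

x = [2.0, 1.0]
phi = shap2(x)
shift = sum(pdp(0, b[0]) for b in background) / len(background)
# SHAP dependence = PDP shifted by a constant (its background mean)
assert abs(phi[0] - (pdp(0, x[0]) - shift)) < 1e-9
```

Once an interaction term is added to `f`, the two curves diverge, which is exactly the diagnostic use of comparing them.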

Fourier/Spectral Interpretations: SHAP attributions can be understood as linear functionals of a model's Fourier coefficients, with the SHAP value of feature $i$ at instance $x^*$ expressible as

$$\phi_i(f; x^*) = \sum_{k:\, k_i \neq 0} \hat{f}(k)\, \frac{\Psi_k(x^*)}{d(k)},$$

where $\Psi_k$ is the orthonormal basis function, $\hat{f}(k)$ its coefficient, and $d(k)$ its interaction order. This spectral perspective yields deterministic and high-probability error bounds under model truncation or approximation and enables efficient surrogate-based computation with tight control over expansion error (Morales, 31 Oct 2025, Gorji et al., 2024).
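For Boolean inputs with a uniform background, this identity can be checked against brute-force enumeration. In the sketch below (a toy function; the parity basis on the hypercube stands in for the general tensor-product bases of the cited works), the Fourier formula with $d(k) = |S|$ reproduces the exact interventional SHAP values:

```python
import itertools
from math import factorial, prod

p = 3
# Uniform background = all vertices of the {-1, +1}^3 hypercube
cube = [list(t) for t in itertools.product([-1.0, 1.0], repeat=p)]

# Toy function with interactions of orders 1, 2, and 3
f = lambda z: 1.5 * z[0] + 0.5 * z[0] * z[1] - 2.0 * z[0] * z[1] * z[2]

def shap_bruteforce(x):
    def v(S):
        return sum(f([x[j] if j in S else b[j] for j in range(p)])
                   for b in cube) / len(cube)
    phi = [0.0] * p
    for i in range(p):
        for r in range(p):
            for S in itertools.combinations([j for j in range(p) if j != i], r):
                w = factorial(r) * factorial(p - r - 1) / factorial(p)
                phi[i] += w * (v(set(S) | {i}) - v(set(S)))
    return phi

def shap_fourier(x):
    # phi_i(x) = sum over frequencies S containing i of f_hat(S) * chi_S(x) / |S|
    phi = [0.0] * p
    for r in range(1, p + 1):
        for S in itertools.combinations(range(p), r):
            chi = lambda z: prod(z[j] for j in S)           # parity basis function
            f_hat = sum(f(b) * chi(b) for b in cube) / len(cube)  # Fourier coefficient
            for i in S:
                phi[i] += f_hat * chi(x) / r                # d(k) = interaction order
    return phi

x = [1.0, -1.0, 1.0]
phi_bf, phi_sp = shap_bruteforce(x), shap_fourier(x)
assert all(abs(a - b) < 1e-9 for a, b in zip(phi_bf, phi_sp))
```

The key point the identity makes precise: a frequency of interaction order $d$ spreads its contribution equally over the $d$ features it involves.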

Axiomatic and Algebraic Structure: The "Shapley Lie algebra" framework recasts the set of value operators arising in SHAP decompositions as a solvable Lie algebra, yielding invertibility, upper triangularization, and robust guarantees under manipulation of the support or background distribution (Bhattacharjee et al., 29 Mar 2025).

5. Comparative Benchmarks and Model-specific SHAP Extensions

Performance Benchmarks: In high-dimensional tabular settings (e.g., fraud detection with 30+ features and 285,000 samples), shap-select achieves competitive accuracy and F1 while delivering a $5\times$–$100\times$ runtime advantage over methods requiring repeated model retraining (e.g., Boruta, HISEL) (Kraev et al., 2024).

Method        #Features   Accuracy   F1-Score   Runtime (s)
shap-select   6           0.999596   0.870056   21.8
Boruta        11          0.999631   0.881356   95.85
HISEL         30          0.999561   0.858757   109.03
No selection  30          0.999561   0.858757   1.56

Large-Scale Acceleration: WOODELF computes full background SHAP for tree models over millions of rows in under 20 seconds on modern GPUs, a $20\times$ to $5{,}000\times$ speedup relative to prior state-of-the-art (Nadel et al., 12 Nov 2025).

Two-Part and Composed Models: mSHAP provides an efficient, additive approximation for multiplicative models (e.g., insurance ratemaking), combining the SHAP values of the constituent parts, with negligible loss in fidelity relative to kernel-based baselines and vastly improved runtime (Matthews et al., 2021).

6. Limitations, Open Problems, and Future Directions

SHAP values provide theoretically optimal local explanations under broad conditions, but several subtleties and limitations remain:

  • Off-manifold masking and safe pruning: Zero aggregate SHAP on training support does not imply feature irrelevance globally; extended-support aggregation is necessary for theoretical soundness (Bhattacharjee et al., 29 Mar 2025).
  • Numerical specification and class labeling: Standard SHAP can fail to reflect true feature relevance, especially in deterministic or numerically degenerate classifiers. Trustable SHAP variants using class label binarization restore ordinal correctness (Letoffe et al., 2024).
  • Collinearity: Iterative regularization or grouping is necessary to avoid eliminating all collinear predictors in feature-selection workflows (Kraev et al., 2024).
  • Model class restrictions: Tree-specific and spectral acceleration techniques may not yet fully generalize to deep or kernel-based models; extending efficient SHAP computation to these domains remains an active area (Kraev et al., 2024, Nadel et al., 12 Nov 2025).
  • Conditional expectation vs. marginalization: Most practical SHAP implementations use marginals for computational reasons; in the presence of highly dependent features, this can yield misleading attributions.

Future work targets hybrid methods that combine feedback-driven language modeling for interpretability (Zeng, 2024), spectral/conditional representation learning, domain adaptation, and integration with fairness and robustness guarantees, further increasing the power and transparency of SHAP-based attribution across model types and data modalities.

