Local and Global Explanation Methods
- Local and global explanation methods are formal approaches that decompose model predictions into individual feature contributions and interaction effects for enhanced interpretability.
- They link partial dependence plots and interventional SHAP values through an ANOVA-style functional decomposition, clarifying both local and global model behavior.
- They support fairness interventions by enabling the precise removal of protected feature influences, thereby isolating and mitigating bias in model predictions.
Local and global explanation methods are formal approaches used in interpretable machine learning to elucidate how complex models make decisions. Local explanation methods characterize the prediction for an individual instance, typically by attributing contributions to features relevant for that point. Global explanation methods aim to summarize how a model behaves on average, across the entire input domain, by decomposing its prediction function or aggregating local mechanisms. The distinction, and the methodologies that bridge it, are central to modern explainable AI, spanning regression, classification, time-series, text, and vision domains.
1. Formal Decomposition and Marginal Identification
Functional decomposition is a rigorous approach to unify local and global explanation methods by expressing any prediction function $f:\mathbb{R}^d \to \mathbb{R}$ as a sum over all feature subsets:

$$ f(x) = \sum_{S \subseteq \{1,\dots,d\}} f_S(x_S). $$

Here, $f_\emptyset$ is the intercept, $f_{\{j\}}(x_j)$ encodes the main effect of feature $j$, and $f_S(x_S)$ for $|S| \geq 2$ encodes pure interactions. This ANOVA-style expansion is not unique unless an additional identification constraint is imposed.
The marginal-identification constraint requires that for all $S$ and all $j \in S$,

$$ \int f_S(x_S)\, p_j(x_j)\, dx_j = 0, $$

where $p_j$ is the marginal density of $X_j$. This uniquely pins down each $f_S$ via Möbius inversion:

$$ f_S(x_S) = \sum_{U \subseteq S} (-1)^{|S|-|U|}\, \mathbb{E}\big[f(x_U, X_{-U})\big], $$

where the expectation takes $X_{-U}$ from the product of marginals $\prod_{j \notin U} p_j$.
This identification recovers core local and global quantities. Specifically, for any singleton $S = \{j\}$, the partial dependence plot (PDP) coincides with $f_\emptyset + f_{\{j\}}(x_j)$, while interventional SHAP values are weighted sums over main-effect and interaction components (Hiabu et al., 2022).
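As a concrete illustration, the Möbius inversion above can be evaluated exactly for a tiny model with two binary features and assumed independent marginals. The model and probabilities below are hypothetical, chosen only so that every expectation can be enumerated by hand:

```python
import itertools

# Hypothetical toy model on two binary features (assumption for illustration).
def f(x1, x2):
    return x1 + 2 * x2 + 3 * x1 * x2

p = {1: 0.5, 2: 0.3}  # assumed marginal probabilities P(X_j = 1)

def marginal_expectation(fixed):
    """g_U(x_U) = E[f(x_U, X_{-U})]: integrate out the unfixed features
    against the product of their marginal densities."""
    free = [j for j in (1, 2) if j not in fixed]
    total = 0.0
    for values in itertools.product([0, 1], repeat=len(free)):
        point = dict(fixed)
        prob = 1.0
        for j, v in zip(free, values):
            point[j] = v
            prob *= p[j] if v == 1 else 1 - p[j]
        total += prob * f(point[1], point[2])
    return total

def component(S, x):
    """f_S(x_S) via Mobius inversion over the subsets U of S."""
    out = 0.0
    for r in range(len(S) + 1):
        for U in itertools.combinations(S, r):
            out += (-1) ** (len(S) - len(U)) * marginal_expectation({j: x[j] for j in U})
    return out

x = {1: 1, 2: 1}
parts = {S: component(S, x) for S in [(), (1,), (2,), (1, 2)]}
# The components sum back to the prediction f(1, 1) = 6.
assert abs(sum(parts.values()) - f(1, 1)) < 1e-9
```

The same enumeration scales to any low-order model; only the subset loop grows with the interaction order.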
2. Relationships with Core Attribution Methods
2.1 Partial Dependence and Feature Effects
The partial dependence plot for feature $j$ is:

$$ \mathrm{PD}_j(x_j) = \mathbb{E}\big[f(x_j, X_{-j})\big], \qquad X_{-j} \sim \prod_{i \neq j} p_i. $$

Under marginal identification, this simplifies to $\mathrm{PD}_j(x_j) = f_\emptyset + f_{\{j\}}(x_j)$. For any feature subset $S$,

$$ \mathrm{PD}_S(x_S) = \sum_{U \subseteq S} f_U(x_U). $$
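The identity $\mathrm{PD}_j = f_\emptyset + f_{\{j\}}$ can be checked on a toy model; the function and marginal probabilities below are hypothetical, and exact expectations stand in for the usual Monte Carlo averages:

```python
# Hypothetical two-feature model with binary inputs (assumption for illustration).
def f(x1, x2):
    return x1 + 2 * x2 + 3 * x1 * x2

p = {1: 0.5, 2: 0.3}  # assumed marginal probabilities P(X_j = 1)

def pdp_1(x1):
    """PD_1(x1) = E[f(x1, X_2)] under the marginal of X_2."""
    return (1 - p[2]) * f(x1, 0) + p[2] * f(x1, 1)

# Under marginal identification: f_emptyset = E[PD_1(X_1)], and the main
# effect f_{1}(x1) = PD_1(x1) - f_emptyset integrates to zero against p_1.
f_empty = (1 - p[1]) * pdp_1(0) + p[1] * pdp_1(1)
f_1 = {x1: pdp_1(x1) - f_empty for x1 in (0, 1)}
assert abs((1 - p[1]) * f_1[0] + p[1] * f_1[1]) < 1e-9  # marginal identification
```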
2.2 Interventional SHAP Values
The interventional SHAP value for feature $j$ at point $x$ is:

$$ \phi_j(x) = \sum_{S \ni j} \frac{f_S(x_S)}{|S|}. $$

Each interaction term $f_S$ that involves $j$ apportions $1/|S|$ of its value to feature $j$ (Hiabu et al., 2022). This recasting connects Shapley values, previously motivated through game theory, to a functional expansion, exposing the direct tie between local and global explanations.
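Given the components $f_S$ at a point, the interventional SHAP values follow directly from the $1/|S|$ apportioning rule. A minimal sketch, with hypothetical component values:

```python
# Hypothetical component values f_S(x_S) at one point x, from some
# marginally-identified decomposition of a two-feature model.
components = {(): 1.55, (1,): 0.95, (2,): 2.45, (1, 2): 1.05}

def shap_values(components):
    """Interventional SHAP: each f_S gives 1/|S| of its value to every j in S."""
    phi = {}
    for S, value in components.items():
        for j in S:
            phi[j] = phi.get(j, 0.0) + value / len(S)
    return phi

phi = shap_values(components)
# Efficiency: the SHAP values sum to f(x) minus the intercept f_emptyset.
assert abs(sum(phi.values()) - (sum(components.values()) - components[()])) < 1e-9
```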
2.3 Feature Importance Measures
Standard SHAP importance, typically $I_j = \mathbb{E}\,|\phi_j(X)|$, mixes main and interaction effects. The decomposition enables refined importance measures at the component level:

$$ I_S = \mathbb{E}\,\big|f_S(X_S)\big|. $$

For order separation, the importance of feature $j$ at interaction order $q$ is

$$ I_j^{(q)} = \sum_{S \ni j,\ |S| = q} \mathbb{E}\,\big|f_S(X_S)\big|. $$
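Order separation is a simple grouped aggregation over component importances. A sketch with hypothetical per-component values:

```python
# Hypothetical per-component importances I_S = E|f_S|, keyed by index set.
importances = {(1,): 0.9, (2,): 2.1, (1, 2): 0.8, (2, 3): 0.4}

def importance_by_order(importances, j):
    """Sum I_S over all S containing feature j, grouped by interaction order |S|."""
    out = {}
    for S, val in importances.items():
        if j in S:
            out[len(S)] = out.get(len(S), 0.0) + val
    return out

# Feature 2: main-effect importance 2.1; order-2 importance 0.8 + 0.4 = 1.2.
by_order = importance_by_order(importances, 2)
```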
3. Algorithms for Exact and Approximate Functional Decomposition
The number of components is generally exponential in the dimension $d$, but many machine learning models admit low-dimensional base learners (e.g., tree ensembles with interaction order $k \ll d$). For such models, exact functional decomposition is feasible:
- For each tree $t$, identify the split set $S_t$ of features the tree actually uses.
- For each $U \subseteq S_t$, recursively compute the marginal tree value $\mathbb{E}[f_t(x_U, X_{-U})]$, an expectation over the unselected features.
- For each $S \subseteq S_t$, assemble $f_S$ from these marginals via signed (Möbius) sums.
- For XGBoost, Algorithm 3 describes efficient aggregation via tables; for random planted forests, grid-based marginals suffice.
The cost is polynomial in the number of trees and data points and exponential only in the interaction order $k$, not in the ambient dimension (Hiabu et al., 2022).
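The steps above can be sketched for a single axis-aligned tree over binary features with assumed independent marginals. This is a minimal illustration, not the paper's Algorithm 3, and the leaf table and probabilities are hypothetical:

```python
import itertools

p = {1: 0.5, 2: 0.3}  # assumed marginal probabilities P(X_j = 1)

# Hypothetical tree as a list of (leaf_condition, leaf_value) pairs; each
# condition fixes the split features of the tree (here S_t = {1, 2}).
tree = [
    ({1: 0, 2: 0}, 0.0), ({1: 0, 2: 1}, 2.0),
    ({1: 1, 2: 0}, 1.0), ({1: 1, 2: 1}, 6.0),
]

def tree_marginal(fixed):
    """E[f_t(x_U, X_{-U})]: weight each leaf by the marginal probability of
    its unfixed split conditions; leaves contradicting `fixed` get weight 0."""
    total = 0.0
    for cond, value in tree:
        w = 1.0
        for j, v in cond.items():
            if j in fixed:
                if fixed[j] != v:
                    w = 0.0
                    break
            else:
                w *= p[j] if v == 1 else 1 - p[j]
        total += w * value
    return total

def tree_component(S, x):
    """Assemble f_S for this tree by signed (Mobius) sums of its marginals."""
    out = 0.0
    for r in range(len(S) + 1):
        for U in itertools.combinations(S, r):
            out += (-1) ** (len(S) - len(U)) * tree_marginal({j: x[j] for j in U})
    return out
```

Summing `tree_component` across an ensemble yields the components of the full model; the subset enumeration is exponential only in each tree's split-set size.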
4. Post-hoc De-biasing and Fairness Interventions
Bias removal targeting protected features relies on the additive decomposition. Consider $G \subseteq \{1,\dots,d\}$ as a block of protected features (e.g., gender) to eliminate from the model:

$$ f^{\text{debiased}}(x) = \sum_{S:\ S \cap G = \emptyset} f_S(x_S). $$

Thus, forming a debiased predictor entails simply dropping every component $f_S$ with $S \cap G \neq \emptyset$ from the sum. This approach removes both direct and indirect bias via protected features (Hiabu et al., 2022).
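Once the components are available, the intervention itself is a one-line filter. The component values and protected index below are hypothetical:

```python
# Hypothetical decomposition components; feature 3 plays the protected role.
components = {(): 1.0, (1,): 0.5, (2,): -0.3, (3,): 0.8, (1, 3): 0.2}
protected = {3}

def debias(components, protected):
    """Keep only components whose index set avoids the protected block G."""
    return {S: v for S, v in components.items() if not protected & set(S)}

clean = debias(components, protected)
# Both the direct term f_{3} and the interaction f_{1,3} are dropped.
assert set(clean) == {(), (1,), (2,)}
```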
5. Key Experimental Results: Fidelity and Separation of Effects
Empirical studies verify the nuances of functional decomposition:
- Toy problems: For a simple model with main effects and an interaction between correlated features, standard interventional SHAP may mask main effects due to cancellation. The decomposition yields distinct components $f_{\{1\}}$, $f_{\{2\}}$, $f_{\{1,2\}}$, which can be inspected separately.
- Real data: On the bike-sharing dataset, decomposing a fitted XGBoost model reveals strong interaction terms (e.g., hour $\times$ workingday) that standard SHAP summary plots cannot separate.
- Feature importance simulation: When interactions involve specific features, the decomposed importances precisely attribute effects by interaction order.
- De-biasing: When retraining on a subset of unprotected features, standard refitting fails to remove indirect bias. Decomposition allows rigorous elimination, as shown on simulated salary data and UCI Adult.
6. Integration with Broader Explanation Frameworks
Functional decomposition under marginal identification creates a mathematical bridge between local and global explanations, encompassing:
- Local methods: SHAP, LIME, IG (Integrated Gradients) when interpreted via instance-level functional expansions (Tan et al., 2018, Schrouff et al., 2021).
- Global surrogates: Partial dependence plots, additive global explanations, distilled student models (Tan et al., 2018).
- Hybrid local-to-global: Aggregations and hierarchical clustering approaches that build global rules or proxies from local surrogates (Setzu et al., 2021, Seppäläinen et al., 2025, Ramamurthy et al., 2020, Visani et al., 2024).
- Algorithmic transparency: Accessible for exact calculation in low-dimensional/ensemble models, and compatible with rule extraction via logic programming (Takemura et al., 2024).
7. Practical Implications and Trade-offs
This unification addresses critical needs in explainable AI:
- Separates out main/interaction effects so practitioners can diagnose local anomalies, interaction-induced bias, or global functional structure.
- Enables robust feature importance assessments not confounded by interaction-induced cancellation.
- Provides actionable pipelines for post-hoc fairness interventions by component removal, rather than ambiguous retraining.
- Offers scalable algorithms amenable to tree ensembles and structured models.
The key limitation remains the exponential scaling of decomposition with dimension for general black-box models absent low-order structure. Approximate methods, surrogates, and rule-based reductions remain essential for scalability (Seppäläinen et al., 2025, Setzu et al., 2021).
In summary, local and global explanation methods are formalized and unified by functional decomposition with marginal identification constraints. This framework rigorously subsumes interventional SHAP, partial dependence, feature effect curves, and enables principled algorithms for bias reduction and feature importance. Empirical findings demonstrate increased fidelity, interpretability, and robustness, particularly for models and domains with rich interaction structures (Hiabu et al., 2022).