- The paper introduces LocalLSSFind, which provably recovers local signed features and interactions under a Locally Spiky Sparse model.
- It employs depth-weighted prevalence and local path prevalence metrics to extract and rank significant feature splits from Random Forests.
- Empirical results and a COMPAS case study demonstrate its potential for personalized explanations and model auditing in risk prediction.
Provable Recovery of Locally Important Signed Features and Interactions from Random Forest
Introduction and Motivation
The paper "Provable Recovery of Locally Important Signed Features and Interactions from Random Forest" (2512.11081) addresses the theoretical and algorithmic gap in local feature and interaction importance (FII) for Random Forests (RFs). While model-agnostic interpretability approaches such as SHAP and LIME provide functional decompositions for FII, they lack statistical guarantees regarding the recovery of true signal features and interactions—particularly with respect to their signed contributions in individual-level predictions. The authors introduce LocalLSSFind, an RF-specific, theoretically grounded local FII method that recovers signed signals and interactions of the underlying data-generating process under a Locally Spiky Sparse (LSS) model.
The LSS model assumes the regression function is a linear sum of Boolean (possibly discontinuous) interactions in thresholded feature space, thus providing a mathematically precise and practically motivated definition of signed feature interactions relevant at the local level. The paper’s results not only formalize which local signed feature patterns can be recovered, but also deliver conditions under which these can be provably identified from Random Forest predictions.
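To make the description above concrete, one way to write such a regression function is sketched below. The notation ($\beta_j$, $S_j^{\pm}$, $\gamma_k$, and the sign label $b$) is assumed here for illustration and may differ from the paper's own.

```latex
f(x) \;=\; \mathbb{E}[Y \mid X = x]
      \;=\; \beta_0 + \sum_{j=1}^{J} \beta_j \prod_{(k,\,b) \in S_j^{\pm}} \mathbf{1}_{k,b}(x),
\qquad
\mathbf{1}_{k,-}(x) = \mathbf{1}\{x_k \le \gamma_k\},
\quad
\mathbf{1}_{k,+}(x) = \mathbf{1}\{x_k > \gamma_k\}.
```

For example, with $J = 1$, $\beta_0 = 1$, $\beta_1 = 2$, $S_1^{\pm} = \{(1,-),(2,+)\}$, $\gamma_1 = 0.5$ and $\gamma_2 = 0.3$, the point $x^* = (0.2, 0.9)$ satisfies both threshold conditions, so $f(x^*) = 3$, whereas $x = (0.8, 0.9)$ fails the first condition and $f(x) = 1$. In this sketch the signed interaction is "active" only for the first point, which is what makes the notion local.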
Methodology and Theoretical Foundation
LocalLSSFind generalizes prior global, signed interaction-detection methods (iRF and LSSFind) to the local case. For a specific test point x∗, LocalLSSFind traverses each tree in the RF, extracts the path taken by x∗, and records "signed features": feature/direction pairs consisting of the split variable and the branch taken (≤ or > the split threshold) at each node. Only splits with impurity decrease above a threshold ϵ are considered, and only the first occurrence of a feature is tracked per path.
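The path-extraction step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Node` class and the `signed_path` function are hypothetical, and the sketch assumes that "first occurrence" means the first split on a feature that also passes the impurity-decrease threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical minimal tree node, not tied to any library's tree format."""
    feature: Optional[int] = None    # index of the split feature (None for a leaf)
    threshold: float = 0.0           # split threshold
    impurity_decrease: float = 0.0   # impurity decrease achieved by this split
    left: Optional["Node"] = None    # child taken when x[feature] <= threshold
    right: Optional["Node"] = None   # child taken when x[feature] > threshold


def signed_path(root, x, eps):
    """Return the signed features met by x on its root-to-leaf path.

    A signed feature is a pair (feature_index, sign): sign is -1 for the
    "<=" branch and +1 for the ">" branch. Splits whose impurity decrease
    is not above eps are skipped, and each feature is recorded only once.
    """
    signed = []
    seen = set()
    node = root
    while node.feature is not None:
        go_left = x[node.feature] <= node.threshold
        if node.impurity_decrease > eps and node.feature not in seen:
            seen.add(node.feature)
            signed.append((node.feature, -1 if go_left else +1))
        node = node.left if go_left else node.right
    return signed


# Toy tree: split on feature 0, then feature 1, then feature 0 again.
leaf = Node()
tree = Node(0, 0.5, 0.30,
            left=Node(1, 0.3, 0.20,
                      left=leaf,
                      right=Node(0, 0.2, 0.10, left=leaf, right=leaf)),
            right=leaf)
x_star = [0.4, 0.9]
print(signed_path(tree, x_star, eps=0.05))  # [(0, -1), (1, 1)]
```

The second split on feature 0 is ignored because that feature was already recorded higher up the path. Raising `eps` above 0.20 would also drop the split on feature 1.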
The method combines two prevalence statistics:
- Global, depth-weighted prevalence (DWP): Probability a signed feature set appears on a random path, with each path weighted by $2^{-d}$, where d is the depth of the path.
- Local path prevalence (PP): Probability the set appears specifically on the paths traversed by the test point across the ensemble.
LocalLSSFind returns all signed interactions for which both DWP and PP exceed data-driven thresholds, applying an intersection/minimality filter for interactions of variable order.
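The two statistics and the joint thresholding can be sketched as below. This is a simplified illustration under stated assumptions: the function names, the input layout (precomputed signed-feature sets per leaf and per test-point path), and the fixed thresholds `dwp_min` and `pp_min` are all hypothetical. The source describes the thresholds as data-driven, and the intersection/minimality filter is omitted here.

```python
def depth_weighted_prevalence(leaf_paths_per_tree, s):
    """Average over trees of the total 2^-depth weight of leaves whose path contains s.

    leaf_paths_per_tree: one list per tree of (depth, signed_feature_set) pairs,
    one pair per leaf.
    """
    s = frozenset(s)
    total = 0.0
    for leaves in leaf_paths_per_tree:
        total += sum(2.0 ** -depth for depth, feats in leaves if s <= feats)
    return total / len(leaf_paths_per_tree)


def path_prevalence(test_paths, s):
    """Fraction of trees in which the test point's own path contains s."""
    s = frozenset(s)
    return sum(s <= feats for feats in test_paths) / len(test_paths)


def select_interactions(candidates, leaf_paths_per_tree, test_paths, dwp_min, pp_min):
    """Keep candidates whose DWP and PP both exceed the given thresholds."""
    return [
        s for s in candidates
        if depth_weighted_prevalence(leaf_paths_per_tree, s) > dwp_min
        and path_prevalence(test_paths, s) > pp_min
    ]


# Two toy trees, each with one leaf at depth 1 and two leaves at depth 2.
A, B = (0, -1), (1, +1)  # signed features: x0 <= threshold, x1 > threshold
leaf_paths = [
    [(1, frozenset({(0, +1)})),
     (2, frozenset({A, (1, -1)})),
     (2, frozenset({A, B}))],
    [(1, frozenset({(1, -1)})),
     (2, frozenset({B, A})),
     (2, frozenset({B, (0, +1)}))],
]
# Signed features on the test point's path in each tree.
test_paths = [frozenset({A, B}), frozenset({A, B})]

candidates = [frozenset({A, B}), frozenset({(0, +1)})]
selected = select_interactions(candidates, leaf_paths, test_paths,
                               dwp_min=0.1, pp_min=0.5)
print(selected)
```

In this toy example the set containing only `(0, +1)` has a DWP of 0.375, higher than the 0.25 of the pair `{A, B}`, yet it never appears on the test point's paths, so its PP is 0 and it is filtered out. This illustrates why the method requires both statistics to be large.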
The principal theoretical contribution is a suite of asymptotic consistency theorems. Under natural regularity conditions (bounded response, uniform features, disjoint interactions, path depth tending to infinity, no bootstrapping), the method is shown to recover exactly the basic signed interactions (BSIs) relevant for x∗. That is, as n→∞, the set of interactions output by LocalLSSFind converges in probability to the true signed drivers for each prediction.
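One way to read this guarantee is sketched below. The symbols are assumed for illustration and are not taken from the paper: $\widehat{\mathcal{S}}_n(x^*)$ denotes the set of signed interactions returned by LocalLSSFind from a forest trained on $n$ samples, and $\mathcal{S}(x^*)$ denotes the set of BSIs relevant for $x^*$.

```latex
\lim_{n \to \infty} \mathbb{P}\Bigl( \widehat{\mathcal{S}}_n(x^*) = \mathcal{S}(x^*) \Bigr) = 1 .
```

Because both objects are finite sets, convergence in probability here amounts to exact recovery with probability tending to one.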
Numerical and Empirical Results
The theoretical results are corroborated by comprehensive simulation studies:
- LSS-Generated Data Recovery: Across a variety of signal strengths, numbers of interactions, and interaction orders, LocalLSSFind consistently achieves high accuracy in recovering local BSIs—even when the feature space is large and the sample size modest. Signed interaction PP values for true local signal are highly concentrated near 1, while for irrelevant interactions they are close to zero.

Figure 1: Relative frequency with which BSIs for the test point are included in the top 10 interactions according to the considered metrics.
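A metric of the kind reported in Figure 1 could be computed along the following lines. The exact evaluation protocol is not given in this summary, so the function `top_k_inclusion_rate`, its inputs, and the pooling over replicates are assumptions made for illustration.

```python
def top_k_inclusion_rate(rankings, true_bsis, k=10):
    """Fraction of true BSIs that appear among the top-k ranked interactions.

    rankings:  one list per replicate, interactions sorted from most to least important.
    true_bsis: one list per replicate, the true BSIs for that replicate's test point.
    The rate is pooled over all (replicate, BSI) pairs.
    """
    hits = 0
    total = 0
    for ranked, truth in zip(rankings, true_bsis):
        top = set(ranked[:k])
        for bsi in truth:
            total += 1
            hits += bsi in top
    return hits / total if total else float("nan")


# Toy data: interactions are frozensets of (feature, sign) pairs.
a = frozenset({(0, -1), (1, +1)})
b = frozenset({(2, +1)})
c = frozenset({(3, -1)})

rankings = [[a, b, c], [c, a, b]]
true_bsis = [[a], [b]]
print(top_k_inclusion_rate(rankings, true_bsis, k=2))  # 0.5
```

Here the true BSI is ranked in the top two for the first replicate but not for the second, giving a rate of 0.5.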
Real-World Application: COMPAS Risk Scoring
The method is applied to the COMPAS dataset for violent recidivism risk, using age, number of prior offenses, felony status, ethnicity, and gender as features. LocalLSSFind reveals that, while the number of priors and age dominate globally, at the level of an individual it is specific signed features and combinations (e.g., young age combined with high priors, or African-American ethnicity) that underlie high-risk predictions.
- The local importance of age and priors is not uniformly expressed: for one individual, a moderate priors count combined with young age sharply amplifies predicted risk, demonstrating genuine local interaction detected by the method.
Figure 3: Interaction map showing pairwise signed feature interaction scores. Each point represents a signed interaction, with size reflecting DWP importance. Age–priors signed interactions dominate pairwise importance.
- Further, the relationship between the local interaction importance for specific feature pairs and predicted scores is visualized and stratified by ethnicity, highlighting how directionality and magnitude interact with sensitive attributes.
Figure 4: Relationship between local interaction importance (PII) for age–priors and predicted risk, stratified by ethnicity; individuals with matched global risk but distinct interaction contributions are highlighted.
Theoretical and Practical Implications
This work establishes, for the first time, RF-specific, prediction-level guarantees for the recovery of true local signed features and interactions. While functional decompositions (e.g., SHAP) have clear connections to model predictions, they do not, in general, agree with the actual signal structure when the goal is to recover meaningful data-generating mechanisms at the individual prediction level. In contrast, LocalLSSFind, under the LSS assumption, delivers interpretable signed patterns with provable alignment to the components of the underlying data-generating process, supporting rigorous explanation, auditing, and individualized decision support tasks.
The practical utility is multi-faceted:
- Personalized explanations: The ability to isolate locally active directions and their combinations supports personalized, actionable, and trustworthy model explanations.
- Auditability and fairness: Disentangling which features or signed interactions drive high-risk predictions aids in detecting and reasoning about potential sources of bias or disparate impact.
- Causal inference and domain insight: Even though the method does not claim causal recovery, alignment with the LSS generative paradigm provides insight into the types of interactions a model actually exploits for specific samples.
Limitations and Future Directions
The theoretical analysis is fundamentally linked to the strict LSS model. While this model mirrors real phenomena in genomics and risk scoring—where thresholded or discontinuous feature combinations are important—it may not encompass the full range of non-linear or smooth interactions encountered in other domains. Extension of the framework to more general, possibly continuous, function classes, or to robustly handle correlated/noisy features, remains open.
Further, while the methodology is currently RF-specific (exploiting the structure of decision path co-occurrence), analogous consistency principles for gradient boosting, deep forests, or non-tree ensemble models are of substantial interest. Finally, deployment at interactive scale (addressing computational complexity for massive forests or real-time requests) would enhance applicability in critical domains.
Conclusion
This work provides the first local, RF-specific, theoretically sound approach to signed feature and interaction importance. LocalLSSFind offers actionable, mathematically justified insights at the individual prediction level, enabling precision interpretability, model auditing, and advanced downstream analysis within high-dimensional, complex supervised learning pipelines.