
Model-Agnostic Counterfactual Explanations for Consequential Decisions

Published 27 May 2019 in cs.LG, cs.AI, cs.LO, and stat.ML | arXiv:1905.11190v5

Abstract: Predictive models are being increasingly used to support consequential decision making at the individual level in contexts such as pretrial bail and loan approval. As a result, there is increasing social and legal pressure to provide explanations that help the affected individuals not only to understand why a prediction was output, but also how to act to obtain a desired outcome. To this end, several works have proposed optimization-based methods to generate nearest counterfactual explanations. However, these methods are often restricted to a particular subset of models (e.g., decision trees or linear models) and differentiable distance functions. In contrast, we build on standard theory and tools from formal verification and propose a novel algorithm that solves a sequence of satisfiability problems, where both the distance function (objective) and predictive model (constraints) are represented as logic formulae. As shown by our experiments on real-world data, our algorithm is: i) model-agnostic ({non-}linear, {non-}differentiable, {non-}convex); ii) data-type-agnostic (heterogeneous features); iii) distance-agnostic ($\ell_0, \ell_1, \ell_\infty$, and combinations thereof); iv) able to generate plausible and diverse counterfactuals for any sample (i.e., 100% coverage); and v) at provably optimal distances.

Citations (291)

Summary

  • The paper presents MACE, which leverages formal verification to generate optimal counterfactual explanations with 100% coverage across various models.
  • It employs flexible distance metrics and accommodates heterogeneous feature spaces to ensure explanations remain plausible and diverse.
  • Empirical tests on datasets like Adult, Credit, and COMPAS demonstrate that MACE outperforms existing methods in delivering minimal, actionable recourse.

Insight into Model-Agnostic Counterfactual Explanations for Consequential Decisions

The paper "Model-Agnostic Counterfactual Explanations for Consequential Decisions" by Karimi et al. presents a novel approach, MACE (Model-Agnostic Counterfactual Explanations), for generating counterfactual explanations that meet the needs of individuals subjected to automated decision-making systems. Given the increasing reliance on predictive models in consequential decisions such as pretrial bail, loan approval, and hiring, providing transparent decision rationales has become critical. This work stands out by addressing the limitations of prior methods, which were confined to specific model classes and could not guarantee coverage or provide provably optimal counterfactuals.

Methodology and Core Contributions

The proposed MACE methodology employs formal verification tools, specifically satisfiability modulo theories (SMT) solvers, to ensure the robustness and reliability of counterfactuals generated across a diverse range of models and datasets. MACE distinguishes itself by several key attributes:

  • Model-Agnosticism: It operates independently of the model specifics—whether linear or nonlinear, differentiable or not—making it a versatile solution applicable to decision trees, random forests, logistic regression, and multilayer perceptrons.
  • Distance Metrics: The method is flexible with respect to distance computations, handling $\ell_0$, $\ell_1$, and $\ell_\infty$ norms and combinations thereof. Support for heterogeneous feature spaces, where inputs mix continuous, discrete, categorical, and ordinal data, is significant for real-world applicability.
  • Coverage and Optimality: MACE ensures 100% coverage, meaning it can provide a counterfactual explanation for any factual instance. It also guarantees the optimal distance for counterfactuals, ensuring minimal changes are required to flip the decision output.
  • Plausibility and Diversity: It incorporates additional constraints to maintain explanations within reasonable and actionable realms for users, thus preserving the semantic integrity of features involved (e.g., ensuring immutable features like gender are not altered). It further addresses the need for diverse counterfactuals, facilitating alternative actionable insights for end-users.
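The core search strategy can be sketched in miniature. MACE encodes the model and the distance function as logic formulae and poses a sequence of satisfiability queries to an SMT solver, tightening the distance bound until the optimum is bracketed. In the hedged sketch below, a brute-force scan over a small discretized feature grid stands in for the solver oracle, and `predict`, `FEATURE_GRID`, and the specific distance weights are illustrative assumptions, not the paper's actual encoding:

```python
import itertools

def predict(x):
    # Stand-in black-box classifier over (age_decade, income_bin, has_degree).
    # MACE treats the model purely as a constraint, so any model would do.
    age, income, degree = x
    return 1 if income + degree >= 3 else 0

FEATURE_GRID = [range(2, 8), range(0, 5), (0, 1)]  # discretized feature domains

def distance(x, y):
    # Mixed-type distance: normalized l1 on ordinal features, l0 on the binary one.
    d_age = abs(x[0] - y[0]) / 5.0
    d_inc = abs(x[1] - y[1]) / 4.0
    d_deg = float(x[2] != y[2])
    return (d_age + d_inc + d_deg) / 3.0

def sat_query(factual, target, eps):
    # One satisfiability query: "does a counterfactual exist within distance eps?"
    # (In MACE this is a single call to an SMT solver, not an enumeration.)
    for cand in itertools.product(*FEATURE_GRID):
        if predict(cand) == target and distance(factual, cand) <= eps:
            return cand
    return None

def mace_search(factual, target, tol=1e-3):
    # Binary search on the distance bound: each probe is one SAT query.
    lo, hi, best = 0.0, 1.0, None
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        found = sat_query(factual, target, mid)
        if found is not None:
            best, hi = found, mid  # feasible: tighten the bound
        else:
            lo = mid               # infeasible: relax the bound
    return best

x_factual = (3, 1, 0)                    # classified as 0
cf = mace_search(x_factual, target=1)    # nearest input classified as 1
```

The binary search converges to within `tol` of the optimal distance, which is how MACE obtains its optimality guarantee; plausibility constraints (e.g., freezing immutable features) would enter as extra conjuncts in each query.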

Empirical Validation

Extensive empirical validation on datasets pertinent to consequential decision-making (e.g., Adult, Credit, and COMPAS datasets) showcases MACE's superior performance. Compared to extant techniques such as Minimum Observable (MO), Feature Tweaking (FT), and Actionable Recourse (AR), MACE consistently achieved complete coverage with significantly closer counterfactuals, thus reducing the cognitive and logistical burden on individuals seeking to alter decision outcomes.
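The two quantities compared in these experiments, coverage and counterfactual distance, are straightforward to measure for any explainer. A minimal sketch, where `evaluate`, the toy generator, and the 1-D distance are illustrative assumptions rather than the paper's evaluation code:

```python
def evaluate(factuals, generate, distance):
    # Coverage: fraction of factual inputs for which a counterfactual is found.
    # Mean distance: average distance of the counterfactuals that were found.
    pairs = [(x, generate(x)) for x in factuals]
    hits = [(x, cf) for x, cf in pairs if cf is not None]
    coverage = len(hits) / len(factuals)
    mean_dist = (sum(distance(x, cf) for x, cf in hits) / len(hits)
                 if hits else float("nan"))
    return coverage, mean_dist

# Toy 1-D generator that "fails" at x == 0, illustrating partial coverage.
gen = lambda x: -x if x != 0 else None
cov, md = evaluate([-2, -1, 0, 1, 2], gen, lambda a, b: abs(a - b))
```

Under this scoring, MACE's reported advantage is a coverage of 1.0 at a lower mean distance than the baselines.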

Implications and Future Directions

The versatility and robustness of MACE have important implications for fairness-aware machine learning and model interpretability. By providing transparent decision rationales, the framework could significantly influence legal and regulatory frameworks, such as the EU General Data Protection Regulation (GDPR), which advocates a right to explanation.

From a theoretical perspective, the method's reliance on formal verification tools bridges a critical gap between model interpretability and program verification, suggesting a fertile area for further research. Future work may include enhancing scalability for more complex models, extending support to multi-class classification and regression paradigms, and exploring richer notions of plausibility and diversity to enhance interpretability further.

In conclusion, MACE represents a significant step towards transparent and legally compliant automated decision-making systems, embedding the ethical and societal considerations central to the deployment of consequential AI systems. This paper provides a robust foundation for researchers and practitioners interested in developing fairer and more interpretable models.
