Counterfactual Upper-Bound Analysis
- Counterfactual upper-bound analysis is a method to determine the maximum effect of interventions in partially identified models by characterizing sharp bounds under uncertainty.
- It utilizes exact inference via credal networks and approximate causal EM algorithms to efficiently optimize counterfactual probabilities.
- This approach is pivotal in applications like palliative care, offering actionable insights when data or model uncertainty precludes precise counterfactual estimation.
Counterfactual upper-bound analysis refers to a rigorous methodology for computing the maximal values that counterfactual probabilities, expectations, or functional effects can attain, given uncertainty or partial identification in the underlying structural model, data, or estimation procedure. This paradigm is central in causal inference—whether in structural causal models (SCMs), econometric identification, bandit optimization, or algorithmic recourse—whenever the exact value of a counterfactual is not point-identified by the available data and/or assumptions, but well-defined bounds can be sharply characterized. The formulation, computation, and interpretation of such upper bounds depend crucially on the structure of the model (e.g., graph topology, unobserved variables), the type of information available (e.g., observational, interventional, or mixed datasets), and the application context, as detailed below.
1. Foundations: Partially Identified Counterfactuals in Structural Causal Models
Let $M$ be a structural causal model over discrete random variables partitioned into exogenous variables $U$ and endogenous variables $V$, with deterministic structural equations and observational data $\mathcal{D}$ on $V$. A counterfactual quantity is typically of the form
$$P(Y_{x} = y \mid e),$$
where $x$ is an intervention, $Y = y$ is the event of interest, and $e$ is evidence. In general, this counterfactual is only partially identifiable: it depends on the unobserved distribution $P(U)$, so only the set of all SCMs that are compatible with $\mathcal{D}$, denoted $\mathcal{K}$, is known. The counterfactual upper bound is then
$$\overline{P}(Y_{x} = y \mid e) = \max_{M' \in \mathcal{K}} P_{M'}(Y_{x} = y \mid e),$$
and the lower bound is the analogous minimum. These bounds summarize the identifiability of the counterfactual in the absence of full knowledge of $P(U)$ (Zaffalon et al., 2023).
2. Computational Methodologies: From SCMs to Credal Networks to Causal EM
Exact Inference via Credal Networks
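For SCMs with known structural equations and sufficient observed marginals (the Markovian or quasi-Markovian cases), there exists a mapping to an equivalent credal network. The distributional constraints induced by the data $\mathcal{D}$ are translated into (possibly degenerate) credal sets for the exogenous variables. In the Markovian case, the mapping is as follows: for each exogenous variable $U$, let $X$ be its unique endogenous child, and let $\mathrm{Pa}_X$ be the remaining parents of $X$. One enforces
$$\sum_{u} P(x \mid u, \mathrm{pa}_X)\, P(u) = P(x \mid \mathrm{pa}_X) \quad \text{for all } x, \mathrm{pa}_X,$$
where $P(x \mid u, \mathrm{pa}_X) \in \{0, 1\}$ encodes the structural equation of $X$ and the right-hand side is the conditional distribution observed in the data.
This results in a feasible set $\mathcal{K}(U)$ for $P(U)$: all distributions matching the observed marginals through the deterministic mechanisms.
Given a counterfactual functional $f$, the upper bound is
$$\max_{P(U) \in \mathcal{K}(U)} f\big(P(U)\big).$$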
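As a minimal sketch of this exact optimization in the simplest Markovian case, consider a hypothetical binary model X -> Y in which the exogenous parent of Y has four states, one per deterministic response of Y to X. The function name `pns_bounds` and the inputs `q0`, `q1` (standing for the observed P(Y=1 | X=0) and P(Y=1 | X=1)) are illustrative assumptions, not notation from the source. The counterfactual used here, the probability of necessity and sufficiency, is linear in the exogenous distribution, so its extrema lie at vertices of the feasible set; the toy example enumerates them by hand, whereas a larger model would need a linear-programming or credal-network solver.

```python
from fractions import Fraction

def pns_bounds(q0, q1):
    """Sharp PNS bounds for a toy Markovian model X -> Y (both binary).

    U_Y has four states, one per deterministic mechanism:
      0: Y = 0,  1: Y = X,  2: Y = 1 - X,  3: Y = 1.
    Marginal constraints on p = P(U_Y):
      p[2] + p[3] = q0   (observed P(Y=1 | X=0))
      p[1] + p[3] = q1   (observed P(Y=1 | X=1))
      sum(p)      = 1
    The feasible set is a segment parameterised by t = p[3].
    """
    q0, q1 = Fraction(q0), Fraction(q1)

    def p_of(t):
        return [1 - q0 - q1 + t, q1 - t, q0 - t, t]

    # A vertex of the segment is a point where some coordinate hits zero.
    candidates = {Fraction(0), q0, q1, q0 + q1 - 1}
    vertices = [p_of(t) for t in candidates if min(p_of(t)) >= 0]
    if not vertices:
        raise ValueError("no exogenous distribution matches the marginals")

    # PNS = P(Y_{X=1} = 1, Y_{X=0} = 0) = P(U_Y = 1) = p[1].
    values = [p[1] for p in vertices]
    return min(values), max(values)

lower, upper = pns_bounds(Fraction(3, 10), Fraction(4, 5))
print(lower, upper)  # prints "1/2 7/10"
```

In this toy case the result coincides with the closed-form interval max(0, q1 - q0) to min(q1, 1 - q0), which gives a quick sanity check on the enumeration.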
Complexity and Inference Hardness
Computing such bounds is NP-hard even in polytree SCMs. This follows from the known NP-hardness of inference in credal networks with polytree structure, as every such network corresponds to an SCM of the same structure (Zaffalon et al., 2023).
Approximate Inference via Causal EM
When exact computation is infeasible, a causal EM algorithm (EMCC) is applied:
- E-step: For each exogenous variable $U$ and each record in the data $\mathcal{D}$, compute the posterior of $U$ given that record under the current estimate of $P(U)$.
- M-step: Update $P(U)$ by averaging these posteriors over $\mathcal{D}$.
This scheme is iterated to stationarity; multiple random restarts sample extreme points of the feasible set of exogenous distributions, and for each solution, the counterfactual is evaluated. The reported upper bound is the maximum value across EM solutions. The accuracy of the approximation is quantified via credible intervals explicitly derived as functions of the empirical range and number of restarts (Zaffalon et al., 2023).
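The E-step and M-step can be sketched on a hypothetical toy model X -> Y with binary variables, where the exogenous parent of Y has four states (one per deterministic mechanism). This is an illustrative sketch under those assumptions, not the EMCC implementation of the cited paper; the names `MECHANISMS`, `causal_em`, `em_pns_range`, and the example counts are invented for the illustration.

```python
import random

# Deterministic mechanisms of the toy model: state u of U_Y maps x to y.
MECHANISMS = [lambda x: 0, lambda x: x, lambda x: 1 - x, lambda x: 1]

def causal_em(counts, theta, iterations=200):
    """Run EM for P(U_Y) from the start `theta`; counts[(x, y)] is a frequency."""
    n = sum(counts.values())
    for _ in range(iterations):
        new_theta = [0.0] * len(theta)
        for (x, y), c in counts.items():
            # E-step: posterior over u given the record (x, y); a mechanism
            # gets zero weight if it cannot produce y from x.
            weights = [t if f(x) == y else 0.0
                       for t, f in zip(theta, MECHANISMS)]
            z = sum(weights)
            # M-step (accumulated): average the posteriors over all records.
            for u, w in enumerate(weights):
                new_theta[u] += c * w / (z * n)
        theta = new_theta
    return theta

def em_pns_range(counts, restarts=50, seed=0):
    """Range of PNS values over EM solutions from random starting points."""
    rng = random.Random(seed)
    values = []
    for _ in range(restarts):
        raw = [rng.random() + 1e-9 for _ in MECHANISMS]  # strictly positive
        theta = causal_em(counts, [r / sum(raw) for r in raw])
        values.append(theta[1])  # PNS = P(U_Y = 1) in this toy model
    return min(values), max(values)

# Hypothetical data: P(Y=1 | X=0) = 0.3 and P(Y=1 | X=1) = 0.8.
counts = {(0, 0): 35, (0, 1): 15, (1, 0): 10, (1, 1): 40}
print(em_pns_range(counts))
```

For these counts the exact PNS interval is [0.5, 0.7], and the range returned by the restarts is an inner approximation of it: each EM solution is a feasible exogenous distribution, so the reported range can only widen toward the true bounds as restarts are added.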
3. Theoretical Guarantees and Properties
- Feasibility and Likelihood (M-compatibility): The feasible set is non-empty if and only if the log-likelihood of the data $\mathcal{D}$ reaches its global maximum, i.e., the value attained by the empirical distribution, at some exogenous distribution $P(U)$. This check is essential: if the constraints from $M$ and $\mathcal{D}$ are incompatible, the counterfactual bound is ill-posed.
- Sharpness: The computed bounds are tight: no value outside $[a, b]$ is consistent with the model and the data. For credal network representations, the set of compatible Bayesian networks is precisely the set of fully specified SCMs matched to the data (Zaffalon et al., 2023).
- Approximation Accuracy: The relative RMSE (RRMSE) decays rapidly as the number of EM restarts increases on synthetic benchmarks, and the method scales to models with up to 19 nodes and exogenous variables of cardinality up to 256.
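One way to sketch the compatibility check, assuming the likelihood criterion is read as "the fitted model attains the log-likelihood of the empirical conditionals", is to compare the EM-maximized log-likelihood with the saturated one. The example uses a hypothetical binary model X -> Y whose exogenous states are given as tuples `(f(0), f(1))`; the function names and the tolerance are illustrative assumptions.

```python
import math

def max_loglik_via_em(counts, mechanisms, iterations=500):
    """EM estimate of the maximum log-likelihood over P(U_Y)."""
    k = len(mechanisms)
    theta = [1.0 / k] * k
    n = sum(counts.values())
    for _ in range(iterations):
        new_theta = [0.0] * k
        for (x, y), c in counts.items():
            if c == 0:
                continue
            w = [t if f[x] == y else 0.0 for t, f in zip(theta, mechanisms)]
            z = sum(w)
            if z == 0.0:
                return -math.inf  # an observed record cannot be generated
            for u in range(k):
                new_theta[u] += c * w[u] / (z * n)
        theta = new_theta
    loglik = 0.0
    for (x, y), c in counts.items():
        if c > 0:
            p = sum(t for t, f in zip(theta, mechanisms) if f[x] == y)
            loglik += c * math.log(p)
    return loglik

def saturated_loglik(counts):
    """Log-likelihood of the empirical conditionals P(y | x)."""
    n_x = {}
    for (x, _), c in counts.items():
        n_x[x] = n_x.get(x, 0) + c
    return sum(c * math.log(c / n_x[x])
               for (x, _), c in counts.items() if c > 0)

def is_m_compatible(counts, mechanisms, tol=1e-6):
    """True if some P(U_Y) reproduces the empirical conditionals."""
    gap = saturated_loglik(counts) - max_loglik_via_em(counts, mechanisms)
    return gap <= tol

counts = {(0, 0): 35, (0, 1): 15, (1, 0): 10, (1, 1): 40}
canonical = [(0, 0), (0, 1), (1, 0), (1, 1)]   # all four mechanisms
restricted = [(0, 1), (1, 0)]                  # only Y = X and Y = 1 - X
print(is_m_compatible(counts, canonical))   # prints True
print(is_m_compatible(counts, restricted))  # prints False
```

The restricted model forces P(Y=1 | X=0) + P(Y=1 | X=1) = 1, which these counts violate, so any interval computed from it would not correspond to a valid SCM.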
4. Limitations and Model Requirements
- Necessity of Structural Equations: Knowledge of the explicit structural equations is required. Attempts to compute bounds solely from the graphical structure or the observed marginals may yield invalid intervals if the data are incompatible with a plausible model $M$ (Zaffalon et al., 2023).
- Dependence on Data Compatibility: M-compatibility must be checked; otherwise, the computational procedure can yield spurious “bounds” that do not correspond to any SCM.
- Canonical Specification and Bound Tightness: Using the "canonical" specification of $M$ (in which the exogenous states encode all deterministic mechanisms) guarantees M-compatibility in the large-data limit, but the resulting bounds may be conservative, i.e., wider than necessary.
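To make the canonical specification concrete, the sketch below enumerates every deterministic function from the configurations of a variable's endogenous parents to its own states; each such function is one state of the canonical exogenous variable. The function name `canonical_mechanisms` and its arguments are assumptions made for the illustration.

```python
from itertools import product

def canonical_mechanisms(parent_cards, child_card):
    """List every deterministic map from parent configurations to child states.

    parent_cards: cardinalities of the endogenous parents.
    child_card:   cardinality of the child variable.
    Each mechanism is a dict {parent_configuration: child_state}.
    """
    parent_configs = list(product(*(range(c) for c in parent_cards)))
    return [dict(zip(parent_configs, outputs))
            for outputs in product(range(child_card),
                                   repeat=len(parent_configs))]

# A binary child with one binary parent has 2**2 = 4 mechanisms.
for mechanism in canonical_mechanisms([2], 2):
    print(mechanism)

# The count grows as child_card ** (number of parent configurations).
print(len(canonical_mechanisms([2, 2], 2)))  # prints 16
```

The exogenous cardinality grows exponentially in the number of parent configurations (a binary child with three binary parents already needs 2^8 = 256 states), which is one practical cost of the canonical choice.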
5. Empirical Applications: Case Study in Palliative Care
A practical application is presented for modeling place of death of terminal cancer patients:
- Graph includes patient/family awareness, home-assistance (“Triangolo”), symptoms, age, performance, and more.
- Interventions on home-assistance and awareness are considered, with home death as the effect variable.
- EMCC is run with multiple restarts and about $500$ iterations per run.
- Resulting probability of necessity and sufficiency (PNS) intervals:
- Triangolo:
- Patient awareness:
- Family awareness:
- These indicate that expanding home-assistance has a much stronger counterfactual effect than communication-based interventions (Zaffalon et al., 2023).
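For reference, the probability of necessity and sufficiency for a binary treatment $X$ and a binary outcome $Y$ is commonly defined, following Pearl, as the probability that the outcome responds to the treatment in both directions. The symbols below are generic and are not taken from the palliative care model; the interval shown is the standard one when only the two interventional probabilities are known.

```latex
\mathrm{PNS} = P\big(Y_{X=1} = 1,\; Y_{X=0} = 0\big),
\qquad
\max\{0,\; P(Y_{X=1} = 1) - P(Y_{X=0} = 1)\}
\;\le\; \mathrm{PNS} \;\le\;
\min\{P(Y_{X=1} = 1),\; P(Y_{X=0} = 0)\}.
```

A wide PNS interval therefore signals weak identification, while an interval concentrated near 1 indicates that the intervention would change the outcome for most units.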
6. Algorithmic and Practical Considerations
| Step | Approach | Remarks |
|---|---|---|
| Feasibility | Map to credal network / linear constraints | Markovian/quasi-Markovian distinction matters |
| Upper bound | Optimize the counterfactual over the feasible set of $P(U)$ | Linear/nonlinear programming |
| Approximation | Multiple causal EM runs sampling the feasible set | Range approximates bound |
| Quality control | RRMSE, credible intervals on [a, b] | Theoretical interval based on Beta model |
While exact algorithms are infeasible for large discrete models due to NP-hardness, the EM approach provides scalable and practical approximations with quantifiable error. The range of counterfactual estimates over multiple EM runs closely matches true bounds in synthetic and real datasets, such as the palliative care study.
7. Context and Broader Significance
Counterfactual upper-bound analysis generalizes naturally across distinct inferential paradigms: from linear programming for deterministic SCMs (Balke et al., 2013), to convex optimization in moment or $\phi$-divergence neighborhoods (Christensen et al., 2019), to robust causal inference in empirical games with partially identified equilibria (Kline et al., 2024). The paradigm is essential wherever partial identifiability or model uncertainty renders point estimation of counterfactuals ill-posed. Within the core SCM literature, the sharpness, computational feasibility, and data requirements for such bounds remain crucial theoretical and methodological concerns (Zaffalon et al., 2023).