- The paper introduces a novel estimator—CFFE—that uses node-level residualization to control for unit and time fixed effects in panel data.
- Simulation results demonstrate robust performance with an RMSE of 0.3776 and a CATE correlation of 0.9338, outperforming standard causal forests.
- The causalfe package adopts scikit-learn conventions for fitting and prediction, enabling practical nonparametric policy evaluation and subgroup analysis.
Causal Forests with Fixed Effects in Python: Implementation and Evaluation
Introduction
The "causalfe: Causal Forests with Fixed Effects in Python" (2601.10555) paper introduces a Python package—causalfe—implementing Causal Forests with Fixed Effects (CFFE) for estimating heterogeneous treatment effects in panel data scenarios. The core statistical innovation addresses persistent confounding resulting from unit and time fixed effects, which cause standard causal forests to attribute spurious heterogeneity to observed covariates. By introducing node-level orthogonalization, the CFFE estimator enables nonparametric identification of conditional average treatment effects (CATE) while controlling for nuisance parameters intrinsic to such panel structures.
Methodological Contributions
Panel Data and the Problem of Fixed Effects
The paper formalizes the panel data model as $Y_{it} = \alpha_i + \gamma_t + \tau(X_{it}) D_{it} + \varepsilon_{it}$, where $\alpha_i$ and $\gamma_t$ are unit and time fixed effects, $D_{it}$ is a treatment indicator, and $\tau(X_{it})$ is the covariate-dependent treatment effect. In common difference-in-differences (DiD) applications, naive causal forest approaches induce bias: splits may operate on covariates correlated with fixed effects, inflating the apparent heterogeneity in CATE estimates. Global residualization fails because differences in fixed effects persist locally within tree leaves.
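To see why the fixed effects matter, the sketch below simulates a small panel from this model and compares a naive treated-versus-untreated difference in means with the true average effect among treated observations. The design is an illustrative assumption rather than the paper's simulation: the unit effect is set to twice a unit-level covariate `x1`, the time effect grows by 0.5 per period, and units with `x1 > 0` are treated in the second half of the panel.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods = 200, 6

# Unit-level covariate; the unit fixed effect is correlated with it (assumption).
x1 = rng.normal(size=n_units)
alpha = 2.0 * x1                      # unit fixed effects alpha_i
gamma = 0.5 * np.arange(n_periods)    # time fixed effects gamma_t

# Long-format panel indices: one row per (unit, period).
unit = np.repeat(np.arange(n_units), n_periods)
time = np.tile(np.arange(n_periods), n_units)

# Units with x1 > 0 are treated in the second half of the panel (assumption).
d = ((x1[unit] > 0) & (time >= n_periods // 2)).astype(float)
tau = x1[unit]                        # heterogeneous effect tau(x) = x1
noise = rng.normal(scale=0.5, size=unit.size)
y = alpha[unit] + gamma[time] + tau * d + noise

# True average effect among treated observations vs. a naive comparison.
true_att = tau[d == 1].mean()
naive_diff = y[d == 1].mean() - y[d == 0].mean()

print(f"true average effect among treated: {true_att:.3f}")
print(f"naive difference in means:         {naive_diff:.3f}")
```

The naive difference overstates the effect because treated observations belong to units with larger unit effects and fall in later periods, so both fixed effects load onto the comparison.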
Causal Forests with Node-Level Residualization
CFFE circumvents these issues by implementing residualization of fixed effects within every tree node during both splitting and estimation. Using an alternating demeaning algorithm, each node computes unit- and time-demeaned outcomes and treatments. Splits are selected to maximize the squared difference in node-level treatment effects, with partition balancing enforced. Treatment effect estimation inside each node leverages an IV-style approach:
$$\hat{\tau}_L = \frac{\sum_{(i,t) \in L} \tilde{D}_{it} \tilde{Y}_{it}}{\sum_{(i,t) \in L} \tilde{D}_{it}^{2}}$$
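A minimal NumPy sketch of this node-level computation: alternating unit and time demeaning on the observations that fall in a node, followed by the ratio estimator. The function names (`group_demean`, `demean_two_way`, `node_tau`), the convergence tolerance, and the toy data are assumptions for illustration and are not taken from the causalfe source.

```python
import numpy as np

def group_demean(v, codes):
    """Subtract the group mean from each entry; codes are compact integer labels."""
    sums = np.bincount(codes, weights=v)
    counts = np.bincount(codes)
    return v - (sums / counts)[codes]

def demean_two_way(v, unit, time, max_iter=1000, tol=1e-12):
    """Alternate unit and time demeaning until the values stop changing."""
    _, u = np.unique(unit, return_inverse=True)   # relabel units as 0..k-1
    _, t = np.unique(time, return_inverse=True)   # relabel periods as 0..m-1
    out = np.asarray(v, dtype=float).copy()
    for _ in range(max_iter):
        prev = out
        out = group_demean(group_demean(out, u), t)
        if np.max(np.abs(out - prev)) < tol:
            break
    return out

def node_tau(y, d, unit, time):
    """Ratio estimator on demeaned outcome and treatment within one node."""
    y_t = demean_two_way(y, unit, time)
    d_t = demean_two_way(d, unit, time)
    denom = np.sum(d_t ** 2)
    if denom < 1e-12:          # no residual treatment variation in this node
        return float("nan")
    return float(np.sum(d_t * y_t) / denom)

# Toy check: noise-free panel with a constant effect of 2 and staggered adoption.
rng = np.random.default_rng(0)
n_units, n_periods = 30, 6
unit = np.repeat(np.arange(n_units), n_periods)
time = np.tile(np.arange(n_periods), n_units)
alpha = rng.normal(size=n_units)
gamma = rng.normal(size=n_periods)
adopt = rng.choice([2, 3, 4, 99], size=n_units)   # 99 means never treated
d = (time >= adopt[unit]).astype(float)
y = alpha[unit] + gamma[time] + 2.0 * d

# A "node" is a subset of observations, so the panel inside it is unbalanced.
keep = rng.random(unit.size) < 0.8
tau_hat = node_tau(y[keep], d[keep], unit[keep], time[keep])
print(f"node estimate: {tau_hat:.6f}")
```

Iterating matters because a node generally contains an unbalanced subset of the panel, where a single pass of unit and time demeaning does not remove both sets of fixed effects exactly.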
Honest estimation—separating structure and estimation samples at the unit level—ensures valid inferential properties. Cluster-aware subsampling accommodates the panel structure for both splitting and variance estimation.
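One way to picture unit-level honesty with cluster-aware subsampling is to draw whole units, not individual rows, and then assign each drawn unit to either the structure sample or the estimation sample. The helper below is a hypothetical sketch under that reading; `honest_unit_split` and its `subsample_frac` parameter are illustrative names, not part of the causalfe API.

```python
import numpy as np

def honest_unit_split(unit_ids, subsample_frac=0.5, rng=None):
    """Subsample units, then split them into structure and estimation halves.

    Returns two boolean masks over observations. All rows of a unit end up
    in the same sample, so no unit informs both splitting and estimation.
    """
    rng = np.random.default_rng() if rng is None else rng
    units = np.unique(unit_ids)
    n_draw = min(units.size, max(2, int(round(subsample_frac * units.size))))
    drawn = rng.choice(units, size=n_draw, replace=False)  # random order
    half = n_draw // 2
    structure_units, estimation_units = drawn[:half], drawn[half:]
    structure_mask = np.isin(unit_ids, structure_units)
    estimation_mask = np.isin(unit_ids, estimation_units)
    return structure_mask, estimation_mask

# Example: 10 units observed for 4 periods each, 80% of units subsampled.
unit_ids = np.repeat(np.arange(10), 4)
s_mask, e_mask = honest_unit_split(unit_ids, 0.8, np.random.default_rng(1))
print("structure units: ", np.unique(unit_ids[s_mask]))
print("estimation units:", np.unique(unit_ids[e_mask]))
```

Splitting by unit keeps each unit's full time series together, which the within-node unit demeaning relies on.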
Implementation: CFFEForest
The causalfe Python package provides a CFFEForest class with familiar scikit-learn conventions. Its fit method ingests covariates, outcomes, treatments, and panel identifiers; the predict and predict_interval methods deliver CATE predictions and associated confidence intervals. The half-sample variance approach from [Wager & Athey, 2018] is adopted, with caveats regarding conservative inference.
Simulation Results: Recovery and Identification
Simulated panel data from heterogeneous DiD settings with $N = 200$ units, $T = 6$ periods, and treatment effects $\tau(x) = x_1$ illustrate estimator accuracy. The CFFE model achieves an RMSE of 0.3776 and a CATE correlation of 0.9338 with the true effects, indicating that it recovers the heterogeneity pattern well.
Figure 1: CFFE estimation results. Panel (a) shows the distribution of estimated CATEs (blue) overlaid with true effects (coral). Panel (b) plots estimated against true CATEs; the dashed line indicates perfect prediction.
Comparative Analysis: Standard versus CFFE Forests
When fixed effects are correlated with covariates, standard causal forests produce upward-biased CATE estimates, visible as a systematic deviation from the identity line. CFFE, through node-level residualization, reduces this bias and attains a lower RMSE even though the two methods rank units comparably.
Figure 2: Estimated versus true CATEs for CFFE (left) and standard causal forest (right) when fixed effects are correlated with covariates. The dashed line indicates perfect prediction. CFFE produces less biased estimates by residualizing fixed effects within each tree node.
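The two evaluation metrics respond differently to bias, which is why a method can rank units well and still have a high RMSE. The sketch below uses stand-in estimates (true effects plus noise, with and without a constant upward shift of 0.5); these are illustrative numbers, not output from either forest.

```python
import numpy as np

def rmse(estimate, truth):
    """Root mean squared error between estimated and true effects."""
    diff = np.asarray(estimate) - np.asarray(truth)
    return float(np.sqrt(np.mean(diff ** 2)))

def cate_correlation(estimate, truth):
    """Pearson correlation between estimated and true effects."""
    return float(np.corrcoef(estimate, truth)[0, 1])

rng = np.random.default_rng(0)
tau_true = rng.normal(size=500)
noise = rng.normal(scale=0.3, size=500)

est_unbiased = tau_true + noise         # centered on the truth
est_shifted = tau_true + noise + 0.5    # same ranking, constant upward bias

for name, est in [("unbiased", est_unbiased), ("shifted", est_shifted)]:
    print(f"{name:9s} rmse={rmse(est, tau_true):.3f} "
          f"corr={cate_correlation(est, tau_true):.3f}")
```

Correlation is unchanged by a constant shift, while RMSE grows with it, so the two metrics together separate ranking ability from level bias.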
Monte Carlo DGPs: Validity Assessment
Across placebo, homogeneous, and heterogeneous DGPs, mean treatment effect estimates show minimal bias for CFFE. Confidence interval coverage rates, however, are consistently below the nominal 95% level, so intervals from the half-sample estimator should be treated as approximate.
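Coverage in a Monte Carlo study is the share of replications whose interval contains the true effect. The stylized sketch below assumes normally distributed estimates with a known standard error, which is not the paper's DGP, and shows how coverage responds when the reported standard error is half the true one.

```python
import numpy as np

def coverage(lower, upper, truth):
    """Fraction of intervals [lower, upper] that contain the true value."""
    lower, upper = np.asarray(lower), np.asarray(upper)
    return float(np.mean((lower <= truth) & (truth <= upper)))

rng = np.random.default_rng(0)
true_effect, true_se, n_reps = 1.0, 0.2, 10_000

# One point estimate per Monte Carlo replication (stylized assumption).
estimates = rng.normal(loc=true_effect, scale=true_se, size=n_reps)

z = 1.96  # normal critical value for a nominal 95% interval

# Intervals built with the correct standard error.
cov_correct = coverage(estimates - z * true_se, estimates + z * true_se, true_effect)

# Intervals built with a standard error that is half the true one.
reported_se = 0.5 * true_se
cov_narrow = coverage(estimates - z * reported_se, estimates + z * reported_se, true_effect)

print(f"coverage with correct SE:     {cov_correct:.3f}")
print(f"coverage with understated SE: {cov_narrow:.3f}")
```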
Empirical Application: Minimum Wage Effects
Applied to county-level panel data on minimum wage policy, following [Callaway & Sant’Anna, 2021], CFFE yields a point estimate of −0.042, in line with the TWFE and group-time ATT results. CFFE additionally reveals cross-sectional heterogeneity, with CATEs ranging from −0.10 to +0.15 among treated counties.
Figure 3: Minimum wage application results. Left: Event study showing ATT by years relative to treatment. Right: Distribution of estimated CATEs for treated observations.
Implications and Future Directions
The presented methodology and software provide a rigorous nonparametric tool for panel-based causal inference, supporting fine-grained policy evaluation and subgroup analysis in economics and social sciences. The CFFE approach is especially pertinent for settings with complicated fixed effect structures, staggered treatment adoption, and substantial cross-sectional variation.
Practical limitations reside in the conservative nature of the half-sample variance approach, computational overhead for large panels, and the current lack of bootstrap or parallelization support. Methodological extensions may include improved variance estimation schemes, support for unbalanced panels, and distributed tree construction.
Theoretically, node-level orthogonalization within randomized forests may inspire related strategies for nuisance parameter control in other nonparametric causal machine learning architectures.
Conclusion
The causalfe Python package operationalizes node-level residualization via Causal Forests with Fixed Effects, equipping applied panel data research with robust, nonparametric heterogeneous treatment effect estimation. Empirical and simulation evidence confirm superior bias properties relative to standard causal forests. Given growing demand for flexible causal inference in complex policy datasets, further refinements in uncertainty quantification and scalability remain important future directions.