Double Cross-fit Doubly Robust Estimators: Beyond Series Regression
Abstract: Doubly robust estimators with cross-fitting have gained popularity in causal inference due to their favorable structure-agnostic error guarantees. However, when additional structure, such as Hölder smoothness, is available, more accurate "double cross-fit doubly robust" (DCDR) estimators can be constructed by splitting the training data and undersmoothing nuisance function estimators on independent samples. We study a DCDR estimator of the Expected Conditional Covariance, a functional of interest in causal inference and conditional independence testing. We first provide a structure-agnostic error analysis for the DCDR estimator with no assumptions on the nuisance functions or their estimators. Then, assuming the nuisance functions are Hölder smooth, but without assuming knowledge of the true smoothness level or the covariate density, we establish that DCDR estimators with several linear smoothers are $\sqrt{n}$-consistent and asymptotically normal under minimal conditions and achieve fast convergence rates in the non-$\sqrt{n}$ regime. When the covariate density and smoothnesses are known, we propose a minimax rate-optimal DCDR estimator based on undersmoothed kernel regression. Moreover, we show that an undersmoothed DCDR estimator satisfies a slower-than-$\sqrt{n}$ central limit theorem, and that inference is possible even in the non-$\sqrt{n}$ regime. Finally, we support our theoretical results with simulations, providing intuition for double cross-fitting and undersmoothing, demonstrating where our estimator achieves $\sqrt{n}$-consistency while the usual "single cross-fit" estimator fails, and illustrating asymptotic normality for the undersmoothed DCDR estimator.
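To make the idea of double cross-fitting concrete, here is a minimal sketch, not the authors' implementation. It assumes the Expected Conditional Covariance is $\psi = E[\mathrm{Cov}(A, Y \mid X)] = E[(A - \pi(X))(Y - \mu(X))]$ with $\pi(x) = E[A \mid X = x]$ and $\mu(x) = E[Y \mid X = x]$. It also assumes a one-dimensional covariate, a Gaussian-kernel Nadaraya-Watson smoother, and a hand-picked small bandwidth standing in for undersmoothing. The names `nw_regress`, `dcdr_ecc`, and `simulate`, and the simulated data-generating process, are hypothetical and chosen only for illustration.

```python
import numpy as np


def nw_regress(x_train, y_train, x_eval, bandwidth):
    """Nadaraya-Watson regression with a Gaussian kernel (1-D covariate)."""
    diffs = (x_eval[:, None] - x_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs ** 2)
    denom = weights.sum(axis=1)
    # Avoid division by zero if every kernel weight underflows.
    denom = np.where(denom > 0, denom, 1.0)
    return (weights @ y_train) / denom


def dcdr_ecc(x, a, y, bandwidth, rng):
    """Single-split double cross-fit estimate of E[Cov(A, Y | X)].

    The two nuisance regressions are fit on two separate folds, and the
    product of residuals is averaged on a third, independent fold.
    """
    idx = rng.permutation(len(x))
    fold_pi, fold_mu, fold_est = np.array_split(idx, 3)
    pi_hat = nw_regress(x[fold_pi], a[fold_pi], x[fold_est], bandwidth)
    mu_hat = nw_regress(x[fold_mu], y[fold_mu], x[fold_est], bandwidth)
    return np.mean((a[fold_est] - pi_hat) * (y[fold_est] - mu_hat))


def simulate(n, rng):
    """Hypothetical data with Cov(A, Y | X) = 0.5 for every X."""
    x = rng.uniform(0.0, 1.0, size=n)
    e1 = rng.normal(size=n)
    e2 = rng.normal(size=n)
    a = np.sin(2 * np.pi * x) + e1
    y = np.cos(2 * np.pi * x) + 0.5 * e1 + e2
    return x, a, y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, a, y = simulate(3000, rng)
    # A deliberately small bandwidth stands in for undersmoothing.
    print(dcdr_ecc(x, a, y, bandwidth=0.05, rng=rng))
```

In this sketch the true value of the functional is 0.5 by construction. A fuller version would typically rotate the roles of the three folds and average the resulting estimates. The usual "single cross-fit" estimator would instead fit both regressions on the same training fold.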