Two-stage Estimators for Spatial Confounding with Point-Referenced Data
Abstract: Public health data are often spatially dependent, but standard spatial regression methods can suffer from bias and invalid inference when the independent variable is associated with spatially correlated residuals. This could occur if, for example, an unmeasured environmental contaminant is associated with both the independent and outcome variables in a spatial regression analysis. Geoadditive structural equation modeling (gSEM), in which an estimated spatial trend is removed from both the explanatory and response variables before estimating the parameters of interest, has previously been proposed as a solution, but there has been little investigation of gSEM's properties with point-referenced data. We link gSEM to results on double machine learning and semiparametric regression based on two-stage procedures. We propose using these semiparametric estimators for spatial regression, with Gaussian processes with Matérn covariance estimating the spatial trends, and term this class of estimators Double Spatial Regression (DSR). We derive regularity conditions for root-$n$ asymptotic normality and consistency and closed-form variance estimation, and show that in simulations where standard spatial regression estimators are highly biased and have poor coverage, DSR can mitigate bias more effectively than competitors and obtain nominal coverage.
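The two-stage idea described in the abstract (remove an estimated spatial trend from both the explanatory and response variables, then regress residual on residual) can be illustrated with a minimal sketch. This is not the authors' implementation: to stay dependency-free it swaps in a leave-one-out Gaussian-kernel (Nadaraya-Watson) smoother for the Matérn Gaussian process, and the standard error is a generic heteroskedasticity-robust sandwich formula rather than the paper's variance estimator. The function names, the bandwidth, and the simulated data (an unmeasured confounder `u` that varies smoothly over the unit square, with true coefficient 1) are all assumptions made for illustration.

```python
import math
import random


def loo_kernel_smooth(coords, values, bandwidth):
    """Leave-one-out Nadaraya-Watson estimate of the spatial trend at each site.

    Stand-in for the Matern Gaussian process named in the abstract (assumption).
    """
    fitted = []
    for i, (si1, si2) in enumerate(coords):
        num = den = 0.0
        for j, (sj1, sj2) in enumerate(coords):
            if i == j:
                continue  # leave the site's own observation out
            d2 = (si1 - sj1) ** 2 + (si2 - sj2) ** 2
            w = math.exp(-0.5 * d2 / bandwidth ** 2)
            num += w * values[j]
            den += w
        fitted.append(num / den)
    return fitted


def ols_slope(x, y):
    """Slope from ordinary least squares of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def two_stage_estimate(coords, x, y, bandwidth):
    """Stage 1: detrend x and y over space. Stage 2: residual-on-residual slope."""
    x_res = [a - t for a, t in zip(x, loo_kernel_smooth(coords, x, bandwidth))]
    y_res = [b - t for b, t in zip(y, loo_kernel_smooth(coords, y, bandwidth))]
    sxx = sum(r * r for r in x_res)
    beta = sum(a * b for a, b in zip(x_res, y_res)) / sxx
    # Generic sandwich standard error (assumption, not the paper's estimator).
    err = [b - beta * a for a, b in zip(x_res, y_res)]
    se = math.sqrt(sum((a * e) ** 2 for a, e in zip(x_res, err))) / sxx
    return beta, se


# Hypothetical simulation: u is an unmeasured, spatially smooth confounder.
rng = random.Random(0)
n = 400
true_beta = 1.0
coords = [(rng.random(), rng.random()) for _ in range(n)]
u = [math.cos(2 * math.pi * s1) + math.cos(2 * math.pi * s2) for s1, s2 in coords]
x = [ui + rng.gauss(0.0, 0.5) for ui in u]
y = [true_beta * xi + 2.0 * ui + rng.gauss(0.0, 0.5) for xi, ui in zip(x, u)]

beta_naive = ols_slope(x, y)
beta_two_stage, se_two_stage = two_stage_estimate(coords, x, y, bandwidth=0.05)

print("naive OLS slope:    ", round(beta_naive, 3))
print("two-stage estimate: ", round(beta_two_stage, 3))
print("two-stage std. err.:", round(se_two_stage, 3))
```

In this toy setting the naive slope absorbs the effect of `u`, while the residual-on-residual slope should land much closer to the true coefficient. The bandwidth is fixed by hand here; in practice the smoothness of the trend estimator would need to be chosen or estimated, which is where a likelihood-based Gaussian process fit becomes attractive.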