- The paper introduces a penalized GMM estimator that automatically debiases nonparametric IV estimates by estimating the Riesz representer.
- It achieves valid inference for both linear and nonlinear functionals using cross-fitting and regularized high-dimensional techniques.
- Empirical results highlight improved numerical stability and more accurate confidence intervals, particularly in high-dimensional, ill-posed settings.
Penalized GMM Inference for Functionals of Nonparametric IV Estimators
Introduction and Motivation
This paper introduces a penalized GMM (PGMM) approach for inference on general functionals of nonparametric instrumental variable (NPIV) estimators, focusing on both linear and nonlinear functionals in high-dimensional and ill-posed settings with endogeneity. The main technical contribution is a PGMM estimator for the Riesz representer required for automatic debiasing of regularized ML-based NPIV estimators. The framework is motivated by the inadequacy of plug-in MLIV approaches for valid inference due to first-order regularization bias, and by the increasing prominence of ML instrumental variable (MLIV) estimators, which balance flexibility against ill-posedness through regularization and model selection.
The proposed PGMM-based automatic debiasing mechanism is structurally insensitive to the form of the functional considered, generalizes the Lasso minimum-distance Riesz estimator, and is shown to be the only non-minimax automatic RR estimator in the NPIV context.
Statistical Framework
The canonical NPIV model is
Y = γ₀(X) + ε,  E[ε | Z] = 0,
where X may be high-dimensional and endogenous, Z are instruments, and the object of interest is a functional θ₀ = E[m(W, γ₀)]. The ill-posed nature of the problem is highlighted: estimation of γ₀ amounts to inversion of a compact linear operator, which is highly sensitive to errors, hence regularization is inevitable in practice.
When the inferential target is a functional, the regularization bias propagates nontrivially: plug-in estimators for θ₀ constructed from an MLIV fit γ̂ display severe coverage distortions, since the regularization bias dominates the sampling variability even as n → ∞.
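As a concrete illustration of the setup (a hypothetical toy design, not the paper's simulation), the following sketch generates data from the NPIV model with an unobserved confounder U driving the endogeneity, so that E[ε | X] ≠ 0 while E[ε | Z] = 0:

```python
import numpy as np

# Hypothetical NPIV data-generating sketch: the instrument Z shifts X,
# while a common shock U makes X endogenous.
rng = np.random.default_rng(0)
n = 5000
Z = rng.normal(size=n)                             # instrument
U = rng.normal(size=n)                             # unobserved confounder
X = 0.8 * Z + 0.6 * U + 0.2 * rng.normal(size=n)   # endogenous regressor
gamma0 = lambda x: np.sin(x) + 0.5 * x             # structural function
eps = 0.7 * U + 0.3 * rng.normal(size=n)           # correlated with X, not Z
Y = gamma0(X) + eps

# Endogeneity check: eps correlates with X but is (near) orthogonal to Z.
print(np.corrcoef(X, eps)[0, 1])   # clearly nonzero
print(np.corrcoef(Z, eps)[0, 1])   # approximately zero
```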
The Neyman-orthogonal influence function-based approach is adopted: the construction
ψ(W, θ, γ, α) = m(W, γ) − θ + α(Z)[Y − γ(X)]
eliminates first-order bias in θ̂ by inclusion of the suitable Riesz representer α₀. This representation ensures local robustness (Neyman orthogonality) and admits root-n inference provided appropriate estimation rates for both γ and the Riesz representer α.
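The orthogonal-score estimator implied by this construction can be sketched in a few lines (names and toy inputs are illustrative, not the paper's API; the nuisance fits are assumed precomputed):

```python
import numpy as np

# Minimal sketch of the orthogonal-score estimator for theta0 = E[m(W, gamma0)]:
# the correction term alpha_hat(Z) * (Y - gamma_hat(X)) removes the
# first-order regularization bias of the plug-in average.
def debiased_theta(m_hat, alpha_hat, resid):
    """m_hat[i] = m(W_i, gamma_hat), alpha_hat[i] = alpha_hat(Z_i),
    resid[i] = Y_i - gamma_hat(X_i)."""
    psi = m_hat + alpha_hat * resid            # orthogonal score, up to -theta
    theta = psi.mean()
    se = psi.std(ddof=1) / np.sqrt(len(psi))   # plug-in standard error
    return theta, se

# Toy usage with synthetic score components (illustration only).
rng = np.random.default_rng(1)
m_hat = 2.0 + rng.normal(size=1000)            # stand-in for m(W_i, gamma_hat)
theta, se = debiased_theta(m_hat, rng.normal(size=1000), rng.normal(size=1000))
print(theta, se)
```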
Penalized GMM for Automatic Riesz Representer Estimation
In contrast to parametric and analytic regularization approaches (e.g., explicit Lasso RR, neural nets, minimax strategies), the PGMM estimator for α₀ is constructed by directly exploiting the population orthogonality implied by the linearity of the functional: E[α₀(Z) γ(X)] = E[m(W, γ)] for all γ in the approximating class.
Discretizing this restriction with a dictionary b(X) = (b₁(X), …, b_p(X)) and approximating α₀ within a high-dimensional basis q(Z), so α_ρ(z) = q(z)′ρ, the estimation is posed as a high-dimensional, over-parameterized GMM moment problem with an ℓ₁ penalty:
ĝⱼ(ρ) = (1/n) Σᵢ [ m(Wᵢ, bⱼ) − q(Zᵢ)′ρ · bⱼ(Xᵢ) ],  ρ̂ = argmin_ρ { ĝ(ρ)′ Ŵ ĝ(ρ) + λ‖ρ‖₁ },
yielding α̂(z) = q(z)′ρ̂. The procedure accommodates basis dimensions far exceeding the sample size, with identification secured via a high-dimensional restricted eigenvalue condition. The orthogonality of the influence function motivates this construction and guarantees that the estimator does not require an analytical form for the RR—even for complex nonlinear or ill-posed problems.
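A simplified version of this Riesz representer step can be sketched as follows (identity weighting and a basic proximal-gradient solve, not the paper's exact criterion or implementation; all names are illustrative):

```python
import numpy as np

# Penalized-GMM Riesz representer sketch. With sample moments
#   M[j] = mean_i m(W_i, b_j)  and  G[j, k] = mean_i b_j(X_i) q_k(Z_i),
# solve  min_rho ||M - G rho||^2 + lam * ||rho||_1  by proximal gradient
# (ISTA); the fitted representer is alpha_hat(z) = q(z) @ rho_hat.
def pgmm_riesz(bX, qZ, m_b, lam, n_iter=5000):
    n = bX.shape[0]
    M = m_b.mean(axis=0)                     # (p,) target moments
    G = bX.T @ qZ / n                        # (p, d) cross-moment matrix
    step = 1.0 / np.linalg.norm(G, 2) ** 2   # safe step size (1 / Lipschitz)
    rho = np.zeros(G.shape[1])
    for _ in range(n_iter):
        grad = G.T @ (G @ rho - M)           # gradient of the quadratic part
        z = rho - step * grad
        rho = np.sign(z) * np.maximum(np.abs(z) - step * lam / 2, 0.0)
    return rho

# Toy usage: for the mean functional m(W, gamma) = gamma(X), the RR is
# alpha0(Z) = 1, so rho_hat should load on the constant basis function.
rng = np.random.default_rng(2)
n = 4000
Z = rng.normal(size=n)
X = 0.8 * Z + 0.6 * rng.normal(size=n)
bX = np.column_stack([np.ones(n), X])        # dictionary in X
qZ = np.column_stack([np.ones(n), Z])        # basis in Z
rho = pgmm_riesz(bX, qZ, m_b=bX, lam=0.01)
print(rho)   # approximately [1, 0]
```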
Asymptotic Theory
Linear Functionals
Root-n asymptotic normality holds for the debiased estimator θ̂ if
- γ̂ converges in the projected L₂ (mean-square) norm at some rate r_γ = o(1) (can be slow due to ill-posedness),
- α̂ converges in L₂ at a rate r_α of order √(s log d / n) (with s the RR's effective sparsity, under suitable regularization),
- the product of the two nuisance rates satisfies r_γ · r_α = o(n^{−1/2}),
- regular moment and design conditions are satisfied.
These conditions are highly permissive: projected mean-square convergence for γ̂ is attainable for a wide range of MLIV estimators (Double Lasso, kernel IV, minimax, etc.), and the estimation error in the RR does not inflate sampling error beyond the orthogonalization margin.
Nonlinear Functionals
The extension to nonlinear functionals (e.g., average consumer surplus, own-price elasticity) requires Gateaux/Fréchet differentiability of γ ↦ m(W, γ), restricts attention to functionals for which the Riesz representation remains well-defined and linear in the perturbation direction, and, due to the lack of direct orthogonality, requires stronger convergence of γ̂: fast enough in the standard L₂ norm that the second-order remainder is o_p(n^{−1/2}), typically necessitating ‖γ̂ − γ₀‖ = o_p(n^{−1/4}).
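Why nonlinear functionals demand faster convergence of γ̂ can be seen from a standard second-order expansion (general debiased-ML reasoning, not quoted from the paper): the first-order term is handled by the orthogonal score, so the quadratic remainder must itself vanish faster than the sampling error.

```latex
\theta(\hat\gamma) - \theta(\gamma_0)
  = \underbrace{d\theta(\gamma_0)[\hat\gamma - \gamma_0]}_{\text{removed by the orthogonal score}}
  + \underbrace{O\!\left(\lVert \hat\gamma - \gamma_0 \rVert_{L_2}^{2}\right)}_{\text{must be } o_p(n^{-1/2})}
\;\Longrightarrow\;
\lVert \hat\gamma - \gamma_0 \rVert_{L_2} = o_p(n^{-1/4}).
```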
All inference is performed via sample splitting and cross-fitting. For nonlinear functionals in particular, double cross-fitting is required to prevent overfitting in estimation of the RR moment system, since those moments depend on the estimated γ̂.
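A minimal sketch of the cross-fitting step (with hypothetical fit_gamma / fit_alpha learner callbacks and W collapsed to X for brevity; this is single cross-fitting, not the paper's double cross-fitting code):

```python
import numpy as np

# K-fold cross-fitting for the debiased functional: nuisances are fit on
# the complement of each fold and evaluated only on held-out observations.
def cross_fit_theta(Y, X, Z, m, fit_gamma, fit_alpha, K=5, seed=0):
    n = len(Y)
    folds = np.random.default_rng(seed).integers(0, K, size=n)
    psi = np.empty(n)
    for k in range(K):
        tr, te = folds != k, folds == k
        gamma_k = fit_gamma(Y[tr], X[tr], Z[tr])   # any MLIV learner
        alpha_k = fit_alpha(X[tr], Z[tr])          # any Riesz learner
        psi[te] = m(X[te], gamma_k) + alpha_k(Z[te]) * (Y[te] - gamma_k(X[te]))
    return psi.mean(), psi.std(ddof=1) / np.sqrt(n)

# Toy usage with trivial stand-in learners (illustration only):
rng = np.random.default_rng(3)
n = 2000
Z = rng.normal(size=n)
X = 0.8 * Z + 0.6 * rng.normal(size=n)
Y = 2.0 * X + rng.normal(size=n)
fit_gamma = lambda y, x, z: (lambda v: 2.0 * v)        # "oracle" gamma fit
fit_alpha = lambda x, z: (lambda v: np.zeros(len(v)))  # no-correction stand-in
m = lambda w, g: g(w)                                  # mean functional E[gamma(X)]
theta, se = cross_fit_theta(Y, X, Z, m, fit_gamma, fit_alpha)
print(theta, se)
```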
Empirical and Simulation Results
Multiple Monte Carlo experiments are provided, focusing primarily on the weighted average derivative and own-price elasticity functionals in ill-posed IV systems with moderate to high dimensions.
- Key results: plug-in estimators display severe undercoverage; empirical coverage of nominal 95% confidence intervals collapses quickly as the sample size grows, often to below 5%. Coverage failure is pronounced for functional targets, even at moderate sample sizes, and bias is non-negligible.
- Debiased estimators (both analytical and PGMM): achieve near-nominal coverage (90–96%) across all regimes, with stable bias and variance.
- Numerical stability: The automatic PGMM debiasing procedure exhibits stronger numerical stability and lower variance than analytical RR-based approaches, especially in small-sample/high-dimensional regimes. This is attributed to the avoidance of analytic RR matrix inversions and improved conditioning.
In the context of semiparametric demand estimation for differentiated products using IRI scanner data, semiparametric (automatic debiased) own-price elasticities are approximately 20% more elastic (in magnitude) than parametric logit demand estimates. Importantly, the magnitude and sign of the debiasing correction are heterogeneous across products, ranging from negligible for some SKUs (Stock Keeping Units) to multiples of the analytical standard error for others. This differential effect highlights the practical importance of automatic debiasing for valid inference in empirical IO.
Algorithmic and Computational Aspects
The PGMM optimization leverages coordinate descent with active set and adaptive/diagonal penalty loading variants. Cross-validated selection of the penalty parameter is adopted for stability in finite samples. Full details of efficient high-dimensional implementation, including active-set exploitation for computational gains, are developed and empirically benchmarked.
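A stripped-down version of this solver style can be sketched as follows (identity weighting, fixed penalty, no adaptive loadings; the active-set trick restricts most sweeps to the current nonzero coordinates, with periodic full sweeps so variables can enter or leave):

```python
import numpy as np

# Coordinate descent with soft-thresholding for a lasso-type criterion
#   0.5 * ||y - A w||^2 + lam * ||w||_1
# (a simplified stand-in for the PGMM objective, not the paper's
# production implementation).
def cd_lasso(A, y, lam, sweeps=200, tol=1e-10):
    n, d = A.shape
    col_sq = (A ** 2).sum(axis=0)            # per-coordinate curvature
    w = np.zeros(d)
    r = y.astype(float).copy()               # running residual y - A @ w
    for it in range(sweeps):
        # Every 10th sweep scans all coordinates; others only the support.
        idx = np.arange(d) if it % 10 == 0 else np.flatnonzero(w)
        max_delta = 0.0
        for j in idx:
            if col_sq[j] == 0.0:
                continue
            rho = A[:, j] @ r + col_sq[j] * w[j]          # partial correlation
            w_new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            if w_new != w[j]:
                r -= A[:, j] * (w_new - w[j])             # update residual
                max_delta = max(max_delta, abs(w_new - w[j]))
                w[j] = w_new
        if it % 10 == 0 and max_delta < tol:
            break                             # converged on a full sweep
    return w

# Toy usage: recover a sparse coefficient vector.
rng = np.random.default_rng(4)
A = rng.normal(size=(100, 20))
w_true = np.zeros(20)
w_true[0], w_true[1] = 3.0, -2.0
y = A @ w_true + 0.1 * rng.normal(size=100)
w = cd_lasso(A, y, lam=5.0)
print(np.flatnonzero(np.abs(w) > 0.5))   # expected support: coordinates 0 and 1
```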
Theoretical and Practical Implications, Directions for Future Research
By integrating Neyman-orthogonal machinery, automatic RR estimation, penalized GMM, and modern MLIV estimators, this work provides a robust, theoretically justified, and computationally scalable framework for valid inference on functionals of nonparametric models under ill-posedness and endogeneity.
Theoretical implications:
- This approach removes the obstacle of bias correction for functionals when explicit RR formulas are unavailable, thus generalizing debiased machine learning to the challenging NPIV/MLIV context.
- Rates derived clarify the differing requirements for linear vs. nonlinear functionals, and point to the slowest admissible convergence for valid inference.
Practical implications:
- In high-dimensional structural estimation (demand/IO, policy evaluation), using the proposed framework is essential for valid confidence intervals.
- Heterogeneous debiasing corrections at the functional level indicate that failing to debias can result in severe misestimation of policy-relevant objects—especially in applied work that relies on plug-in ML methods.
- The open-source implementation makes the approach easily applicable to empirical problems in modern econometric practice.
Speculation on future research:
- Extension to irregular functionals and sup-norm inference.
- Relaxing convergence constraints for nonlinear functional inference in ill-posed problems.
- Seamless integration with even more sophisticated ML base-learners (e.g., deep nets, ensemble methods) within the RR estimation framework.
Conclusion
The penalized GMM framework for automatic debiasing of functionals of nonparametric IV estimators enables asymptotically valid, robust inference when modern regularized machine learning approaches are employed for high-dimensional, ill-posed problems. The method is algorithmically practical and statistically optimal under minimal conditions, and empirical evidence from both synthetic and real economic data supports its necessity over conventional plug-in alternatives (2603.29889).