Data-Dependent Critical Value Function
- Data-dependent critical value functions are adaptive thresholds that replace fixed constants with data-specific values to improve hypothesis testing.
- They utilize methods like linear programming, density estimation, and table interpolation to ensure precise size control under diverse conditions.
- Applications include persistent time-series, weak instrumental variable scenarios, and nonparametric dependence tests to boost power and robustness.
A data-dependent critical value function (CVF) is a functional approach to hypothesis testing that replaces the traditional fixed critical constant with a random, data-dependent threshold. The test statistic is compared to a critical value function whose value is constructed from the observed data and model-implied features, so that the rejection region adapts to the structure of nuisance parameters, identification strength, or local deviations from the null hypothesis. This methodology substantially improves size control and power in settings where classical simulation- or bootstrap-based critical values fail, especially under weak identification, high persistence, or complex dependence structures (Moreira et al., 2016, Hoekstra et al., 25 Jan 2026, Ćmiel et al., 23 Dec 2025).
1. Formal Definition and Construction of Data-Dependent Critical Value Functions
The CVF replaces the static critical threshold with a stochastic mapping $X \mapsto c(X)$, where $X$ denotes the observed sample. Formally, for a scalar test statistic $T(X)$ (e.g., a $t$-statistic), the test rejects if $T(X) > c(X)$. The function $c(\cdot)$ is constructed using densities of maximal invariants parameterized across a grid of nuisance parameter values $\gamma_1, \dots, \gamma_J$:

$$c(X) \;=\; \sum_{j=1}^{J} \lambda_j \, \frac{f_{\gamma_j}(X)}{f_{\gamma_0}(X)},$$

where $f_{\gamma}$ denotes the null density of the maximal invariant at nuisance value $\gamma$ and $\gamma_0$ indexes a baseline density. The weights $\lambda_j \ge 0$ are selected so that, at each grid point $\gamma_j$, the null rejection probability equals the nominal level $\alpha$:

$$\Pr_{\gamma_j}\!\big( T(X) > c(X) \big) = \alpha, \qquad j = 1, \dots, J.$$
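As a concrete illustration, the size constraints at the grid points can be imposed by linear programming over simulated draws. The sketch below is a minimal toy version, assuming a Gaussian location family for the maximal invariant and an illustrative alternative; the grid, sample size, and densities are stand-ins, not the specification used in the cited papers. The primal LP chooses the test function; the equality-constraint duals correspond (up to sign and scaling conventions of the solver) to the CVF weights.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
alpha, N = 0.05, 2000

# Baseline null density: N(0, 1); nuisance grid of null means (toy choice).
grid = np.array([-1.0, 0.0, 1.0])
x = rng.standard_normal(N)                       # draws from the baseline

# Likelihood ratios f_{gamma_j}(x) / f_{gamma_0}(x) at each draw, gamma_0 = 0.
LR = np.exp(x[:, None] * grid[None, :] - 0.5 * grid[None, :] ** 2)

# Direction in which to maximize power (illustrative alternative mean of 2).
h = np.exp(2.0 * x - 2.0)

# Primal LP: choose test function phi in [0,1]^N maximizing simulated power
# subject to rejection probability alpha at every nuisance grid point.
res = linprog(c=-h,                              # linprog minimizes, so negate
              A_eq=LR.T / N, b_eq=np.full(grid.size, alpha),
              bounds=(0.0, 1.0), method="highs")
phi = res.x

# Size constraints hold (to solver tolerance) at each grid point:
sizes = LR.T @ phi / N
print(np.round(sizes, 4))                        # ~ [0.05 0.05 0.05]

# Equality-constraint duals: proportional to the CVF weights lambda_j.
lam = res.eqlin.marginals
```

The optimal `phi` is essentially bang-bang: it rejects exactly where the alternative density ratio exceeds the dual-weighted mixture of null likelihood ratios, which is the rejection rule the CVF encodes.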
This paradigm also manifests in conditional tests for weakly identified parameters, where the critical value function is conditioned on observed eigenvalues of the concentration matrix (the GKM framework), and in quantile-dependent critical surfaces for copula independence testing (Hoekstra et al., 25 Jan 2026, Ćmiel et al., 23 Dec 2025).
2. Theoretical Properties: Similarity and Invariance
The CVF methodology is grounded in the concept of approximately similar tests: for all nuisance values $\gamma$ in the nuisance space $\Gamma$, the test maintains

$$\Pr_{\gamma}\!\big( T(X) > c(X) \big) \approx \alpha \qquad \text{for all } \gamma \in \Gamma \text{ under } H_0.$$

This ensures size control even as nuisance parameters approach boundaries (e.g., unit-root settings in persistence models). The critical value function leverages invariance principles, identifying maximal invariants and constructing rejection regions that depend solely on their distributions under $H_0$, yielding tests that are admissible and similar (Moreira et al., 2016).
3. Numerical Algorithms: Linear Programming, Density Estimation, and Table Interpolation
The construction of the CVF entails numerical solutions, chiefly via linear programming (LP):
- LP Solution for CVF Weights: Simulate draws from the baseline density; at each draw, compute the likelihood ratios $f_{\gamma_j}(X)/f_{\gamma_0}(X)$ for each grid point $\gamma_j$; solve the primal LP for maximal rejection subject to the size constraints, and the dual LP for the optimal weights $\lambda_j$.
- Critical Surfaces for Copula-Based Tests: For each point on the copula grid, estimate the sampling distribution of the local test statistic, and set upper and lower critical surfaces (say $c^{+}$ and $c^{-}$) by permutation-based quantiles or asymptotic normality (Ćmiel et al., 23 Dec 2025).
- Conditional Critical Values for AR Tests: Tabulate or interpolate the upper $\alpha$-quantiles of the smallest (or second-smallest) eigenvalues, using the conditional law derived from noncentral Wishart distributions; numerical root-finding is employed in the absence of explicit tables (Hoekstra et al., 25 Jan 2026).
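The permutation-quantile construction of critical surfaces can be sketched as follows. This is a minimal toy version: the local statistic (empirical copula minus the independence copula), the grid, and the Bonferroni split over grid points are illustrative choices, not the exact procedure of Ćmiel et al.

```python
import numpy as np

rng = np.random.default_rng(1)
n, B, alpha = 300, 500, 0.05

x, y = rng.standard_normal(n), rng.standard_normal(n)  # independent toy data
u = np.argsort(np.argsort(x)) / n                      # pseudo-observations
v = np.argsort(np.argsort(y)) / n

grid = np.linspace(0.2, 0.8, 4)                        # copula grid points

def local_stat(u, v, u0, v0):
    # Local dependence statistic: empirical copula minus independence copula.
    return np.mean((u <= u0) & (v <= v0)) - u0 * v0

# Permutation distribution of the local statistic at each grid point.
perm = np.empty((B, grid.size, grid.size))
for b in range(B):
    vp = rng.permutation(v)
    for i, u0 in enumerate(grid):
        for j, v0 in enumerate(grid):
            perm[b, i, j] = local_stat(u, vp, u0, v0)

# Upper/lower critical surfaces; Bonferroni split over the grid for global size.
level = alpha / (2 * grid.size ** 2)
c_hi = np.quantile(perm, 1 - level, axis=0)
c_lo = np.quantile(perm, level, axis=0)
```

A local departure from independence at $(u_0, v_0)$ is then flagged whenever the observed local statistic leaves the band $[c^{-}(u_0,v_0),\, c^{+}(u_0,v_0)]$, so the critical value varies over the copula domain rather than being a single constant.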
4. Asymptotic and Finite-Sample Performance
In finite samples, the LP-based CVF achieves exact size at the grid points and near-uniform size elsewhere by continuity and Glivenko–Cantelli arguments. Asymptotically, under local alternatives (e.g., $\rho_n = 1 + c/n$ in autoregressive models), the behavior aligns with classical LAN, LABF, or LAMN regimes:
- For stationary or explosive regimes ($|\rho| < 1$ or $|\rho| > 1$), the CVF reduces to the usual Gaussian quantile.
- At the boundary ($\rho = 1$, unit root), the LP weights converge to constants that solve boundary equations involving non-pivotal limit distributions, correctly adjusting size (Moreira et al., 2016).
In conditional critical value frameworks for subvector Anderson–Rubin tests, exact or near-exact size control is proven for all nuisance configurations. The new critical value function, conditioned on the second-smallest eigenvalue, is shown to be strictly more powerful and size-correct than the original GKM conditioning in settings with multiple nuisance parameters (Hoekstra et al., 25 Jan 2026).
5. Applications: Persistent Regressors, Weak Identification, and Dependence Detection
The CVF approach is particularly effective when traditional simulation-based critical values fail:
- Highly Persistent Regressors: In time-series models with autoregressive parameter $\rho$ close to one, the size of standard $t$-tests fails near $\rho = 1$. The CVF, through density mixing and adaptive size control, corrects for non-standard null distributions (Moreira et al., 2016).
- Instrumental Variables with Weak Instruments: In subvector AR tests, conditioning on the concentration matrix’s eigenvalues via critical value functions adaptively enhances power and maintains size in weak identification regimes. Usage extends to robustification under approximate Kronecker-product heteroskedasticity (Hoekstra et al., 25 Jan 2026).
- Nonparametric Dependence Detection: CVF-based critical surfaces (or "critical bands") for quantile-dependence functions provide localized hypothesis testing for independence in copula domains, enabling precise localization of dependencies and ensuring global significance control (Ćmiel et al., 23 Dec 2025).
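To see why an adaptive critical value matters in the persistent-regressor case, the following Monte Carlo sketch shows the standard $t$-test with a fixed normal critical value over-rejecting a true null in a predictive regression. All parameter values ($\rho = 0.98$, innovation correlation $-0.9$, the cutoff 1.96) are illustrative, not taken from the cited papers.

```python
import numpy as np

rng = np.random.default_rng(3)
T, M, rho, delta = 200, 1000, 0.98, -0.9      # illustrative design
L = np.linalg.cholesky(np.array([[1.0, delta], [delta, 1.0]]))

rejs = 0
for _ in range(M):
    eps = rng.standard_normal((T, 2)) @ L.T   # correlated innovations
    x = np.zeros(T)
    for t in range(1, T):                     # highly persistent regressor
        x[t] = rho * x[t - 1] + eps[t, 0]
    # Null model: y_t = 0 * x_{t-1} + u_t (true slope beta = 0).
    xr = x[:-1] - x[:-1].mean()
    yr = eps[1:, 1]
    b = (xr @ yr) / (xr @ xr)                 # OLS slope (with intercept)
    resid = yr - yr.mean() - b * xr
    se = np.sqrt((resid @ resid) / (T - 3) / (xr @ xr))
    rejs += abs(b / se) > 1.96                # fixed normal critical value
print(rejs / M)                               # well above the nominal 0.05
```

The over-rejection arises exactly from the non-pivotal null distribution near $\rho = 1$ with correlated innovations, which is the distortion the density-mixing CVF is designed to remove.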
6. Comparative Simulation and Practical Implementation
Simulation evidence demonstrates strict control over type I error:
- In persistent time-series, the CVF keeps the null rejection probability essentially at the nominal level uniformly over a broad grid of $\rho$ values, while resampling methods (bootstrap, subsampling) show substantial distortion near non-pivotal boundaries (up to 50% rejection) (Moreira et al., 2016).
- In instrumental variables applications, the new conditional critical value function, conditioned on the second-smallest eigenvalue, outperforms the GKM method in power whenever more than one nuisance parameter exists, while retaining (conditional and unconditional) size control (Hoekstra et al., 25 Jan 2026).
- For dependence testing using critical surfaces, the approach is consistent: local departures from independence are mapped via the quantile dependence function and detected with the region-specific critical band, maintaining overall size via Bonferroni-style adjustment (Ćmiel et al., 23 Dec 2025).
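By contrast, a data-dependent critical value controls size by construction. A minimal sketch using a permutation-based critical value for an independence test with heavy-tailed marginals (the statistic, sample size, and replication counts are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n, B, M, alpha = 50, 199, 400, 0.05

def perm_test(x, y):
    """Reject independence using a permutation (data-dependent) critical value."""
    t_obs = abs(np.corrcoef(x, y)[0, 1])
    t_perm = np.array([abs(np.corrcoef(x, rng.permutation(y))[0, 1])
                       for _ in range(B)])
    # Data-dependent critical value: upper alpha-quantile of the permutation law.
    return t_obs > np.quantile(t_perm, 1 - alpha)

# Monte Carlo size under true independence, heavy-tailed t(2) marginals.
rejections = sum(perm_test(rng.standard_t(2, n), rng.standard_t(2, n))
                 for _ in range(M))
print(rejections / M)        # should stay close to the nominal 0.05
```

Because the threshold is recomputed from each observed sample, the test remains level-$\alpha$ regardless of the marginal distributions, which is the property a fixed asymptotic critical value cannot guarantee here.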
7. Extensions and Generalizations
Data-dependent CVFs can be generalized to more complex parameter spaces, multi-dimensional test statistics, and settings with complex dependence or heteroskedasticity. For instance, the Kronecker-product robust AR statistic inherits the power and size advantages of the conditional CVF when the critical value is adapted for multiple eigenvalue conditioning (Hoekstra et al., 25 Jan 2026). Quantile-dependent critical surfaces extend to high-dimensional copula testing, with simulation-based or asymptotic quantile construction for localized hypothesis testing (Ćmiel et al., 23 Dec 2025).
A plausible implication is that the CVF methodology may be further extensible to adaptive, high-dimensional inference, potentially motivating new theoretical development in non-standard limit theory and computational statistics.