Misspecified Engle-Granger Test

Updated 30 December 2025

A misspecified Engle-Granger test is a methodological error in which the test is applied to first differences of I(1) series instead of their levels, falsely indicating cointegration.
This misapplication produces false-positive rates approaching 100% in simulation studies, undermining the validity of long-run equilibrium analysis.
Empirical studies in migration and labor market contexts reveal that using differenced data for cointegration tests leads to misleading policy implications.

A misspecified Engle–Granger test refers to the incorrect application of the Engle–Granger (EG) cointegration procedure, wherein the test is applied to first differences of I(1) series rather than their levels. This practice artificially guarantees a rejection of the null hypothesis of no cointegration, yielding spurious statistical evidence for long-run equilibrium relationships that do not exist. Recent critiques demonstrate that such misspecification results in false positive rates approaching 100 percent in simulation studies, undermining any subsequent inference concerning cointegration in macroeconomic or migration data (Rodríguez et al., 24 Dec 2025, Rodriguez et al., 24 Dec 2025).

1. Standard Engle–Granger Procedure

The canonical Engle–Granger two-step procedure is designed to assess cointegration between two I(1) time series $\{x_t\}$ and $\{y_t\}$:

Stage 1 (levels regression): Estimate $y_t = \alpha + \beta x_t + u_t$ via OLS, obtaining residuals $\hat{u}_t = y_t - \hat{\alpha} - \hat{\beta} x_t$.
Stage 2 (ADF test): Test the residuals for a unit root:

$\Delta \hat{u}_t = \phi \hat{u}_{t-1} + \sum_{i=1}^p \gamma_i \Delta \hat{u}_{t-i} + \epsilon_t$

The null hypothesis is $H_0: \phi = 0$ (no cointegration), tested against the one-sided alternative $H_1: \phi < 0$ (cointegration).

The test statistic $\tau = \hat{\phi}/SE(\hat{\phi})$ follows a nonstandard distribution under $H_0$, so standard $t$ tables do not apply. Critical values are provided by MacKinnon (2010) and depend on the deterministic terms included and on the sample size, e.g., $\tau_{5\%} \approx -3.47$, $\tau_{1\%} \approx -4.15$ for models without trend or intercept (Rodríguez et al., 24 Dec 2025).
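The two stages above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: it uses only the standard library, sets the number of lagged differences to $p = 0$ (a plain Dickey-Fuller regression), and the function names `ols_residuals`, `df_tau`, and `engle_granger_tau` are hypothetical. The simulated data are an assumption chosen so that the pair is cointegrated by construction.

```python
import math
import random


def ols_residuals(y, x):
    """Stage 1: residuals from the OLS regression y_t = alpha + beta * x_t + u_t."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return [yi - alpha - beta * xi for xi, yi in zip(x, y)]


def df_tau(u):
    """Stage 2 with p = 0: t-ratio on phi in  du_t = phi * u_{t-1} + e_t."""
    lagged = u[:-1]
    diffs = [b - a for a, b in zip(u[:-1], u[1:])]
    s_ll = sum(v * v for v in lagged)
    phi = sum(v * d for v, d in zip(lagged, diffs)) / s_ll
    resid = [d - phi * v for v, d in zip(lagged, diffs)]
    sigma2 = sum(e * e for e in resid) / (len(diffs) - 1)  # one estimated parameter
    return phi / math.sqrt(sigma2 / s_ll)


def engle_granger_tau(y, x):
    """Engle-Granger statistic computed on the LEVELS of y and x."""
    return df_tau(ols_residuals(y, x))


# Illustrative data (assumed): x is a random walk, y = 1 + 2x + stationary noise.
rng = random.Random(0)
x = [0.0]
for _ in range(199):
    x.append(x[-1] + rng.gauss(0.0, 1.0))
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

tau = engle_granger_tau(y, x)
print(round(tau, 2))  # compare with the critical values quoted above
```

In applied work a library routine such as `coint` in `statsmodels.tsa.stattools` would normally be preferred, since it handles lag selection and supplies MacKinnon critical values; the hand-rolled version here only makes the two stages explicit.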

2. Theoretical Basis for Cointegration Testing in Levels

Cointegration tests are meaningful only when applied to the levels of I(1) series. If $x_t$ and $y_t$ are I(1) but a linear combination $u_t = y_t - \alpha - \beta x_t$ is I(0), then $x_t$ and $y_t$ are said to be cointegrated. The stationarity of $u_t$ implies a long-run equilibrium relationship.

I(1): A process $z_t$ is integrated of order one if $z_t$ itself is nonstationary but its first difference $\Delta z_t$ is I(0) (stationary).
I(0): Weakly stationary (mean, variance, and autocovariances do not change over time).

Testing for cointegration among first differences ($\Delta x_t$, $\Delta y_t$) is invalid because these series are I(0) by construction. Any regression between I(0) series produces I(0) residuals, so the unit-root test on those residuals rejects in virtually every sample. The cointegration test, when misapplied to differences, therefore indicates cointegration regardless of the true data-generating process (Rodríguez et al., 24 Dec 2025).
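A short worked case shows how large the distortion is. The assumptions here are simplifications for illustration and are not taken from the cited papers: $x_t$ and $y_t$ are independent driftless random walks with white-noise increments $\eta_t$ and $\varepsilon_t$, the residual of the differenced regression is written $v_t$, and the unit-root regression uses $p = 0$ lagged differences.

$\Delta y_t = \varepsilon_t, \quad \Delta x_t = \eta_t \quad \Rightarrow \quad v_t = \Delta y_t - \tilde{\alpha} - \tilde{\beta} \Delta x_t \approx \varepsilon_t \quad (\tilde{\alpha}, \tilde{\beta} \to 0)$

$\hat{\phi} = \frac{\sum_t v_{t-1} (v_t - v_{t-1})}{\sum_t v_{t-1}^2} = \frac{\sum_t v_{t-1} v_t}{\sum_t v_{t-1}^2} - 1 \to -1$

$SE(\hat{\phi}) \approx \sqrt{\frac{\sigma^2}{T \sigma^2}} = \frac{1}{\sqrt{T}} \quad \Rightarrow \quad \tau = \frac{\hat{\phi}}{SE(\hat{\phi})} \approx -\sqrt{T}$

With $T = 100$ observations this gives $\tau \approx -10$, far beyond any conventional critical value, even though the two series are unrelated by construction.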

3. Consequences of Misspecification

Misspecification arises from first-differencing I(1) series before cointegration testing:

Misapplied Steps:

Regress $\Delta y_t$ on $\Delta x_t$: $\Delta y_t = \tilde{\alpha} + \tilde{\beta} \Delta x_t + v_t$
Test $v_t$ for a unit root via ADF: $\Delta v_t = \phi v_{t-1} + \sum_{i=1}^p \gamma_i \Delta v_{t-i} + e_t$

False Positives:

Under general conditions, the residual $v_t$ is stationary (I(0)), so the subsequent ADF test rejects the null of no cointegration in essentially every sample. Monte Carlo simulations using independent random walks confirm a 100% empirical rejection rate for the misspecified test, compared with an empirical rate close to the nominal 5% for the correctly specified test (Rodríguez et al., 24 Dec 2025). Table 1 summarizes these rates:

Test Specification	Mean EG Statistic (τ)	Rejection Rate (5%)
EG on levels (correct)	–2.078 (0.843)	5.3%
EG on first differences	–6.956 (1.055)	100%

Figure 1 in Rodríguez–Bravo illustrates dramatically shifted distributions for $\tau$ in the misspecified test.
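A simulation in the same spirit can be sketched as follows. It is not the design used in the cited paper: the sample size (`n=50`), the number of replications, the seed, and the choice of $p = 0$ lagged differences are all assumptions, and the 5% threshold is simply the $-3.47$ value quoted in Section 1. The helper names are hypothetical, and the rates it prints will differ from those in Table 1.

```python
import math
import random


def eg_tau(y, x):
    """Engle-Granger tau with p = 0: OLS of y on (1, x), then a DF regression on residuals."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    u = [b - my - beta * (a - mx) for a, b in zip(x, y)]  # residuals, intercept absorbed
    lag = u[:-1]
    du = [q - p for p, q in zip(u[:-1], u[1:])]
    s_ll = sum(v * v for v in lag)
    phi = sum(v * d for v, d in zip(lag, du)) / s_ll
    sigma2 = sum((d - phi * v) ** 2 for v, d in zip(lag, du)) / (len(du) - 1)
    return phi / math.sqrt(sigma2 / s_ll)


def random_walk(rng, n):
    """Driftless Gaussian random walk of length n."""
    path, level = [], 0.0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def diff(z):
    """First difference of a series."""
    return [b - a for a, b in zip(z[:-1], z[1:])]


def rejection_rates(reps=500, n=50, crit=-3.47, seed=1):
    """Share of replications rejecting 'no cointegration' for independent random walks."""
    rng = random.Random(seed)
    rej_levels = rej_diffs = 0
    for _ in range(reps):
        x, y = random_walk(rng, n), random_walk(rng, n)
        rej_levels += eg_tau(y, x) < crit              # correct: levels
        rej_diffs += eg_tau(diff(y), diff(x)) < crit   # misspecified: first differences
    return rej_levels / reps, rej_diffs / reps


levels_rate, diffs_rate = rejection_rates()
print(f"levels: {levels_rate:.1%}, first differences: {diffs_rate:.1%}")
```

Because the two walks are independent, every rejection is a false positive. The levels test should reject in only a small share of replications, while the differenced test should reject in nearly all of them.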

4. Empirical Examples: Migration and Labor Market Applications

Case studies in recent literature have exposed widespread consequences of misspecified Engle–Granger tests:

Bahar and Hausmann (2025): The apparent cointegration between Venezuelan oil revenues and migration flows results from testing cointegration using first differences. Application to correctly specified levels of the logged variables fails to reject the null of no cointegration in all variants of the test. Consequently, subsequent long-run and error-correction estimations lack foundation (Rodríguez et al., 24 Dec 2025).
Bahar (2025): Cointegration between US job vacancies and Southwest border crossings is similarly artifactual, created by applying EG tests to differenced series. Replication of the correct EG procedure on levels yields test statistics between –1.85 and –4.06, with only one of twelve specifications rejecting at the 5% threshold and the remaining eleven failing to reject (Rodriguez et al., 24 Dec 2025). Absent genuine cointegration, the entire approach to estimating short- and long-run elasticities is uninformative.

5. Methodological Implications

The prevalence of misspecified cointegration testing has several implications for empirical practice:

Test for integration order: Always pre-test each series for I(0) versus I(1).
Apply cointegration tests to levels: Cointegration frameworks (EG, Johansen) should only be applied to I(1) levels, not differences.
Avoid pre-differencing: Differencing before cointegration testing induces spurious findings.
Post-cointegration estimation: Upon finding genuine cointegration, estimate the long-run vector using bias-corrected estimators (FM-OLS, DOLS).
Include deterministic terms: Add drift, trend, or seasonal dummies as justified by theory.
Critical values: Employ appropriate critical values (e.g., MacKinnon tables), matching test specification.
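The first two recommendations can be sketched as a simple pre-test that classifies a series before any cointegration test is run. This is an illustrative sketch under stated assumptions: it uses a Dickey-Fuller regression with an intercept and no lagged differences, the threshold `-2.86` is the approximate asymptotic 5% Dickey-Fuller value for that specification, and the names `df_tau_const`, `integration_order`, and `random_walk` are hypothetical. A production workflow would use an augmented test with data-driven lag selection.

```python
import math
import random


def df_tau_const(z):
    """t-ratio on phi in  dz_t = c + phi * z_{t-1} + e_t  (intercept, p = 0)."""
    lag = z[:-1]
    dz = [b - a for a, b in zip(z[:-1], z[1:])]
    n = len(dz)
    ml, md = sum(lag) / n, sum(dz) / n
    s_ll = sum((v - ml) ** 2 for v in lag)
    phi = sum((v - ml) * (d - md) for v, d in zip(lag, dz)) / s_ll
    c = md - phi * ml
    sigma2 = sum((d - c - phi * v) ** 2 for v, d in zip(lag, dz)) / (n - 2)
    return phi / math.sqrt(sigma2 / s_ll)


def integration_order(z, crit=-2.86):
    """Return 0 if the level rejects a unit root, 1 if only the difference does, else None."""
    if df_tau_const(z) < crit:
        return 0
    dz = [b - a for a, b in zip(z[:-1], z[1:])]
    if df_tau_const(dz) < crit:
        return 1
    return None  # undetermined: possibly a higher order of integration


def random_walk(rng, n):
    """Driftless Gaussian random walk of length n."""
    path, level = [], 0.0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


# Illustrative check (assumed data): most simulated random walks should be classified I(1),
# and a white-noise series should be classified I(0).
rng = random.Random(2)
walks = [random_walk(rng, 200) for _ in range(200)]
share_i1 = sum(integration_order(w) == 1 for w in walks) / len(walks)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
print(share_i1, integration_order(noise))
```

Only when both series are classified as I(1) does it make sense to run the Engle–Granger regression, and then on the levels. A series classified as I(0), which is what first differences of I(1) data are, should not enter a cointegration test.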

A plausible implication is that results from models premised on spurious cointegration cannot be trusted for policy analysis, as short-run and long-run elasticities derived from misspecified regressions have no equilibrium interpretation.

6. Controversy and Correction in Recent Literature

The misapplication of the Engle–Granger test has led to significant reversals of claimed relationships in the literature on migration and macroeconomic flows. Notably, Bahar and Hausmann’s core findings regarding the linkage of Venezuelan oil revenues with US migration, and Bahar’s estimated labor-market effects on migration, are invalidated on methodological grounds (Rodríguez et al., 24 Dec 2025, Rodriguez et al., 24 Dec 2025). In both cases, the absence of cointegration in the levels of the series nullifies policy claims about equilibrium relationships and adjustment mechanisms. This suggests a need for heightened diagnostic rigor and re-evaluation of empirical strategies in time-series econometrics.

7. Summary Table: Correct vs. Misspecified Engle–Granger Procedure

Step	Correct EG Test (Levels)	Misspecified EG Test (First Differences)
Regression	$y_t = \alpha + \beta x_t + u_t$	$\Delta y_t = \tilde{\alpha} + \tilde{\beta} \Delta x_t + v_t$
Residual Unit Root	$\Delta u_t = \phi u_{t-1} + \cdots$	$\Delta v_t = \phi v_{t-1} + \cdots$
Statistical Outcome	Cointegration only if $u_t$ is I(0)	Always finds “cointegration”

The misspecified Engle–Granger test on first differences rejects the null of no cointegration in essentially every sample, whether or not a long-run relationship exists. Its results therefore support no substantive interpretation of cointegrating behavior or long-run elasticities in empirical applications.

References (2)

Why Bahar and Hausmann Tell Us Nothing About Venezuelan Migration Flows to the United States (2025)

US labor market conditions and migration: a reassessment of Bahar (2025) (2025)
