
Bahar–Hausmann Regressions: Analysis & Errors

Updated 30 December 2025
  • Bahar–Hausmann regressions are an empirical approach aimed at linking Venezuelan oil revenues to migration flows using cointegration techniques.
  • The method incorrectly applies the Engle–Granger test to first-differenced, rather than level, data, resulting in a 100% false positive rate in simulations.
  • Monte Carlo evidence and theoretical analysis underscore the importance of proper integration pre-testing and specification to ensure valid long-run inference.

Bahar–Hausmann regressions refer to an empirical strategy employed by Bahar and Hausmann to investigate long-run relationships between Venezuelan oil revenues and U.S. border encounters of Venezuelan nationals. The approach applies cointegration techniques, specifically the Engle–Granger two-step method, to assess whether oil income and migration are linked in the long run. The defining feature of the Bahar–Hausmann implementation, and its principal flaw, is that the Engle–Granger test is applied to first differences of the variables rather than to their levels, an error with significant implications for statistical inference and interpretation (Rodríguez et al., 24 Dec 2025).

1. Bahar–Hausmann Regression Framework

The empirical approach in question is motivated by a narrative that oil sanctions, by reducing Venezuelan oil income, could be expected to impact migration outflows, measurable via U.S. border encounters. The appropriate econometric specification for capturing a long-run equilibrium relationship in this context would involve a cointegrating regression in the levels (typically log-transformed) of the two series. For example,

$$\log \text{Crossing}_{my} = \alpha + \beta \log \text{Oil}_{my} + \mu_m + \nu_y + \varepsilon_{my}$$

where $\text{Crossing}_{my}$ denotes monthly U.S. border encounters of Venezuelan nationals and $\text{Oil}_{my}$ denotes Venezuelan oil revenues. The presence of cointegration is associated with a stationary (I(0)) residual $\varepsilon_{my}$. An alternative but complementary approach considers year-over-year changes (first differences) within an ARDL or error-correction structure, retaining the error-correction term (ECT) from the levels regression. However, Bahar and Hausmann diverged from these conventions by conducting cointegration testing on first differences of unlogged series, i.e., on $\Delta\text{Crossing}_t$ and $\Delta\text{Oil}_t$, rather than on their levels (Rodríguez et al., 24 Dec 2025).

2. The Engle–Granger Cointegration Test: Foundations and Correct Specification

The Engle–Granger (1987) method is the standard residual-based approach for testing cointegration between two potentially nonstationary (I(1)) time series. Its canonical implementation consists of two steps:

  1. Regression in Levels: Estimate $Y_t = \hat{\alpha} + \hat{\beta} X_t + \hat{\varepsilon}_t$ by OLS. Under cointegration, $\hat{\varepsilon}_t$ is stationary (I(0)); otherwise, it remains I(1).
  2. ADF Test on Residuals: Conduct a Dickey–Fuller-type test on the residuals:

$$\Delta \hat{\varepsilon}_t = (\rho-1)\hat{\varepsilon}_{t-1} + u_t$$

A test statistic more negative than the relevant MacKinnon critical value rejects the null hypothesis of a unit root in the residuals, which indicates cointegration between the levels of $X_t$ and $Y_t$.
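
The two steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the cited authors' code: it uses only the standard library, omits lag augmentation in the Dickey–Fuller step, and assumes an approximate asymptotic 5% critical value of -3.34 for two variables with a constant. The function names (`ols`, `df_tstat`, `engle_granger`) and the simulated data are hypothetical.

```python
import random

# Assumed: approximate asymptotic 5% MacKinnon critical value for the
# Engle-Granger residual test with two variables and a constant.
EG_CRIT_5PCT = -3.34

def ols(y, x):
    """OLS of y on a constant and x; returns (alpha, beta, residuals)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    resid = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    return alpha, beta, resid

def df_tstat(e):
    """t-statistic on (rho - 1) in: delta e_t = (rho - 1) e_{t-1} + u_t."""
    lagged = e[:-1]
    delta = [e[t] - e[t - 1] for t in range(1, len(e))]
    sll = sum(l * l for l in lagged)
    gamma = sum(l * d for l, d in zip(lagged, delta)) / sll
    ssr = sum((d - gamma * l) ** 2 for l, d in zip(lagged, delta))
    se = (ssr / (len(delta) - 1) / sll) ** 0.5
    return gamma / se

def engle_granger(y, x, crit=EG_CRIT_5PCT):
    """Two-step test on LEVELS; returns (statistic, reject_null)."""
    _, _, resid = ols(y, x)   # step 1: cointegrating regression in levels
    stat = df_tstat(resid)    # step 2: Dickey-Fuller test on the residuals
    return stat, stat < crit

# Illustrative data (assumed): x is a random walk and y tracks it with
# stationary noise, so the pair is cointegrated by construction.
rng = random.Random(0)
x = [0.0]
for _ in range(299):
    x.append(x[-1] + rng.gauss(0, 1))
y = [2.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]

stat, reject = engle_granger(y, x)
print(f"EG statistic: {stat:.2f}, reject no-cointegration null: {reject}")
```

In applied work one would more likely rely on a tested library routine with lag selection and finite-sample critical values, such as `coint` in `statsmodels.tsa.stattools`; the hand-rolled version here only makes the two steps explicit.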

3. Misspecification in Bahar–Hausmann Procedure

Bahar and Hausmann applied the Engle–Granger procedure not to the levels but to the first differences of the series. Explicitly, they estimated

$$\Delta Y_t = a + b \Delta X_t + e_t$$

and then performed an ADF test on $e_t$ in

$$\Delta e_t = (\phi-1) e_{t-1} + v_t$$

If $X_t$ and $Y_t$ are I(1), then $\Delta X_t$ and $\Delta Y_t$ are I(0) by construction, so the residual $e_t$ from this regression is I(0) under very general conditions. The ADF test will therefore almost always reject the null of a unit root, wrongly indicating cointegration. This diagnostic error virtually guarantees a spurious "finding" of cointegration: the rejection reflects the stationarity of the first differences, not a long-run relationship between the original nonstationary series (Rodríguez et al., 24 Dec 2025).

4. Monte Carlo Evidence on Misspecification

Monte Carlo simulations conducted by Rodríguez and Bravo provide quantitative evidence of the flaw in the Bahar–Hausmann approach. The simulations generate 1,000 replications using independent random walks $X_t = X_{t-1} + u_{1t}$ and $Y_t = Y_{t-1} + u_{2t}$, each I(1) and with no cointegration by construction:

  • The correct Engle–Granger test, applied to levels, rejects the null of no cointegration in approximately 5.3% of cases, in line with the nominal size of the test (the expected Type I error rate).
  • The misspecified Bahar–Hausmann test, applied to first differences, rejects in 100% of cases, showing a complete lack of size control and a 100% false positive rate.

This empirically confirms that the misspecified approach will systematically result in spurious inference (Rodríguez et al., 24 Dec 2025).

| Test | Empirical rejection rate | Interpretation |
|---|---|---|
| Engle–Granger (levels, correct) | 5.3% | Appropriate size |
| Bahar–Hausmann (differences, misspecified) | 100% | Systematic false positive |
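
An experiment of this kind can be approximated with a short standard-library Python script. The sketch below is not the cited authors' code. The source does not report the sample length, innovation distribution, lag choice or seed, so these are all assumptions: 100 observations, Gaussian innovations, no lag augmentation, seed 0, and an approximate asymptotic 5% critical value of -3.34. All function names are hypothetical.

```python
import random

EG_CRIT_5PCT = -3.34  # assumed approximate asymptotic 5% critical value

def ols_resid(y, x):
    """Residuals from OLS of y on a constant and x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    beta = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    alpha = my - beta * mx
    return [yi - alpha - beta * xi for xi, yi in zip(x, y)]

def df_tstat(e):
    """Dickey-Fuller t-statistic on the residuals (no constant, no lags)."""
    lagged = e[:-1]
    delta = [e[t] - e[t - 1] for t in range(1, len(e))]
    sll = sum(l * l for l in lagged)
    gamma = sum(l * d for l, d in zip(lagged, delta)) / sll
    ssr = sum((d - gamma * l) ** 2 for l, d in zip(lagged, delta))
    return gamma / (ssr / (len(delta) - 1) / sll) ** 0.5

def random_walk(T, rng):
    """Driftless random walk of length T with standard normal innovations."""
    path = [0.0]
    for _ in range(T - 1):
        path.append(path[-1] + rng.gauss(0, 1))
    return path

def diff(s):
    """First difference of a series."""
    return [s[t] - s[t - 1] for t in range(1, len(s))]

def simulate(reps=1000, T=100, seed=0):
    """Rejection rates for the levels test and the differenced test."""
    rng = random.Random(seed)
    rej_levels = rej_diffs = 0
    for _ in range(reps):
        x, y = random_walk(T, rng), random_walk(T, rng)  # independent walks
        if df_tstat(ols_resid(y, x)) < EG_CRIT_5PCT:
            rej_levels += 1
        if df_tstat(ols_resid(diff(y), diff(x))) < EG_CRIT_5PCT:
            rej_diffs += 1
    return rej_levels / reps, rej_diffs / reps

levels_rate, diffs_rate = simulate()
print(f"levels: {levels_rate:.3f}  differences: {diffs_rate:.3f}")
```

Under these assumptions the levels rate should land near the nominal 5% and the differences rate near 1, which is the qualitative pattern in the table. The exact figures depend on the seed and sample length.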

5. Theoretical Rationale for the Flaw

If $X_t$ and $Y_t$ are I(1), their first differences $\Delta X_t$, $\Delta Y_t$ are I(0). OLS regression of $\Delta Y_t$ on $\Delta X_t$ yields residuals $e_t$ that are also I(0) due to properties of linear combinations of stationary series. As a result, ADF tests on $e_t$ will virtually always reject the null of a unit root, not because of any long-run cointegrating relationship, but simply as an artifact of testing on stationary (I(0)) data. Rejecting the unit root null in this context therefore conveys nothing about the cointegration properties of the levels, resulting in entirely spurious inference (Rodríguez et al., 24 Dec 2025).
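
As a worked illustration, take the independent random walks of the simulation design in Section 4, with the added assumption that $u_{1t}$ and $u_{2t}$ are zero-mean, mutually independent white-noise innovations with finite variance. The following sketch, written in population terms and ignoring sampling error, traces why the Dickey–Fuller statistic on $e_t$ is strongly negative:

$$
\begin{aligned}
\Delta X_t &= u_{1t}, \qquad \Delta Y_t = u_{2t}, \\
b &= \frac{\operatorname{Cov}(u_{2t}, u_{1t})}{\operatorname{Var}(u_{1t})} = 0, \qquad a = 0 \quad\Rightarrow\quad e_t \approx u_{2t}, \\
\phi &= \operatorname{Corr}(e_t, e_{t-1}) = 0 \quad\Rightarrow\quad \phi - 1 = -1, \\
t_{\hat{\phi}-1} &\approx \frac{-1}{1/\sqrt{T}} = -\sqrt{T}.
\end{aligned}
$$

With $T = 100$, for example, the statistic is roughly $-10$, far beyond conventional critical values, so rejection is essentially guaranteed even though the levels are not cointegrated.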

6. Best-Practice Guidelines for Cointegration Analysis

To ensure valid inference in Bahar–Hausmann-type applications or similar cointegration studies, the following procedural guidelines are foundational:

  1. Integration Pre-testing: Establish the order of integration for each series using (A)DF tests on both levels and differences. Only proceed to cointegration testing if both series are convincingly I(1).
  2. Cointegration in Levels: Apply Engle–Granger or Johansen tests to the series in levels (in logs or in raw levels, as dictated by theory), never to their differences.
  3. Critical Value Selection: Use the MacKinnon critical values tabulated for the Engle–Granger residual-based test; do not substitute standard t-tables.
  4. Long-run Estimation: If cointegration is found, estimate long-run coefficients using dynamic OLS, FM-OLS, or comparable methods to mitigate endogeneity and serial correlation.
  5. Error-correction Models: Build ARDL or error-correction models in differences, always incorporating the lagged cointegrating residual as the error-correction term (ECT).
  6. Transformation Consistency: Align the transformation (log/level/differences) between cointegration tests and subsequent regressions; mismatch leads to inconsistent inference.
  7. Seasonal and Structural Breaks: Include dummies for structural or seasonal effects directly in the cointegrating regression when warranted by data characteristics (Rodríguez et al., 24 Dec 2025).
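
The first two guidelines can be sketched as a simple gate in Python. This is a minimal, hypothetical illustration using only the standard library: it runs a Dickey–Fuller regression with a constant and no lag augmentation, and assumes an approximate asymptotic 5% critical value of -2.86. The helper names (`df_tstat_const`, `integration_order`, `ready_for_cointegration`) are invented for this example. A real analysis would typically use an augmented test with lag selection, for instance `adfuller` in `statsmodels.tsa.stattools`.

```python
import random

DF_CRIT_5PCT = -2.86  # assumed approx. asymptotic 5% DF value (constant, no trend)

def diff(s):
    """First difference of a series."""
    return [s[t] - s[t - 1] for t in range(1, len(s))]

def df_tstat_const(y):
    """t-statistic on gamma in: delta y_t = c + gamma * y_{t-1} + u_t."""
    lagged, delta = y[:-1], diff(y)
    n = len(delta)
    ml, md = sum(lagged) / n, sum(delta) / n
    sll = sum((l - ml) ** 2 for l in lagged)
    gamma = sum((l - ml) * (d - md) for l, d in zip(lagged, delta)) / sll
    c = md - gamma * ml
    ssr = sum((d - c - gamma * l) ** 2 for l, d in zip(lagged, delta))
    return gamma / (ssr / (n - 2) / sll) ** 0.5

def integration_order(y, crit=DF_CRIT_5PCT):
    """0 if the unit root is rejected in levels, 1 if rejected only after
    differencing once, None if neither test rejects."""
    if df_tstat_const(y) < crit:
        return 0
    if df_tstat_const(diff(y)) < crit:
        return 1
    return None

def ready_for_cointegration(y, x):
    """Gate: proceed to a levels cointegration test only if both are I(1)."""
    return integration_order(y) == 1 and integration_order(x) == 1

# Illustrative data (assumed): two independent random walks.
rng = random.Random(0)
x, y = [0.0], [0.0]
for _ in range(299):
    x.append(x[-1] + rng.gauss(0, 1))
    y.append(y[-1] + rng.gauss(0, 1))

print("order of y in levels:", integration_order(y))
print("order of differenced y:", integration_order(diff(y)))
print("differenced pair passes gate:", ready_for_cointegration(diff(y), diff(x)))
```

The gate makes the earlier point operational: already-differenced series test as I(0), so they fail the I(1) requirement and should not be passed to a cointegration test.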

7. Implications and Controversies

The central controversy arising from the Bahar–Hausmann regressions pertains to the misapplication of cointegration tests, resulting in invalid inferences regarding the long-run relationship between migration flows and oil revenues. The 100% spurious rejection rate demonstrated both theoretically and empirically implies that any findings derived from their procedure offer no valid basis for inference. The episode underscores the necessity of rigorous adherence to the statistical theory underlying cointegration and the dangers of specification error, particularly regarding the proper dimension (levels versus differences) upon which such relationships are to be tested (Rodríguez et al., 24 Dec 2025).

References (1)
