- The paper introduces a hidden Markov model to capture cointegration among Brent, WTI, and Shanghai futures, enabling profitable statistical arbitrage.
- It employs stochastic filtering and an EM algorithm to estimate regime-switching parameters in an OU process modeling the spread.
- Results show that strategies including Shanghai futures yield higher returns, with the PredI strategy achieving a 72.77% annualized return.
Statistical Arbitrage via Hidden Markov Models
This paper (2309.00875) introduces a statistical arbitrage strategy in international crude oil futures markets, focusing on Brent, WTI, and Shanghai futures. It models the cointegration spread using a mean-reverting regime-switching process governed by a hidden Markov chain, and employs stochastic filtering and the EM algorithm for parameter estimation and strategy implementation. The analysis reveals that including Shanghai futures yields profitable arbitrage opportunities, even with conservative transaction costs.
Cointegration and Spread Modeling
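The study begins by establishing cointegration among Brent, WTI, and Shanghai crude oil futures prices. The spread process $S_t$ is defined as a linear combination of these futures prices:
$$S_t := \lambda_0 + \sum_{i \in \{B, S, W\}} \lambda_i F_t^i$$
where $F_t^i$ is the time-$t$ price of futures contract $i$ (B for Brent, S for Shanghai, W for WTI) and $\lambda = (\lambda_0, \lambda_B, \lambda_S, \lambda_W)$ is the cointegration vector.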
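As a minimal sketch of the spread definition, the following Python function combines three price series with a cointegration vector. The prices and weights are hypothetical values chosen for illustration; they are not estimates from the paper.

```python
from typing import Dict, List


def spread(prices: Dict[str, List[float]], lam0: float, lam: Dict[str, float]) -> List[float]:
    """Return S_t = lam0 + sum_i lam[i] * F_t^i for every date t."""
    n = len(next(iter(prices.values())))
    if any(len(series) != n for series in prices.values()):
        raise ValueError("all price series must have the same length")
    return [lam0 + sum(lam[k] * prices[k][t] for k in lam) for t in range(n)]


# Hypothetical futures prices (B = Brent, S = Shanghai, W = WTI) and weights.
prices = {
    "B": [80.0, 81.0, 79.5],
    "S": [78.0, 79.5, 77.0],
    "W": [75.0, 76.0, 74.5],
}
lam = {"B": 1.0, "S": -0.5, "W": -0.5}
s = spread(prices, lam0=-3.0, lam=lam)
print(s)
```

In practice the vector would come from a cointegration estimation step on the training sample rather than being set by hand.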
To capture time-varying cointegration regimes, the paper models the spread as an OU-HMM:
$$dS_t = a(X_t)\,\bigl(\beta(X_t) - S_t\bigr)\,dt + \xi(X_t)\,dW_t$$
where $X_t$ is a Markov chain with $N$ states, and $a(X_t)$, $\beta(X_t)$, and $\xi(X_t)$ are functions representing the speed of mean reversion, long-run mean, and volatility in each regime, respectively. As the Markov chain is unobserved, stochastic filtering techniques are used to estimate the current regime and model parameters.
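To make the regime-switching dynamics concrete, here is a rough Euler-Maruyama simulation of an OU process whose parameters depend on a two-state chain. The one-step transition matrix `P`, the step size, and all parameter values are assumptions made for this sketch, and the scheme is a generic one, not necessarily the paper's.

```python
import math
import random


def simulate_ou_hmm(a, beta, xi, P, s0, x0, n_steps, dt, seed=0):
    """Euler-Maruyama path of dS = a(X)(beta(X) - S) dt + xi(X) dW.

    P[i][j] is the assumed one-step probability of moving from regime i to j.
    Returns (spread path, regime path), each of length n_steps + 1.
    """
    rng = random.Random(seed)
    s, x = [s0], [x0]
    for _ in range(n_steps):
        i = x[-1]
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        s.append(s[-1] + a[i] * (beta[i] - s[-1]) * dt + xi[i] * dw)
        # Draw the next regime from row i of the transition matrix.
        x.append(rng.choices(range(len(P)), weights=P[i])[0])
    return s, x


# Hypothetical two-regime parameters: a calm regime and a volatile one.
path, regimes = simulate_ou_hmm(
    a=[5.0, 1.0], beta=[0.0, 0.5], xi=[0.2, 0.8],
    P=[[0.98, 0.02], [0.05, 0.95]],
    s0=0.3, x0=0, n_steps=500, dt=1.0 / 252,
)
print(len(path), sorted(set(regimes)))
```

Fixing the seed keeps the path reproducible, which is convenient when comparing filters or strategies on the same simulated data.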
The parameters $a$, $\beta$, $\xi$ and the transition matrix $\Pi$ are estimated using a filter-based EM algorithm. The continuous-time model is discretized to accommodate discrete observations and facilitate parameter estimation. The discrete-time version of the spread is given by:
$$y_{t+1} = \gamma(X_t) + \alpha(X_t)\, y_t + \eta(X_t)\, z_t$$
where $y_t$ represents the discrete-time spread, $z_t$ is a sequence of i.i.d. standard normal random variables, and $\alpha(X_t)$, $\gamma(X_t)$, and $\eta(X_t)$ are functions of the Markov chain state.
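The sketch below illustrates two ideas behind the discrete-time model. `discretize` applies the standard exact discretization of an OU process with the regime held fixed over one step, which is an assumption here and may differ from the paper's scheme. `regime_filter` is a basic normalized forward (Hamilton-style) filter for the regime probabilities, not the paper's filter-based EM recursions. All numerical values are hypothetical.

```python
import math


def discretize(a, beta, xi, dt):
    """Map OU parameters (a, beta, xi) to AR(1) parameters (alpha, gamma, eta)."""
    alpha = math.exp(-a * dt)
    gamma = beta * (1.0 - alpha)
    eta = xi * math.sqrt((1.0 - alpha ** 2) / (2.0 * a))
    return alpha, gamma, eta


def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def regime_filter(y, alpha, gamma, eta, P, prior):
    """Filtered regime probabilities after each observed move y_t -> y_{t+1}.

    out[t][j] approximates P(X_{t+1} = j | y_0, ..., y_{t+1}).
    """
    n = len(prior)
    probs, out = list(prior), []
    for t in range(len(y) - 1):
        # Weight each regime by the likelihood of the observed transition.
        w = [probs[i] * normal_pdf(y[t + 1], gamma[i] + alpha[i] * y[t], eta[i])
             for i in range(n)]
        total = sum(w)
        if total == 0.0:
            raise ValueError("observation has zero likelihood under every regime")
        post = [wi / total for wi in w]
        # Propagate one step through the transition matrix.
        probs = [sum(post[i] * P[i][j] for i in range(n)) for j in range(n)]
        out.append(probs)
    return out


# Hypothetical two-regime example: the data sit near regime 0's long-run level.
filtered = regime_filter(
    y=[0.0, 0.0, 0.0, 0.0],
    alpha=[0.5, 0.5], gamma=[0.0, 5.0], eta=[1.0, 1.0],
    P=[[0.95, 0.05], [0.05, 0.95]], prior=[0.5, 0.5],
)
print(filtered[-1])
```

An EM procedure would alternate between running such a filter with the current parameters and re-estimating the parameters from the filtered quantities.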
Statistical Arbitrage Strategies
The paper outlines five statistical arbitrage strategies based on opening and closing signals derived from the spread's deviation from its equilibrium value. Each strategy uses the cointegration vector to determine trading positions, with positions opened when the spread deviates sufficiently from equilibrium and closed upon reversion.
- Plain Vanilla (PV): A position is opened whenever the spread differs from zero.
- Probability Interval (ProbI): Positions are opened when the spread exceeds a dynamically estimated Bollinger Band.
- Prediction Interval (PredI): Positions are opened based on a forward-looking prediction interval derived from the OU-HMM.
- Realized Increment (RI): Positions are opened when the spread increment is significantly larger than usual, based on empirical quantiles.
- Predicted Increment (PI): Positions are opened based on predicted spread increments derived from the OU-HMM.
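The open/close logic shared by the band-style strategies above can be sketched as a small state machine. This is a simplified illustration with a fixed equilibrium level `mean`, a fixed dispersion `sd`, and a hypothetical width multiplier `k`; the paper's ProbI and PredI bands are estimated dynamically from the OU-HMM and are not reproduced here.

```python
def band_positions(spread, mean, sd, k=1.0):
    """Position in the spread per date: +1 long, -1 short, 0 flat."""
    pos, current = [], 0
    for s in spread:
        if current == 0:
            if s > mean + k * sd:
                current = -1  # spread above the upper band: short it
            elif s < mean - k * sd:
                current = 1   # spread below the lower band: go long
        elif (current == -1 and s <= mean) or (current == 1 and s >= mean):
            current = 0       # spread has reverted to equilibrium: close
        pos.append(current)
    return pos


# Hypothetical spread values around an equilibrium of zero.
positions = band_positions([0.0, 1.5, 0.8, -0.1, -2.0, -0.5, 0.2], mean=0.0, sd=1.0)
print(positions)  # [0, -1, -1, 0, 1, 1, 0]
```

A position of +1 or -1 in the spread would then be translated into positions in the individual futures through the cointegration vector.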
These strategies are compared against passive benchmarks: a buy-and-hold position on the S&P 500 index and the Invesco DB Oil Fund ETF. Performance is evaluated using annualized returns and Sharpe ratios, accounting for proportional transaction costs.
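The evaluation metrics can be sketched as follows. The definitions used here are common conventions and are assumptions for this example: costs are charged as a fixed proportion of each change in position, returns are compounded geometrically, and the Sharpe ratio is annualized with a factor of the square root of 252. The paper's exact definitions may differ.

```python
import math
import statistics


def net_returns(gross, positions, cost):
    """Subtract proportional costs: cost * |change in position| each period."""
    out, prev = [], 0
    for r, p in zip(gross, positions):
        out.append(r - cost * abs(p - prev))
        prev = p
    return out


def annualized_return(returns, periods_per_year=252):
    """Geometric annualized return from per-period simple returns."""
    growth = math.prod(1.0 + r for r in returns)
    return growth ** (periods_per_year / len(returns)) - 1.0


def sharpe_ratio(returns, periods_per_year=252, risk_free=0.0):
    """Annualized Sharpe ratio; risk_free is an annual rate."""
    excess = [r - risk_free / periods_per_year for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)


# Hypothetical daily gross returns with a position opened on the first day.
net = net_returns([0.010, 0.020, -0.005], positions=[1, 1, 0], cost=0.008)
print(net, annualized_return(net), sharpe_ratio(net))
```

Here `cost=0.008` corresponds to 80 bps per unit of position change, charged once on opening and once on closing.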




Figure 1: Opening/closing signals of the statistical arbitrage strategies over the test sample.
Empirical Results and Analysis
The empirical analysis uses daily and weekly futures prices for Brent, WTI, and Shanghai futures from March 2018 to June 2023, dividing the data into training and test samples. Unit root tests confirm that the futures prices are integrated of order one, and cointegration analysis indicates one cointegrating relationship among the three futures. Estimation of the OU-HMM parameters reveals that a two-state Markov chain is optimal for capturing regime switching in the spread process.
The results indicate that strategies incorporating the Shanghai futures exhibit higher returns, with the PredI strategy achieving the highest annualized return of 72.77% and a Sharpe ratio of 1.0108 under transaction costs of 80 bps. The adjustment coefficients from the VECM suggest that the Shanghai futures price adjusts to the long-run equilibrium faster than Brent and WTI, which may explain the higher profitability of strategies involving Shanghai futures.

Figure 2: Performance of the strategies with respect to $c$; $t_0$ = 03/26/2018, $t_B$ = 07/01/2022.
Monte Carlo simulations are used to assess the riskiness of the statistical arbitrage strategies. VaR estimates reveal that strategies involving Shanghai futures carry considerable financial risk. A sensitivity analysis is performed by varying transaction costs, the starting date of the sample, and the breaking date between training and test sets. The performance is robust with respect to transaction costs, as the S&P 500 passive strategy only becomes preferable when costs approach 1%.
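A Monte Carlo VaR estimate of the kind described above can be sketched with an empirical quantile over simulated outcomes. For brevity this example simulates a single-regime AR(1) spread and the profit and loss of holding one unit of it, which is a strong simplification of the paper's regime-switching setup. All parameter values are hypothetical.

```python
import math
import random


def value_at_risk(pnl, level=0.95):
    """Empirical VaR: the loss exceeded in roughly (1 - level) of scenarios."""
    ordered = sorted(pnl)
    # Small epsilon guards against floating-point error in (1 - level) * n.
    idx = math.floor((1.0 - level) * len(ordered) + 1e-9)
    return -ordered[idx]


def simulate_pnl(n_paths, n_steps, alpha, gamma, eta, y0, seed=0):
    """P&L of holding one unit of an AR(1) spread for n_steps periods."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_paths):
        y = y0
        for _ in range(n_steps):
            y = gamma + alpha * y + eta * rng.gauss(0.0, 1.0)
        out.append(y - y0)
    return out


# Hypothetical parameters: 2000 simulated paths of 20 periods each.
pnl = simulate_pnl(n_paths=2000, n_steps=20, alpha=0.9, gamma=0.0, eta=1.0, y0=0.0)
var_95 = value_at_risk(pnl, level=0.95)
print(var_95)
```

Replacing `simulate_pnl` with a simulation of the full strategy, including regime switches and transaction costs, would give a risk estimate closer in spirit to the one reported.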
Conclusion
The study demonstrates the potential for profitable statistical arbitrage strategies in international crude oil futures markets by incorporating the Shanghai futures contract and modeling the cointegration spread with a hidden Markov model. The results suggest that the Shanghai futures contract offers unique arbitrage opportunities because it adjusts faster to the long-run equilibrium, despite the associated financial risks. Further research could explore alternative investment strategies based on this model and stochastic optimal control under partial information.