Stylized Facts Alignment GAN

Updated 26 January 2026

The paper introduces differentiable stylized fact losses—capturing fat tails, volatility clustering, leverage effect, and coarse-to-fine volatility correlation—to enhance synthetic financial time series realism.
It integrates these losses with a WGAN-GP backbone, resulting in generated data that mirror true market dynamics, as validated by extensive backtesting against real Shanghai Composite Index returns.
The approach enables more reliable risk management and trading strategy evaluation by producing synthetic data that closely match statistical and functional properties of real financial markets.

The Stylized Facts Alignment GAN (SFAG) is a generative modeling framework designed to overcome critical limitations in synthetic financial time series creation. Conventional GAN-based approaches, notably the standard GAN and WGAN-GP, often produce data that superficially resemble true market returns but fail under rigorous backtesting, mainly because they neglect structural characteristics such as extreme tails and asymmetric volatility. SFAG addresses these deficiencies by converting four canonical stylized facts (fat tails, volatility clustering, leverage effect, and coarse-to-fine volatility correlation) into differentiable loss terms optimized jointly with a WGAN-GP adversarial loss. The resulting synthetic data mimic real-world market dynamics not only in visual diagnostics but also in trading outcomes, as demonstrated in experiments on Shanghai Composite Index returns spanning 2004–2024 (Zhang et al., 19 Jan 2026).

1. Stylized-Fact Constraints and Differentiable Formulations

SFAG enforces four primary stylized facts observed in financial returns by defining each as a structural loss function. Let real return sequences be $r$ and generated sequences be $\hat r = G_\theta(z)$, with $z \sim \mathcal N(0, I)$.

Fat Tails (GPD Tail Index): Financial returns exhibit heavy tails. SFAG fits a Generalized Pareto Distribution (GPD) to threshold exceedances and penalizes deviations in tail indices:

$\mathcal L_{\rm GPD} = |\xi(r) - \xi(\hat r)|$
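
The source does not say which GPD estimator is used, so the following is only a minimal sketch of one differentiable option. It assumes a method-of-moments shape estimate applied to exceedances of absolute returns over a fixed quantile threshold; the function names, the 95% quantile, and the use of absolute returns are illustrative assumptions, not the authors' implementation.

```python
import torch

def gpd_tail_index(r: torch.Tensor, q: float = 0.95) -> torch.Tensor:
    """Method-of-moments GPD shape estimate for the tail of |r| (assumed estimator)."""
    losses = r.abs().flatten()
    # Threshold at the q-quantile; detached so it acts as a constant in back-propagation.
    u = torch.quantile(losses, q).detach()
    exc = losses[losses > u] - u              # exceedances over the threshold
    mean, var = exc.mean(), exc.var()
    # For a GPD with shape xi < 1/2: mean^2 / var = 1 - 2*xi.
    return 0.5 * (1.0 - mean**2 / (var + 1e-12))

def gpd_loss(real: torch.Tensor, fake: torch.Tensor, q: float = 0.95) -> torch.Tensor:
    """Absolute gap between the real and generated tail indices."""
    return (gpd_tail_index(real, q) - gpd_tail_index(fake, q)).abs()
```

The moment estimator is only valid for $\xi < 1/2$; a maximum-likelihood or Hill-type estimator could be substituted if heavier tails are expected.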

Volatility Clustering (ACF of Squared Returns): Persistent autocorrelations in squared returns are captured by matching lag-$k$ autocorrelations up to a cutoff $K$:

$\mathcal L_{\rm ACF} = \frac{1}{K} \sum_{k=1}^K [\rho_k(r^2) - \rho_k(\hat r^2)]^2$
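
As an illustration, the ACF loss above could be computed along the following lines. This is a sketch under stated assumptions: inputs are shaped batch × T, autocorrelations are averaged over the batch before comparison, and the default `max_lag=20` simply reuses the lag range from the evaluation metrics; none of these choices is confirmed by the source.

```python
import torch

def acf(x: torch.Tensor, max_lag: int) -> torch.Tensor:
    """Sample autocorrelations at lags 1..max_lag for each row of x (batch x T)."""
    x = x - x.mean(dim=1, keepdim=True)
    var = (x * x).mean(dim=1) + 1e-12
    # Lag-k autocovariance, averaged over the T-k available pairs.
    rhos = [(x[:, k:] * x[:, :-k]).mean(dim=1) / var for k in range(1, max_lag + 1)]
    return torch.stack(rhos, dim=1)           # shape: batch x max_lag

def acf_loss(real: torch.Tensor, fake: torch.Tensor, max_lag: int = 20) -> torch.Tensor:
    """Mean squared gap between the ACFs of squared real and generated returns."""
    rho_real = acf(real**2, max_lag).mean(dim=0)
    rho_fake = acf(fake**2, max_lag).mean(dim=0)
    return ((rho_real - rho_fake) ** 2).mean()
```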

Leverage Effect (Return–Volatility Asymmetry): Negative returns typically predict higher future volatility. SFAG matches the Pearson correlation between past returns and subsequent realized volatility:

$\mathcal L_{\rm Lev} = |\rho(r_t, \sigma_{t+1}) - \rho(\hat r_t, \hat \sigma_{t+1})|$
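
The source does not define how $\sigma_{t+1}$ is measured. The sketch below assumes it is the root-mean-square return over the next `window` days (a hypothetical choice, as are the function names) and computes the Pearson correlation with $r_t$ in a differentiable way.

```python
import torch

def pearson(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Pearson correlation between two 1-D tensors."""
    a = a - a.mean()
    b = b - b.mean()
    return (a * b).sum() / (a.norm() * b.norm() + 1e-12)

def leverage_corr(r: torch.Tensor, window: int = 5) -> torch.Tensor:
    """Correlation between r_t and realized volatility over days t+1..t+window."""
    # r has shape batch x T; unfold yields every length-`window` slice of future returns.
    future = r[:, 1:].unfold(1, window, 1)    # batch x (T - window) x window
    sigma_next = future.pow(2).mean(dim=2).add(1e-12).sqrt()
    r_t = r[:, : sigma_next.shape[1]]         # returns aligned with their future volatility
    return pearson(r_t.flatten(), sigma_next.flatten())

def leverage_loss(real: torch.Tensor, fake: torch.Tensor, window: int = 5) -> torch.Tensor:
    """Absolute gap between real and generated leverage correlations."""
    return (leverage_corr(real, window) - leverage_corr(fake, window)).abs()
```

A negative value of `leverage_corr` corresponds to the leverage effect described above: falling prices are followed by higher volatility.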

Coarse-to-Fine Volatility Correlation (CFVC): Realized volatilities measured at different time scales are consistently correlated with one another. Let $\Sigma(r)$ assemble realized volatilities over multiple windows; the loss penalizes differences in the resulting correlation matrices:

$\mathcal L_{\rm CFVC} = \| \mathrm{Corr}(\Sigma(r)) - \mathrm{Corr}(\Sigma(\hat r)) \|_F$
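
One possible reading of $\Sigma(r)$ is a stack of rolling realized volatilities at several window lengths. The sketch below follows that reading; the window lengths (5, 20, 60 days), the end-of-window alignment, and the function names are assumptions made for illustration and are not taken from the source.

```python
import torch

def multiscale_vol(r: torch.Tensor, windows=(5, 20, 60)) -> torch.Tensor:
    """Stack rolling realized volatilities; rows are scales, columns are aligned dates."""
    n = r.shape[-1] - max(windows) + 1        # number of dates available at every scale
    vols = []
    for w in windows:
        v = r.unfold(-1, w, 1).pow(2).mean(dim=-1).add(1e-12).sqrt()
        vols.append(v[..., -n:].flatten())    # keep windows ending on the same dates
    return torch.stack(vols)                  # shape: len(windows) x (batch * n)

def cfvc_loss(real: torch.Tensor, fake: torch.Tensor, windows=(5, 20, 60)) -> torch.Tensor:
    """Frobenius norm of the gap between cross-scale volatility correlation matrices."""
    c_real = torch.corrcoef(multiscale_vol(real, windows))
    c_fake = torch.corrcoef(multiscale_vol(fake, windows))
    return torch.linalg.norm(c_real - c_fake, ord="fro")
```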

These losses are all fully differentiable, enabling joint optimization via back-propagation through the generative network.

2. Model Architecture

SFAG employs a WGAN-GP backbone (gradient-penalty weight $\lambda_{\rm gp} = 10$) with structurally standard generator and discriminator modules, augmented for stylized-fact alignment:

Generator $(G_\theta)$:
- Input: $z \in \mathbb R^{100}$, sampled from $\mathcal N(0, I)$.
- Output: synthetic return series $\hat r \in \mathbb R^{T}$, with $T = 2520$.
- Structure: flexible time-series mapping (temporal CNN or transformer; 1D CNN stack in reported experiments).

Discriminator $(D_\phi)$:
- Input: sequence of length $T$.
- Output: scalar realness score.
- Structure: temporal CNN mirroring the generator, or an MLP.

The distinguishing feature is the augmentation of the generator's objective with multiple differentiable stylized-fact losses; the adversarial backbone remains unchanged.
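
To make the input and output shapes concrete, here is a minimal 1D-CNN sketch of the two modules. Only the latent dimension (100) and the sequence length (2520) come from the text; the layer counts, channel widths, kernel sizes, and activations are hypothetical choices, and the class names `Generator` and `Critic` are illustrative.

```python
import torch
from torch import nn

T, Z_DIM = 2520, 100          # sequence length and latent dimension from the text
BASE_LEN, CH = 315, 64        # assumed: 315 * 2**3 = 2520 after three upsampling layers

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(Z_DIM, CH * BASE_LEN)
        self.net = nn.Sequential(  # each transposed conv doubles the length
            nn.ConvTranspose1d(CH, CH // 2, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.ConvTranspose1d(CH // 2, CH // 4, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.ConvTranspose1d(CH // 4, 1, 4, stride=2, padding=1),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        h = self.fc(z).view(-1, CH, BASE_LEN)
        return self.net(h).squeeze(1)          # batch x T

class Critic(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(  # mirror of the generator; each conv halves the length
            nn.Conv1d(1, CH // 4, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv1d(CH // 4, CH // 2, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv1d(CH // 2, CH, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        )
        self.fc = nn.Linear(CH * BASE_LEN, 1)

    def forward(self, r: torch.Tensor) -> torch.Tensor:
        h = self.net(r.unsqueeze(1))
        return self.fc(h.flatten(1)).squeeze(1)  # one scalar score per sequence
```

The critic omits batch normalization, which is the usual practice with a per-sample gradient penalty.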

3. Joint Loss and Training Protocol

The total loss for SFAG’s generator combines the WGAN-GP adversarial loss and a weighted sum of stylized-fact losses:

$\mathcal L_{\rm total} = \mathcal L_{\rm adv} + \lambda_1 \mathcal L_{\rm GPD} + \lambda_2 \mathcal L_{\rm ACF} + \lambda_3 \mathcal L_{\rm Lev} + \lambda_4 \mathcal L_{\rm CFVC}$

where

$\mathcal L_{\rm adv} = \mathbb{E}_z[D_\phi(\hat r)] - \mathbb{E}_r[D_\phi(r)] + \lambda_{\rm gp} \mathcal L_{\rm gp}$

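and $\lambda_{\rm gp} = 10$. As written, $\mathcal L_{\rm adv}$ is the critic-side objective that the discriminator minimizes; in the standard WGAN-GP setup the generator instead minimizes $-\mathbb{E}_z[D_\phi(\hat r)]$, the quantity denoted Ladv_G in the pseudocode. The hyperparameters $\{\lambda_i\}$ are ramped linearly from 0 to full value over the initial 20% (10,000 of 50,000 iterations) of training to stabilize optimization.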
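
The linear ramp and the weighted sum can be written compactly. The sketch below is illustrative: the helper names are hypothetical, and the source does not report the final values of $\lambda_1, \dots, \lambda_4$, so they are passed in as arguments.

```python
def ramp(step: int, ramp_steps: int = 10_000) -> float:
    """Linear warm-up factor: 0 at step 0, 1 from ramp_steps onward."""
    return min(1.0, step / ramp_steps)

def generator_loss(adv, sf_losses, lambdas, step, ramp_steps=10_000):
    """Adversarial loss plus ramped, weighted stylized-fact losses.

    sf_losses and lambdas are sequences ordered as (GPD, ACF, Lev, CFVC);
    entries may be floats or tensors.
    """
    s = ramp(step, ramp_steps)
    return adv + s * sum(lam * loss for lam, loss in zip(lambdas, sf_losses))
```

For example, halfway through the warm-up (step 5,000) each stylized-fact term contributes at half its final weight.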

Training pseudocode summary:

Initialize θ, φ
Set Adam(θ, lr=2e-4, β1=0.5, β2=0.9)
Set Adam(φ, lr=2e-4, β1=0.5, β2=0.9)
for iteration = 1 to 50000 do
  for t = 1 to 5 do # Discriminator update
    Sample real batch {r} of size 24, z ∼ N(0,I)
    r̂ = Gθ(z)
    Lgp = gradient-penalty(Dφ, r, r̂)
    Ladv_D = mean(Dφ(r̂)) − mean(Dφ(r)) + λgp Lgp
    φ ← Adam step on ∇φ Ladv_D
  end for
  # Generator update
  Sample z, r̂ = Gθ(z)
  s = min(1, iteration / 10000) # linear ramp of stylized-fact weights
  Compute losses Ladv_G, L_GPD, L_ACF, L_Lev, L_CFVC
  Ltotal = Ladv_G + s (λ1 L_GPD + λ2 L_ACF + λ3 L_Lev + λ4 L_CFVC)
  θ ← Adam step on ∇θ Ltotal
end for

Stylized-fact weights ramp up in the first 10,000 iterations.
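
The `gradient-penalty` step in the pseudocode is not spelled out in the source. A sketch of the standard WGAN-GP penalty is given below, assuming sequences shaped batch × T and a critic that returns one score per sequence; the function name and signature are illustrative.

```python
import torch

def gradient_penalty(critic, real: torch.Tensor, fake: torch.Tensor) -> torch.Tensor:
    """Standard WGAN-GP penalty: mean of (||grad_x D(x_hat)||_2 - 1)^2."""
    # Random interpolation between each real and generated sequence.
    eps = torch.rand(real.size(0), 1, device=real.device)
    x_hat = (eps * real + (1 - eps) * fake.detach()).requires_grad_(True)
    score = critic(x_hat)
    # create_graph=True keeps the penalty differentiable w.r.t. the critic parameters.
    (grad,) = torch.autograd.grad(score.sum(), x_hat, create_graph=True)
    return ((grad.flatten(1).norm(2, dim=1) - 1) ** 2).mean()
```

In the discriminator update this value would be multiplied by $\lambda_{\rm gp} = 10$ and added to the Wasserstein term.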

4. Experimental Setup and Evaluation

Experiments utilized daily close-price log-returns from the Shanghai Composite Index (2004–2024; approx. 5,000 data points). Key settings:

- Sequence length: $T = 2520$ days (about 10 years)
- Batch size: 24
- Latent dimension: 100
- Implementation: PyTorch, NVIDIA A100 GPU

Comparative baselines:

- Standard GAN (JS-divergence adversarial loss)
- WGAN-GP ($\lambda_{\rm gp} = 10$)

Evaluation metrics:

- Stylized-fact gaps: absolute error in tail index (GPD), ACF (lags 1–20), leverage correlation, CFVC matrix.
- Backtest: 60-day momentum strategy (long if the past 60-day return is positive, else short; 5 bps transaction cost), measuring annualized return, volatility, Sharpe ratio, max drawdown, VaR (95%), CVaR (95%).
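
The backtest rule can be sketched in a few lines of NumPy. The conventions here are assumptions, since the source does not state them: 252 trading days per year, arithmetic annualization of daily P&L, costs charged per unit of position change, and VaR/CVaR computed on daily P&L. The function name and dictionary keys are illustrative.

```python
import numpy as np

def momentum_backtest(log_ret, lookback=60, cost_bps=5.0, periods=252):
    """Sign-of-past-return momentum strategy with proportional trading costs."""
    r = np.asarray(log_ret, dtype=float)
    csum = np.concatenate([[0.0], np.cumsum(r)])
    # Trailing `lookback`-day log return, known before each trading day.
    past = csum[lookback:-1] - csum[: -lookback - 1]
    pos = np.where(past > 0, 1.0, -1.0)             # long if positive, else short
    turnover = np.abs(np.diff(pos, prepend=0.0))    # position change, incl. initial entry
    pnl = pos * r[lookback:] - turnover * cost_bps * 1e-4

    wealth = np.exp(np.concatenate([[0.0], np.cumsum(pnl)]))
    drawdown = 1.0 - wealth / np.maximum.accumulate(wealth)
    var95 = np.percentile(pnl, 5)                   # 5th percentile of daily P&L
    ann_ret = pnl.mean() * periods
    ann_vol = pnl.std(ddof=1) * np.sqrt(periods)
    return {
        "ann_return": ann_ret,
        "ann_vol": ann_vol,
        "sharpe": ann_ret / ann_vol,                # zero risk-free rate assumed
        "max_drawdown": drawdown.max(),
        "var_95": var95,
        "cvar_95": pnl[pnl <= var95].mean(),        # mean P&L in the worst 5% of days
    }
```

Running the same function on a real return series and on each generated path gives directly comparable metrics.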

5. Empirical Results

Stylized-Fact Alignment

SFAG demonstrates superior performance in stylized-fact preservation. Average absolute gaps (across five runs):

Model	GPD Tail	ACF	Leverage	CFVC
Standard GAN	0.2615	0.1431	32.4617	0.0863
WGAN-GP	0.0776	0.1053	33.7440	0.1021
SFAG	0.0146	0.0982	32.7516	0.0436

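SFAG reduces the GPD tail index gap by about 81% versus WGAN-GP (0.0776 to 0.0146) and the CFVC error by about 57% (0.1021 to 0.0436). The ACF gap improves more modestly, by roughly 7% relative to WGAN-GP. The leverage gap is about 3% lower than WGAN-GP's but marginally higher than the Standard GAN's (32.75 versus 32.46), so the gains are concentrated in tail behavior and multi-scale volatility structure rather than in asymmetric dynamics.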
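
The relative reductions can be recomputed directly from the table. The snippet below is a small arithmetic check; the dictionary layout and function name are illustrative.

```python
# Average absolute gaps copied from the table above.
gaps = {
    "Standard GAN": {"GPD": 0.2615, "ACF": 0.1431, "Lev": 32.4617, "CFVC": 0.0863},
    "WGAN-GP":      {"GPD": 0.0776, "ACF": 0.1053, "Lev": 33.7440, "CFVC": 0.1021},
    "SFAG":         {"GPD": 0.0146, "ACF": 0.0982, "Lev": 32.7516, "CFVC": 0.0436},
}

def relative_reduction(baseline: float, model: float) -> float:
    """Fractional reduction of a gap relative to a baseline (positive means improvement)."""
    return 1.0 - model / baseline

for metric in ("GPD", "ACF", "Lev", "CFVC"):
    red = relative_reduction(gaps["WGAN-GP"][metric], gaps["SFAG"][metric])
    print(f"{metric}: {100 * red:.1f}% lower than WGAN-GP")
```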

Momentum Strategy Backtest

Backtest results (average across ten generated paths):

Metric	Real Data	Standard GAN	WGAN-GP	SFAG
Annualized Return	33.10 %	2467.24 %	2152.07 %	27.80 %
Annualized Volatility	15.20 %	991.83 %	995.06 %	9.37 %
Sharpe Ratio	2.18	2.49	2.16	2.97
Maximum Drawdown	9.50 %	109.87 %	148.11 %	4.37 %
VaR (95%)	–1.10 %	–78.03 %	–85.79 %	–0.91 %
CVaR (95%)	–2.23 %	–141.79 %	–144.62 %	–0.92 %

Standard GAN and WGAN-GP experience “collapse”: their backtests yield annualized returns above $2000\%$ and volatilities near $1000\%$, with drawdowns exceeding $100\%$ and correspondingly extreme VaR and CVaR. SFAG's synthetic data yield backtest performance (return: $27.80\%$, volatility: $9.37\%$) on the same scale as the real data ($33.10\%$ and $15.20\%$), although SFAG understates volatility and tail risk, which raises its Sharpe ratio to 2.97 against 2.18 for real data.
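
A quick consistency check, under the assumption that the reported Sharpe ratio is annualized return divided by annualized volatility with a zero risk-free rate: this reproduces every Sharpe value in the table to two decimals, which also shows why the collapsed baselines can still report ordinary-looking Sharpe ratios.

```python
# (annualized return %, annualized volatility %, reported Sharpe) from the table above.
table = {
    "Real Data":    (33.10, 15.20, 2.18),
    "Standard GAN": (2467.24, 991.83, 2.49),
    "WGAN-GP":      (2152.07, 995.06, 2.16),
    "SFAG":         (27.80, 9.37, 2.97),
}

def sharpe(ann_return: float, ann_vol: float) -> float:
    """Return-to-volatility ratio, assuming a zero risk-free rate."""
    return ann_return / ann_vol

for name, (ret, vol, reported) in table.items():
    print(f"{name}: computed {sharpe(ret, vol):.2f}, reported {reported:.2f}")
```

Because the ratio cancels the inflated scale of both numerator and denominator, the Sharpe ratio alone cannot flag collapse; the return, volatility, and drawdown rows are the informative ones.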

6. Significance and Extensions

SFAG provides evidence that embedding domain-specific stylized facts as differentiable constraints helps move financial synthetic data from superficial realism (e.g., visual similarity) to functional usability. In the reported experiments, its multi-constraint structure lets generated series pass both statistical diagnostics and trading backtests, unlike the baseline GAN frameworks, which focus solely on distributional matching.

Potential extensions and research directions:

- Adaptation to multi-asset portfolios and cross-market series (foreign exchange, commodities).
- Inclusion of further stylized facts (tail asymmetry, volatility-of-volatility, regime persistence).
- Application of alignment losses within diffusion or transformer-based networks for improved long-horizon fidelity and modeling capacity.

7. Implications and Outlook

SFAG exemplifies a shift toward structure-preserving realism in financial generative modeling, which is requisite for synthetic data to have practical utility in risk management and algorithmic trading. Aligning generative objectives with market-specific constraints such as tail properties and multi-scale volatility ensures both visual and functional relevance. A plausible implication is that structure-aware objective functions might be foundational for synthetic financial modeling beyond GANs, including in stochastic diffusion, autoregressive, or transformer-based frameworks (Zhang et al., 19 Jan 2026).

References (1)

Beyond Visual Realism: Toward Reliable Financial Time Series Generation (2026)
