Hawkes Process with Power-Law Kernel

Updated 5 February 2026

The Hawkes process with a power-law kernel is defined by an intensity function that decays as a heavy-tailed power law, modeling long-range memory and clustering of events.
The methodology involves precise kernel parametrization using maximum likelihood estimation and nonparametric Wiener–Hopf inversion to accurately capture heavy-tailed influences.
Empirical studies show that power-law kernels effectively replicate persistent event clustering in high-frequency applications, supporting theoretical scaling limits and rough volatility analysis.

A Hawkes process with a power-law kernel is a self-exciting point process in which the influence of past events on the current intensity decays as a heavy-tailed power law. This structure generates long-range dependence in event arrivals and underpins a variety of phenomena in empirical systems, including high-frequency financial markets, seismic activity, and neural spiking. The power-law kernel induces slow memory decay and, near criticality, leads to scaling limits governed by rough (fractional) diffusions.

1. Mathematical Definition and Power Law Kernel Parametrization

A (univariate) linear Hawkes process $N(t)$ is defined by the $\mathcal{F}_{t^-}$-predictable intensity

$\lambda(t) = \mu + \int_0^t \phi(t-s)\,dN(s),$

where $\mu \geq 0$ is the exogenous baseline rate and $\phi\colon \mathbb{R}^+ \to \mathbb{R}^+$ is the excitation kernel. In the power law case,

$\phi(t) = \alpha (t + c)^{-p}$

with parameters $\alpha > 0$ (amplitude), $c > 0$ (short-time regularization, which keeps $\phi$ finite at $t = 0$), and $p > 1$ (decay exponent, which makes the tail integrable). For Mittag-Leffler-type heavy tails, as used in fractional Hawkes processes,

$\phi_\beta(t) = \alpha f_\beta(t),\quad f_\beta(t) = t^{\beta-1} E_{\beta,\beta}(-t^\beta),$

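with $\beta \in (0,1]$, where $E_{\beta,\beta}$ is the two-parameter Mittag-Leffler function (Chen et al., 2020, Habyarimana et al., 2022). For $\beta < 1$ the tail is $\phi_\beta(t) \sim \frac{\alpha \beta}{\Gamma(1-\beta)}\, t^{-(1+\beta)}$ as $t \rightarrow \infty$; for $\beta = 1$ the kernel reduces to the exponential $\alpha e^{-t}$.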
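
As a minimal sketch, the Mittag-Leffler kernel can be evaluated for moderate $t$ from the power series $E_{\beta,\beta}(z) = \sum_{k \geq 0} z^k / \Gamma(\beta k + \beta)$. The function name `mittag_leffler_kernel` and the truncation rule are illustrative assumptions, not taken from the cited papers, and the series loses accuracy for large $t$, where an asymptotic expansion would be needed instead.

```python
import math

def mittag_leffler_kernel(t, beta, alpha=1.0, max_terms=200):
    """phi_beta(t) = alpha * t**(beta-1) * E_{beta,beta}(-t**beta), via the power series.

    Only numerically reliable for moderate t (roughly t <= 5): the alternating
    series suffers from cancellation for larger arguments.
    """
    if t <= 0.0:
        raise ValueError("t must be positive")
    z = -(t ** beta)
    total = 0.0
    for k in range(max_terms):
        arg = beta * k + beta
        if arg > 170.0:              # math.gamma overflows just above 171
            break
        term = z ** k / math.gamma(arg)
        total += term
        if abs(term) < 1e-17 * max(abs(total), 1e-300):
            break
    return alpha * t ** (beta - 1.0) * total

# For beta = 1 the kernel reduces to alpha * exp(-t)
value_at_2 = mittag_leffler_kernel(2.0, beta=1.0)
```

For $\beta = 1$ the series is the exponential series, which gives a quick sanity check of the implementation.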

The $L^1$ norm ("branching ratio") is $\|\phi\|_1 = \alpha c^{1-p}/(p-1)$, which is finite only for $p > 1$ and $c > 0$. Stationarity requires $\|\phi\|_1 < 1$, while $\|\phi\|_1 = 1$ is the critical (boundary) case. For kernels with tail exponent $p \approx 1$, as observed in financial microstructure, integrability is marginal, and empirical kernels exhibit cutoffs (Hardiman et al., 2013, Bacry et al., 2011).
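
The closed-form norm makes it easy to pick an amplitude for a target branching ratio and to see how much kernel mass sits in the tail. The following is a small sketch under the parametrization $\phi(t) = \alpha (t+c)^{-p}$; the helper names and the parameter values are illustrative assumptions.

```python
def power_law_kernel(t, alpha, c, p):
    """phi(t) = alpha * (t + c)**(-p) for t >= 0."""
    return alpha * (t + c) ** (-p)

def branching_ratio(alpha, c, p):
    """Closed-form L1 norm alpha * c**(1-p) / (p-1); requires p > 1 and c > 0."""
    if p <= 1.0 or c <= 0.0:
        raise ValueError("the kernel is integrable only for p > 1 and c > 0")
    return alpha * c ** (1.0 - p) / (p - 1.0)

def amplitude_for_ratio(n_target, c, p):
    """Amplitude alpha that yields a prescribed branching ratio n_target."""
    return n_target * (p - 1.0) / c ** (1.0 - p)

def integrated_kernel(x, alpha, c, p):
    """Phi(x) = integral of phi over [0, x], the kernel mass up to lag x."""
    return alpha * (c ** (1.0 - p) - (x + c) ** (1.0 - p)) / (p - 1.0)

# Illustrative values: branching ratio 0.9 with c = 0.01 and p = 1.3
alpha = amplitude_for_ratio(0.9, c=0.01, p=1.3)
n = branching_ratio(alpha, c=0.01, p=1.3)
# Share of the total kernel mass that lies beyond lag 1000
tail_share = 1.0 - integrated_kernel(1000.0, alpha, 0.01, 1.3) / n
```

With these values, roughly 3% of the kernel mass still lies beyond a lag of 1000, five decades above $c$, which illustrates how slowly the memory fades when $p$ is close to 1.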

2. Statistical Inference and Estimation

Estimation of a power-law Hawkes process kernel poses distinct technical challenges due to the non-Markovian, long-memory nature of the kernel. Two main strategies are prevalent:

Maximum Likelihood Estimation (MLE): The log-likelihood for observed arrival times $\{t_i\}$ is

$\mathcal{L} = \sum_{i=1}^N \log \lambda(t_i) - \int_0^T \lambda(t)\,dt.$

To manage computational cost, the kernel is often approximated by a sum of $M$ exponentials, whose recursive structure reduces the cost of a likelihood evaluation from $O(N^2)$ to $O(NM)$, i.e. linear in the number of events $N$ (Hardiman et al., 2013). Numerical optimization may be carried out with quasi-Newton (BFGS) routines.
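
Both evaluation strategies can be sketched in a few lines. The direct version below uses the closed-form integral of the power-law kernel for the compensator; the recursive version assumes the kernel has already been replaced by a sum of exponentials. Function names and argument conventions are illustrative assumptions, and event times are assumed sorted within $[0, T]$.

```python
import math

def loglik_direct(times, T, mu, phi, Phi):
    """O(N^2) log-likelihood for a kernel phi with antiderivative Phi, Phi(0) = 0."""
    ll = 0.0
    for i, t in enumerate(times):
        lam = mu + sum(phi(t - s) for s in times[:i])   # intensity just before t_i
        ll += math.log(lam)
    compensator = mu * T + sum(Phi(T - s) for s in times)
    return ll - compensator

def power_law_loglik(times, T, mu, alpha, c, p):
    """Direct log-likelihood for phi(t) = alpha * (t + c)**(-p)."""
    phi = lambda u: alpha * (u + c) ** (-p)
    Phi = lambda u: alpha * (c ** (1.0 - p) - (u + c) ** (1.0 - p)) / (p - 1.0)
    return loglik_direct(times, T, mu, phi, Phi)

def loglik_exp_sum(times, T, mu, weights, rates):
    """O(N*M) log-likelihood for phi(t) = sum_k weights[k] * exp(-rates[k] * t)."""
    R = [0.0] * len(rates)   # R[k] = sum over past events s of exp(-rates[k] * (t - s))
    ll, prev = 0.0, None
    for t in times:
        if prev is not None:
            dt = t - prev
            R = [math.exp(-r * dt) * (Rk + 1.0) for r, Rk in zip(rates, R)]
        ll += math.log(mu + sum(w * Rk for w, Rk in zip(weights, R)))
        prev = t
    compensator = mu * T + sum(
        w / r * (1.0 - math.exp(-r * (T - s)))
        for s in times for w, r in zip(weights, rates)
    )
    return ll - compensator
```

Either function can be passed (with a sign flip) to a generic optimizer; for a kernel that is exactly a sum of exponentials, the two evaluations agree up to rounding.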

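Nonparametric Wiener–Hopf Inversion: Empirical second-order statistics are linked to the kernel through an integral equation. Let $g(\tau)$ denote the conditional law, i.e. the density of events at lag $\tau$ given an event at the origin, in excess of the mean rate. With the symmetric extension $g(-\tau) = g(\tau)$ that holds in the univariate case, $g$ satisfies the Wiener–Hopf equation

$g(\tau) = \phi(\tau) + \int_0^\infty \phi(s)\, g(\tau-s)\, ds, \quad \tau > 0,$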

which can be solved by discretization (Nyström or piecewise-linear methods) to extract $\phi$ from observed data (Bacry et al., 2014, Bacry et al., 2014, Bacry et al., 2011). Logarithmic time grids are essential to recover long tails faithfully over multiple decades (Bacry et al., 2014).
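
A minimal Nyström sketch of this inversion is given below. It assumes the half-line form of the equation, $g(\tau) = \phi(\tau) + \int_0^\infty \phi(s) g(\tau - s)\,ds$ for $\tau > 0$ with $g(-\tau) = g(\tau)$, truncates the integral at a hypothetical horizon `t_max`, and uses a uniform trapezoid grid purely for brevity (a logarithmic grid would be the choice for real power-law tails). The conditional law is passed in as a function, whereas in practice it would be estimated from data. The check uses an exponential kernel $\phi(t) = a e^{-bt}$, for which $g(\tau) = \frac{a(2b-a)}{2(b-a)} e^{-(b-a)|\tau|}$ follows from the classical covariance density of the exponential Hawkes process.

```python
import math

def solve_linear(A, b):
    """Gaussian elimination with partial pivoting (pure Python, small systems only)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= f * M[col][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def kernel_from_conditional_law(g, t_max, n_nodes):
    """Solve g(t) = phi(t) + int_0^t_max phi(s) g(|t - s|) ds for phi on a uniform grid."""
    h = t_max / (n_nodes - 1)
    nodes = [k * h for k in range(n_nodes)]
    quad = [h] * n_nodes
    quad[0] = quad[-1] = h / 2.0              # trapezoid weights
    A = [[(1.0 if i == k else 0.0) + quad[k] * g(abs(nodes[i] - nodes[k]))
          for k in range(n_nodes)] for i in range(n_nodes)]
    rhs = [g(t) for t in nodes]
    return nodes, solve_linear(A, rhs)

# Check on an exponential kernel with a = 0.5, b = 1.0 (illustrative values)
a, b = 0.5, 1.0
amp = a * (2.0 * b - a) / (2.0 * (b - a))
g_exact = lambda t: amp * math.exp(-(b - a) * abs(t))
nodes, phi_hat = kernel_from_conditional_law(g_exact, t_max=12.0, n_nodes=121)
max_err = max(abs(ph - a * math.exp(-b * t)) for t, ph in zip(nodes, phi_hat))
```

The recovered values `phi_hat` track the true kernel up to the quadrature error of the grid; refining the grid or switching to a log-spaced quadrature reduces `max_err`.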

Empirical studies in high-frequency finance consistently recover kernel exponents $p \in [1, 1.3]$ for market orders, and the estimates of $p$ and $\alpha$ are stable across sampling parameters and instruments (Bacry et al., 2011, Bacry et al., 2014, Hardiman et al., 2013).

3. Power Law Kernel in Multivariate and Markovian Approximations

In high-dimensional (e.g., multivariate or order-book) models, the intensity of each process is defined as

$\lambda_i(t) = \mu_i + \sum_{j=1}^D \int_0^t \phi_{ij}(t-s) dN_j(s),$

with each $\phi_{ij}(u) = \alpha_{ij}/(u + \varepsilon_{ij})^{\beta_{ij}}$ (Batra, 19 Mar 2025, Bacry et al., 2014).

Practical implementation leverages exponential approximations, $\phi(t) \approx \sum_{k=1}^n w_k e^{-\beta_k t}$, with weights $w_k$ and decay rates $\beta_k$ (not to be confused with the kernel exponents $\beta_{ij}$ above). These enable a Markov embedding in which the Hawkes intensity is described by a finite-dimensional SDE system, with explicit control over the $L^1$-approximation error (Khabou et al., 15 Jul 2025, Kanazawa et al., 2020).
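
One way to obtain such weights and rates, sketched here as an assumption rather than as the construction used in the cited works, is to discretize the Laplace representation $(t+c)^{-p} = \frac{1}{\Gamma(p)} \int_0^\infty x^{p-1} e^{-x(t+c)}\,dx$ with the trapezoid rule on a logarithmic grid of rates. The grid bounds `rate_min`, `rate_max` and the number of terms are hypothetical defaults; they determine the range of lags over which the approximation is accurate, and the sketch does not compute an $L^1$ error bound.

```python
import math

def exp_sum_approximation(alpha, c, p, rate_min=1e-6, rate_max=1e4, n_terms=60):
    """Weights and rates such that alpha*(t+c)**(-p) ~= sum_k w_k * exp(-rate_k * t).

    Trapezoid rule in u = log(x) applied to
    (t+c)**(-p) = (1/Gamma(p)) * int_0^inf x**(p-1) * exp(-x*(t+c)) dx.
    """
    du = math.log(rate_max / rate_min) / (n_terms - 1)
    weights, rates = [], []
    for k in range(n_terms):
        x = rate_min * math.exp(k * du)
        edge = 0.5 if k in (0, n_terms - 1) else 1.0   # trapezoid end weights
        rates.append(x)
        # x**(p-1) dx becomes x**p du after the change of variable x = exp(u)
        weights.append(edge * alpha * du * x ** p * math.exp(-c * x) / math.gamma(p))
    return weights, rates

def exp_sum_kernel(t, weights, rates):
    """Evaluate the exponential-sum approximation at lag t."""
    return sum(w * math.exp(-r * t) for w, r in zip(weights, rates))

weights, rates = exp_sum_approximation(alpha=0.1, c=0.01, p=1.3)
approx_at_1 = exp_sum_kernel(1.0, weights, rates)
exact_at_1 = 0.1 * (1.0 + 0.01) ** (-1.3)
```

Because all weights are positive, the approximating kernel stays non-negative and decreasing, so it remains a valid Hawkes kernel and can be used directly in a recursive likelihood or simulation.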

For simulation, Ogata’s thinning method is extended to account for the power-law memory, occasionally using a mixture of exponentials for efficiency (Chen et al., 2020).
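
A direct (non-accelerated) version of the thinning scheme is sketched below. It relies on the kernel being decreasing, so that the intensity immediately after the current time bounds the intensity until the next event. The function name, the seed handling, and the parameter values are illustrative assumptions, and the cost is quadratic in the number of events because the full history is summed at every step.

```python
import random

def simulate_power_law_hawkes(mu, alpha, c, p, T, seed=0):
    """Ogata thinning on [0, T) for phi(t) = alpha * (t + c)**(-p); requires mu > 0."""
    rng = random.Random(seed)
    events = []

    def intensity(t):
        # Sum over the full history; an event at exactly t contributes phi(0).
        return mu + sum(alpha * (t - s + c) ** (-p) for s in events)

    t = 0.0
    while True:
        # Upper bound: intensity right after time t (the kernel only decays afterwards)
        bound = intensity(t)
        t += rng.expovariate(bound)
        if t >= T:
            break
        # Accept the candidate with probability intensity(t) / bound
        if rng.random() * bound <= intensity(t):
            events.append(t)
    return events

# Illustrative parameters: branching ratio alpha * c**(1-p) / (p-1) of about 0.5
events = simulate_power_law_hawkes(mu=1.0, alpha=0.0377, c=0.01, p=1.3, T=100.0, seed=1)
```

Replacing the history sum with a mixture of exponentials would allow the intensity to be updated recursively, which is the efficiency gain mentioned above.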

4. Criticality, Long Memory, and Scaling Limits

The slow decay of the kernel ($p \approx 1$) yields algebraically decaying autocorrelations and clustering of events. In financial applications, empirical kernels integrate nearly to unity (branching ratio $n = \|\phi\|_1 \approx 1$), indicating near-criticality: event clusters are long-lived, and the system operates at the boundary between stationarity and explosion (Hardiman et al., 2013). Detrended Fluctuation Analysis and covariance methods confirm scaling exponents consistent with theoretical predictions.

Scaling limits of nearly critical heavy-tailed Hawkes processes, with kernels $\phi(t)\sim t^{-(1+\alpha)}$ and $\alpha \in (1/2,1)$ (here $\alpha$ denotes the tail exponent, not the amplitude of Section 1), yield (after rescaling) integrated fractional diffusions:

$Y_t = \int_0^t f^{\alpha,\lambda}(t-s)\,ds + \frac{1}{\sqrt{\mu^* \lambda}} \int_0^t f^{\alpha,\lambda}(t-s)\sqrt{Y_s}\, dW_s,$

where $f^{\alpha,\lambda}$ is the Mittag-Leffler density and the process $Y_t$ is a "fractional CIR" with Hurst parameter $H=\alpha-1/2\in(0,1/2)$, characterizing "rough" volatility (Jaisson et al., 2015, Horst et al., 2023, Xu et al., 23 Apr 2025). This provides a microstructural foundation for rough volatility observed in finance, contrasting with the "classical" (light-tailed) case, which produces ordinary Brownian CIR limits.

5. Empirical Findings and Domain Applications

Empirically, power-law Hawkes kernels capture the endogenous clustering of financial events over six or more decades of time, explaining the persistence in volatility and the scale-free nature of market responses (Bacry et al., 2014, Hardiman et al., 2013). Models with such kernels outperform exponential-kernel Hawkes in replicating key features of limit order books, including cluster size, long memory, and multi-scale event interdependence (Batra, 19 Mar 2025).

In bivariate and multivariate extensions, the kernels’ self- and cross-excitation structure uncovers both persistent clustering and mean-reverting competitive dynamics across asset classes and trade types (Bacry et al., 2014, Batra, 19 Mar 2025). The empirical branching ratio is typically close to unity, with over 95% of events attributed to endogenous activity (Bacry et al., 2014).
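
The link between the branching ratio and the endogenous share of events follows from the stationary mean rate $\Lambda = \mu/(1-n)$: exogenous events arrive at rate $\mu$, so the triggered share is $1 - \mu/\Lambda = n$. A short numerical illustration, with hypothetical values for $\mu$ and $n$:

```python
def mean_rate(mu, n):
    """Stationary mean intensity Lambda = mu / (1 - n), valid for 0 <= n < 1."""
    if not 0.0 <= n < 1.0:
        raise ValueError("a stationary mean rate requires 0 <= n < 1")
    return mu / (1.0 - n)

def endogenous_fraction(mu, rate):
    """Share of events triggered by earlier events: 1 - mu / Lambda."""
    return 1.0 - mu / rate

rate = mean_rate(mu=0.5, n=0.95)           # 0.5 / 0.05
share = endogenous_fraction(0.5, rate)      # equals the branching ratio n
```

A branching ratio of 0.95 therefore corresponds to 95% of events being endogenous and to a mean rate twenty times the baseline.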

6. Theoretical Implications: Field Theory and Critical Behavior

Field-theoretic approaches, embedding Hawkes processes with power-law kernels into infinite-dimensional Markovian SPDEs, clarify the emergence of heavy-tailed intensity distributions near criticality (Kanazawa et al., 2020). At or near $n=1$, the steady-state intensity PDF displays power-law scaling $\sim \lambda^{-1 + 2\nu_0\alpha}$ up to a critical cutoff, which diverges as $n \rightarrow 1$. This regime is identified as an intermediate asymptotic, with exponential truncation away from criticality.

The critical point corresponds to a transcritical bifurcation in the underlying field equations, linking the statistical mechanics of self-exciting point processes and theories of critical phenomena.

7. Implementation Considerations and Limitations

Key practical insights include:

For kernels with $p \leq 1$, the tail is not integrable; a long-time cutoff (truncation of the tail) is necessary, in addition to the short-time regularization $c > 0$ (Bacry et al., 2014, Bacry et al., 2014).
Nonparametric estimation requires logarithmic grids to robustly capture the kernel across time scales (Bacry et al., 2014).
Markovian approximations remain tractable via exponential sum representations, with explicit $L^1$ error controls (Khabou et al., 15 Jul 2025).
Multivariate models risk overfitting at small sample sizes, since the number of kernel parameters grows as $D^2$ (Batra, 19 Mar 2025).
Parameter estimation is more complex than for exponential kernels due to non-Markovian memory, demanding careful computational strategies.

Overall, the Hawkes process with a power-law kernel forms an essential modeling paradigm in the analysis of long-memory, self-exciting phenomena, particularly in high-frequency finance, by enabling rigorous microstructural explanations of rough volatility and highly endogenous event clustering (Hardiman et al., 2013, Horst et al., 2023, Bacry et al., 2014, Jaisson et al., 2015).