Marked Hawkes Process: Overview & Applications

Updated 9 February 2026

Marked Hawkes processes are self-exciting point processes where events carry random marks that influence future occurrences.
They use mark-dependent kernels and excitation functions to capture clustering and contagion in diverse fields such as finance, seismology, and epidemiology.
Robust inference is achieved through methods like likelihood maximization, penalized estimation, and nonparametric Bayesian approaches.

A marked Hawkes process is a self-exciting point process in which each event is characterized by both an occurrence time and an associated random mark, and the mark typically influences the conditional intensity of future events. Formally, these processes generalize linear Hawkes models by incorporating mark-impact functions and mark-dependent or mark-modulated kernels, yielding models that capture clustering, heterogeneity, and contagion in the temporal evolution of marked events. They are widely used to model phenomena with both time-based and attribute-based interactions, including financial transactions, earthquake aftershocks, epidemic transmission in stratified populations, and multiplex information diffusion.

1. Mathematical Formulation and Defining Properties

A marked Hawkes process is defined on a probability space, realized as a simple point process where each event at time $T_i$ carries a mark $M_i \in \mathcal{M}$, with $\mathcal{M}$ a measurable mark space (continuous or discrete). The natural filtration $\mathcal{H}_t$ records the entire event-and-mark history up to $t$.

The conditional intensity for observing an event at $(t, m)$, given the past, is of the form

$\lambda^*(t, m | \mathcal{H}_t) = \lambda_g^*(t | \mathcal{H}_t) f^*(m|t, \mathcal{H}_t),$

where

$\lambda_g^*(t | \mathcal{H}_t) = \lambda_0 + \sum_{T_i < t} g(M_i) \mu(t - T_i)$

is the ground (event) intensity, $f^*(m|t, \mathcal{H}_t)$ is the conditional mark density, $g$ is a mark-impact function, and $\mu$ is the temporal excitation (or inhibition) kernel (Laub et al., 2024, Clinet, 2020).
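As a minimal sketch of how the ground intensity above is evaluated, the following Python snippet sums the mark-weighted kernel contributions of all strictly earlier events. The exponential kernel $\mu(s) = \alpha\beta e^{-\beta s}$, the linear impact $g(m) = 1 + m$, and all numerical values are illustrative assumptions, not choices made in the cited papers.

```python
import numpy as np

def ground_intensity(t, times, marks, lambda0, g, mu_kernel):
    """Evaluate lambda_g^*(t | H_t) = lambda0 + sum_{T_i < t} g(M_i) * mu(t - T_i)."""
    times = np.asarray(times, dtype=float)
    marks = np.asarray(marks, dtype=float)
    past = times < t  # strict inequality: an event does not excite itself
    return lambda0 + np.sum(g(marks[past]) * mu_kernel(t - times[past]))

# Assumed, illustrative model components (hypothetical values).
alpha, beta = 0.5, 1.0
g = lambda m: 1.0 + m                                   # linear mark impact
mu_kernel = lambda s: alpha * beta * np.exp(-beta * s)  # exponential decay

times = [0.5, 1.2, 2.0]   # event times T_i
marks = [0.3, 1.5, 0.1]   # marks M_i

lam = ground_intensity(2.5, times, marks, lambda0=0.2, g=g, mu_kernel=mu_kernel)
print(f"ground intensity at t=2.5: {lam:.4f}")
```

Before the first event the function returns the baseline $\lambda_0$, and the intensity jumps by $g(M_i)\,\mu(0)$ immediately after each event.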

In multivariate and/or nonparametric settings, the most general form for the intensity is

$\lambda_i(t) = \mu_i(t) + \sum_{j=1}^d \int_0^{t^-} \phi_{ij}(t-s, M_j(s))\ dN_j(s),$

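where $\mu_i(t)$ is the baseline intensity of component $i$ (distinct from the temporal kernel $\mu$ above), $\phi_{ij}$ is a mark-dependent excitation kernel, $M_j(s)$ is the mark of the component-$j$ event at time $s$, and $N_j$ is the counting process for component $j$ (Clinet, 2020, Bonnet et al., 2024). This formulation covers multivariate models with discrete or continuous marks; nonlinear variants apply a link function to the right-hand side.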

A key distinction is whether the mark distribution $f^*(m|t,\mathcal{H}_t)$ is history-dependent. Often, $f^*(m|t,\mathcal{H}_t) \equiv f(m)$ (marks i.i.d.), but models with endogenous, history-dependent marks also arise in hybrid extensions and state-dependent Hawkes models (Morariu-Patrichi et al., 2017).

2. Mark-Dependent Kernels, Excitation Structures, and Stationarity

The hallmark of the marked Hawkes process is the interaction between marks and intensity. The kernel $\phi_{ij}(t, m)$ can be:

Multiplicative/separable: $\phi_{ij}(t, m) = a_{ij} g_{ij}(m) \psi_{ij}(t)$, typical in applications tracking magnitudes (earthquakes, market orders) (Laub et al., 2024, Lim et al., 2018, Xu et al., 2018).
Generalized exponential/Erlang: finite mixtures of $a_{ij}^{(k)}(m)e^{-b_{ij}^{(k)} t} t^{r_{ij}^{(k)}}/{r_{ij}^{(k)}!}$ to capture flexible delay and mark effects, and to ensure Markovianity for efficient inference (Clinet, 2020, Goda, 2021).
Non-separable/coupled: Model forms where $g$ depends jointly on time lag and mark, $g(t, m)$, as in ETAS-type or nonparametric models (Kim et al., 27 Nov 2025, Joseph et al., 2024).

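Stationarity is typically enforced by keeping the branching ratio (the expected number of direct offspring per event) below one: $\rho=\int_\mathcal{M}\int_0^\infty g(s; m)\ \xi(m)\ ds\ d\mu(m)<1,$ where $\xi(m)$ is the expected number of offspring due to mark $m$ and $d\mu(m)$ denotes integration over the mark space (not the temporal kernel $\mu$ of Section 1). In the separable model of Section 1 with i.i.d. marks, the branching ratio is $\rho = \mathbb{E}[g(M)]\int_0^\infty \mu(s)\,ds$. For multivariate discretizations (e.g., mark histogram, latent type expansion (Davis et al., 2024)), stationarity is controlled by the spectral radius of the integrated kernel matrix, which must be below one.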
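The two stationarity checks can be sketched numerically as follows. The snippet assumes the separable model of Section 1 with i.i.d. marks, for which the branching ratio is $\mathbb{E}[g(M)]\int_0^\infty \mu(s)\,ds$, and a hypothetical $2\times 2$ matrix `K` whose entry `K[i][j]` stands for the mark-averaged integral of $\phi_{ij}$. All numbers are illustrative assumptions.

```python
import numpy as np

def branching_ratio_separable(mean_mark_impact, kernel_integral):
    """rho = E[g(M)] * integral of mu over [0, inf), for separable kernels and i.i.d. marks."""
    return mean_mark_impact * kernel_integral

def spectral_radius(K):
    """Largest eigenvalue modulus of the integrated kernel matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(np.asarray(K, dtype=float)))))

# Univariate example (assumed): g(m) = 1 + m with E[M] = 0.5, so E[g(M)] = 1.5;
# mu(s) = alpha * beta * exp(-beta * s) integrates to alpha = 0.5.
rho = branching_ratio_separable(mean_mark_impact=1.5, kernel_integral=0.5)

# Multivariate example (assumed): mark-averaged integrals of phi_ij.
K = [[0.3, 0.2],
     [0.1, 0.4]]
radius = spectral_radius(K)

is_subcritical = rho < 1.0 and radius < 1.0
print(f"rho = {rho:.2f}, spectral radius = {radius:.2f}, subcritical = {is_subcritical}")
```

In this example both quantities are below one, so both illustrative models would be in the subcritical (stationary) regime.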

In critical regimes ($\rho=1$), scaling limits can be non-Gaussian, notably when marks are heavy-tailed, and their analysis requires heavy-tail techniques (Talarczyk, 15 Apr 2025).

3. Estimation, Inference, and Model Selection

Parameter estimation for marked Hawkes processes is executed via explicit likelihood maximization, EM-type algorithms, penalized/quasi-likelihood, or Bayesian/nonparametric inference:

Parametric MLE: The log-likelihood for observed $(t_i, m_i)$ is

$\ell(\theta) = \sum_{i=1}^n \log \lambda^*(t_i, m_i | \mathcal{H}_{t_i^-}) - \int_0^T \int_\mathcal{M} \lambda^*(s, m | \mathcal{H}_s)\ dm\ ds.$

For exponential kernels, computations can be reduced to $O(n)$ by recursion, and fully marked MLEs are achieved by quasi-Newton methods (BFGS, L-BFGS) (Xu et al., 2018, Bonnet et al., 2024, Clinet, 2020, Laub et al., 2024). Mark penalties (e.g., ridge) combat overfitting in highly parameterized multivariate models (Brisley et al., 2023).

Sparse/penalized estimation: Quadratic-plus-power penalties are imposed for variable selection in high-dimensional marked Hawkes models, yielding oracle rates and consistent identification of active/inactive parameters (Goda, 2021).
Nonparametric and neural estimation: Universal approximation theorems motivate the use of shallow neural networks that take marks as inputs to estimate the mark-dependent kernel $\phi_{ij}(t, m)$ nonparametrically; this approach is reported to retain interpretability and to outperform unmarked methods (Joseph et al., 2024).
Bayesian and nonparametric Bayesian: Gamma process priors on the excitation kernel enable nonparametric modeling and exact MCMC inference, as in nonparametric earthquake models (Kim et al., 27 Nov 2025), with flexible time–mark excitation and internally resolved mark–dependent branching structure.
Testing and model selection: Available tests include Wald and score-type tests for mark effects and bootstrap-based goodness-of-fit tests on the rescaled time transformation. Model selection proceeds by comparing nested models (Poisson, unmarked Hawkes, marked Hawkes, and nonlinear Hawkes), with the aim of avoiding overfitting (Bonnet et al., 2024).
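The $O(n)$ recursion mentioned under parametric MLE above can be sketched for the ground-process part of the log-likelihood. The snippet assumes a separable exponential kernel $\mu(s) = \alpha\beta e^{-\beta s}$ and i.i.d. marks, so the mark density contributes only an additive term that is omitted here. The function names and parameter values are hypothetical, and the quadratic-time version is included only as a cross-check.

```python
import numpy as np

def loglik_exp_marked(times, marks, T, lambda0, alpha, beta, g):
    """Ground log-likelihood in O(n), using
    A_i = exp(-beta * (t_i - t_{i-1})) * (A_{i-1} + g(m_{i-1})), A_1 = 0,
    so that lambda(t_i) = lambda0 + alpha * beta * A_i."""
    times = np.asarray(times, dtype=float)
    gm = g(np.asarray(marks, dtype=float))
    A, log_sum = 0.0, 0.0
    for i, t in enumerate(times):
        if i > 0:
            A = np.exp(-beta * (t - times[i - 1])) * (A + gm[i - 1])
        log_sum += np.log(lambda0 + alpha * beta * A)
    # Closed-form integral of the intensity over [0, T].
    compensator = lambda0 * T + alpha * np.sum(gm * (1.0 - np.exp(-beta * (T - times))))
    return log_sum - compensator

def loglik_bruteforce(times, marks, T, lambda0, alpha, beta, g):
    """Same quantity in O(n^2), summing over all earlier events directly."""
    times = np.asarray(times, dtype=float)
    gm = g(np.asarray(marks, dtype=float))
    log_sum = 0.0
    for i, t in enumerate(times):
        lam = lambda0 + np.sum(gm[:i] * alpha * beta * np.exp(-beta * (t - times[:i])))
        log_sum += np.log(lam)
    compensator = lambda0 * T + alpha * np.sum(gm * (1.0 - np.exp(-beta * (T - times))))
    return log_sum - compensator

# Assumed toy data and parameters.
g = lambda m: 1.0 + m
times = [0.4, 0.9, 1.0, 2.7, 3.1]
marks = [0.2, 1.1, 0.0, 0.6, 0.3]
ll_fast = loglik_exp_marked(times, marks, T=4.0, lambda0=0.5, alpha=0.3, beta=1.2, g=g)
ll_slow = loglik_bruteforce(times, marks, T=4.0, lambda0=0.5, alpha=0.3, beta=1.2, g=g)
print(ll_fast, ll_slow)
```

The negative of `loglik_exp_marked` could then be passed to a quasi-Newton optimizer, with positivity constraints on `lambda0`, `alpha`, and `beta`.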

4. Discrete-Time Models, Approximation, and Scaling Limits

Discrete-time marked Hawkes processes are used when event data are binned or only interval counts are available. The discrete model is

$Z_t \mid \mathcal{F}_{t-1} \sim \mathrm{Poi}(\lambda_t),\quad \lambda_t = \nu + \sum_{s=1}^{t-1} \alpha(s) X_{t-s},$

with $X_t = \sum_{j=1}^{Z_t} \ell_{t,j}$ and marks $\ell_{t,j}$ (Wang, 2020, Wang, 2020, Brisley et al., 2023).

Laws of large numbers, central limit theorems, and large and moderate deviation principles are established for both linear and marked models, with the subcriticality condition $\|\alpha\|_1\, \mathbb{E}[\ell] < 1$, where $\|\alpha\|_1 = \sum_{s \ge 1} |\alpha(s)|$, ensuring ergodicity. In multivariate and marked settings, this framework can model cross-excitation and mark-induced clustering in both continuous and discrete time.
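A minimal simulation sketch of the discrete-time recursion above follows. The geometric weights $\alpha(s) = 0.4 \cdot 0.5^{s}$ (so $\|\alpha\|_1 = 0.4$) and exponential marks with mean 1.5 are illustrative assumptions, chosen so that $\|\alpha\|_1\,\mathbb{E}[\ell] = 0.6 < 1$. The function and argument names are hypothetical.

```python
import numpy as np

def simulate_discrete_marked_hawkes(n_steps, nu, alpha, mark_sampler, rng):
    """Simulate Z_t ~ Poi(lambda_t), lambda_t = nu + sum_{s=1}^{t-1} alpha(s) X_{t-s},
    with X_t the sum of the Z_t marks drawn at step t."""
    Z = np.zeros(n_steps + 1, dtype=int)   # index 0 unused so that t runs over 1..n_steps
    X = np.zeros(n_steps + 1)
    lam = np.zeros(n_steps + 1)
    for t in range(1, n_steps + 1):
        s = np.arange(1, t)                # lags 1..t-1 (empty at t = 1)
        lam[t] = nu + np.sum(alpha(s) * X[t - s])
        Z[t] = rng.poisson(lam[t])
        X[t] = mark_sampler(Z[t], rng).sum()
    return Z[1:], X[1:], lam[1:]

# Assumed, illustrative choices.
alpha = lambda s: 0.4 * 0.5 ** s                            # ||alpha||_1 = 0.4
mark_sampler = lambda k, rng: rng.exponential(1.5, size=k)  # E[mark] = 1.5

rng = np.random.default_rng(0)
Z, X, lam = simulate_discrete_marked_hawkes(500, nu=1.0, alpha=alpha,
                                            mark_sampler=mark_sampler, rng=rng)
print("mean count per bin:", Z.mean())
```

Under these assumptions the long-run mean count per bin should be close to $\nu / (1 - \|\alpha\|_1\,\mathbb{E}[\ell]) = 2.5$, up to sampling noise and the initial transient.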

Recent works also provide strong trajectorial convergence results (fractional Sobolev and Skorokhod metrics) quantifying error bounds and rates as discrete approximations converge to their continuous-time limits, with explicit dependence on discretization step and kernel regularity (Coutin et al., 2024).

Scaling limits in the critical regime, especially with heavy-tailed marks, yield stable non-Gaussian fluctuations. In this setting, functional CLTs and random-measure convergence describe both empirical count and pointwise measures, and proofs exploit the Poisson-cluster representation (Talarczyk, 15 Apr 2025).

5. Nonparametric, Hybrid, and State-Dependent Extensions

Marked Hawkes processes are extended in various directions for expressivity and application flexibility:

Nonparametric and flexible marked intensities: Marked Hawkes with non-separable time–mark kernels are representable and arbitrarily well-approximable by high-dimensional multivariate Hawkes processes, with provable $L^1$-approximation and parameter identifiability. This supports histogram-type, flexible, and non-separable mark–time dependency structures (Davis et al., 2024).
State-dependent/hybrid models: Hybrid marked point processes introduce an auxiliary state process $X_t$ (e.g., queue length), yielding intensities that depend on both history and evolving system states. Here, the joint intensity factors as $\lambda_t(e, x) = \phi(x|e, X_{t^-})\, \psi(e | \mathcal{F}_{t^-})$, mediating feedback dynamics between marks/events and endogenous states, rigorously constructed under broad conditions (Morariu-Patrichi et al., 2017).
Hypernetwork (piecewise linear/nonlinear) models: Hyper Hawkes processes expand the dimensionality into a latent space and use hypernetworks (recurrent neural networks) to encode history- and data-adaptive recurrence, retaining conditional linearity for interpretability and efficient leave-one-out probing, while attaining state-of-the-art likelihoods (Boyd et al., 2 Nov 2025).

6. Applied Contexts and Notable Case Studies

Marked Hawkes processes have broad empirical and algorithmic impact in areas requiring fine-grained temporal and attribute-specific modeling:

High-frequency finance: Modeling the clustering of aggressive market orders with bivariate marked Hawkes models captures both order flow seasonality and volume-dependent excitation. Empirical fits reveal that self-excitation dominates (branching ratios $\approx 0.3$–$0.4$) and marks (order volumes) amplify both self and cross terms (Xu et al., 2018). Marked Hawkes volatility models estimate asset and mid-price volatility, with mark-induced clustering tracking realized volatility and market microstructure patterns (Lee et al., 2019).
Earthquake and aftershock modeling: Nonparametric Bayesian approaches estimate magnitude-dependent aftershock triggering densities, exceeding the capabilities of ETAS models and improving forecasting, main-aftershock classification, and probabilistic hazard assessment (Kim et al., 27 Nov 2025, Laub et al., 2024).
Social network and information diffusion: Multivariate marked Hawkes processes infer hidden multiplex (multi-layer) network structures, with marks encoding, for instance, textual or topical features, and event intensity coupled to latent layer structures via topic–susceptibility compatibilities (Suny et al., 2018).
Epidemiology: Latent marked Hawkes models stratified by demographic marks (e.g., age) coupled with contact matrices yield nonparametric, tractable state-space models capturing heterogeneity, instantaneous reproduction numbers, and real-time epidemic inference with efficient particle filtering (Lamprinakou et al., 2022).
Violence and incident monitoring: Marked Hawkes models in discrete time with binary event marks (e.g., alarms vs. ordinary incidents) provide predictive decomposition into routine versus cross-unit contagion, extract interpretable feedback ratios, and optimize intervention strategies in critical environments (Brisley et al., 2023).
Insurance and neuroscience: The marked Hawkes risk process, a compound self-exciting process, models claim arrivals/amplitudes or neural spike trains, and recent work establishes strong convergence guarantees for discrete-time approximations in functional spaces (Coutin et al., 2024).

7. Algorithmic, Statistical, and Theoretical Considerations

Simulation: Ogata-type thinning, superposition-based exact composition, and branching-process simulation (using parent-offspring structure and mark propagation) provide exact and efficient sampling routines, critical for evaluating rare-event or pathwise statistics (Laub et al., 2024, Lim et al., 2018, Kim et al., 27 Nov 2025).
Inference guarantees: Under geometric ergodicity (established for generalized exponential kernels), asymptotic normality of estimators, LAN properties, and convergence of all polynomial moments are established for broad classes of (possibly nonlinear/nonseparable) marked Hawkes models (Clinet, 2020).
Model selection and complexity: Modern procedures balance accuracy and complexity via penalized log-likelihood routines, explicit score and Wald tests, mark–effect testing, and robust goodness-of-fit bootstrap assessments, supporting rigorous empirical model selection (Bonnet et al., 2024, Goda, 2021).
Approximation theory: For arbitrary univariate marked Hawkes models, finite-dimensional multivariate Hawkes representations provide practical and statistically regular means for scalable inference, with $L^1$-consistency and identifiability guaranteed under weak assumptions (Davis et al., 2024).
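The Ogata-type thinning listed under simulation above can be sketched for one simple case. The snippet assumes a separable exponential kernel $\mu(s) = \alpha\beta e^{-\beta s}$, a nonnegative impact function $g$, and i.i.d. marks, so that the intensity only decays between events and its current value is a valid upper bound. The function name and all parameter values are hypothetical.

```python
import numpy as np

def simulate_marked_hawkes_thinning(T, lambda0, alpha, beta, g, mark_sampler, rng):
    """Thinning for lambda(t) = lambda0 + sum_{T_i < t} g(M_i) * alpha * beta * exp(-beta (t - T_i)).
    Requires g >= 0 so that the intensity is non-increasing between events."""
    times, marks = [], []
    t = 0.0
    excitation = 0.0                       # excited part of the intensity just after time t
    while True:
        lam_bar = lambda0 + excitation     # upper bound until the next event
        w = rng.exponential(1.0 / lam_bar) # candidate waiting time
        if t + w >= T:
            break
        t += w
        excitation *= np.exp(-beta * w)    # decay the excitation to the candidate time
        lam_t = lambda0 + excitation       # true intensity at the candidate time
        if rng.uniform() * lam_bar <= lam_t:   # accept with probability lam_t / lam_bar
            m = mark_sampler(rng)
            times.append(t)
            marks.append(m)
            excitation += alpha * beta * g(m)  # jump caused by the new marked event
    return np.array(times), np.array(marks)

# Assumed, illustrative choices: g(m) = m, marks ~ Exp(1), so rho = alpha * E[M] = 0.5.
rng = np.random.default_rng(1)
times, marks = simulate_marked_hawkes_thinning(
    T=200.0, lambda0=1.0, alpha=0.5, beta=2.0,
    g=lambda m: m, mark_sampler=lambda r: r.exponential(1.0), rng=rng)
print("number of events:", len(times))
```

With these assumed parameters the stationary event rate is $\lambda_0 / (1 - \rho) = 2$, so roughly 400 events are expected on $[0, 200]$. History-dependent marks or non-monotone kernels would need a different bound.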

Across these variants, marked Hawkes processes provide a unified and extensible mathematical and statistical framework for interpretable modeling of event-driven systems with attribute-based interactions, clustering, and contagion.