Parametric QCD: Optimal Sequential Change Detection
- Parametric quickest change detection is a sequential framework that identifies abrupt shifts in data distributions using known pre-change and unknown post-change parametric models.
- It employs minimax and Bayesian criteria alongside likelihood ratio tests to balance detection delay with controlled false alarm rates.
- Advanced procedures like window-limited CuSum and GLR-CuSum tests achieve asymptotic optimality even under non-stationary and uncertain post-change settings.
Parametric quickest change detection (QCD) is a sequential inference framework designed to optimally detect abrupt changes in the probability distribution of a data sequence, where both the pre-change and post-change distributions belong to parametric families. The model assumes observations are initially generated according to a known (stationary) pre-change distribution; after an unknown change point, the distribution switches to a member of a post-change family, potentially with parametric or structural uncertainty, and possibly non-stationarity over time. Rigorous minimax and Bayesian criteria under false-alarm constraints are used to measure the performance of decision rules, and advanced likelihood-based or information-theoretic procedures are employed to achieve asymptotically optimal detection delay. This article provides a comprehensive account of the mathematical structure, universal performance bounds, principal QCD procedures, key extensions, and current research frontiers in the parametric regime, with an emphasis on recent developments for non-stationary and uncertain post-change settings.
1. Mathematical Framework and Performance Criteria
Let $X_1, X_2, \dots$ be independent observations. There is an (unknown) change point $\nu \ge 1$. Pre-change, observations follow a stationary law $p_0$. Post-change, at times $n \ge \nu$, observations are distributed as $p^{(\theta)}_{n,\nu}$, where $\theta \in \Theta$ is an unknown parameter and the post-change law may be non-stationary (i.e., $p^{(\theta)}_{n,\nu}$ may depend on both the observation index $n$ and the time of change $\nu$) (Liang et al., 2021).
Key quantities:
- Log-likelihood ratio (LLR): For hypothesized change point $k$,
$$Z_n(k, \theta) = \sum_{i=k}^{n} \log \frac{p^{(\theta)}_{i,k}(X_i)}{p_0(X_i)}.$$
- Cumulative Kullback–Leibler divergence growth: For post-change parameter $\theta$ and true change at $\nu$,
$$I_\nu(\theta, n) = \sum_{i=\nu}^{n} D\!\left(p^{(\theta)}_{i,\nu} \,\middle\|\, p_0\right),$$
with the crucial assumption that $I_\nu(\theta, n) \to \infty$ as $n \to \infty$ for every $\nu$ (uniformly in $\theta \in \Theta$) and the growth is super-logarithmic in the sense that $I_\nu(\theta, n)/\log n \to \infty$ as $n \to \infty$.
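As a concrete illustration of the growth condition, consider a hypothetical non-stationary Gaussian model (our example, not one from the cited papers) in which the post-change mean drifts as $\mu_i = \theta \sqrt{i - \nu + 1}$. Since $D(\mathcal{N}(\mu, \sigma^2) \,\|\, \mathcal{N}(0, \sigma^2)) = \mu^2 / (2\sigma^2)$, the KL increments grow linearly and their cumulative sum grows quadratically, which is super-logarithmic:

```python
import numpy as np

def cumulative_kl_gaussian(theta, nu, n, sigma=1.0):
    """Cumulative KL divergence I_nu(theta, n) for a hypothetical
    non-stationary Gaussian model with drifting post-change mean
    mu_i = theta * sqrt(i - nu + 1).  Each increment is
    D(N(mu_i, sigma^2) || N(0, sigma^2)) = mu_i^2 / (2 sigma^2),
    and the sum runs over i = nu, ..., n as in the text."""
    i = np.arange(nu, n + 1)
    mu = theta * np.sqrt(i - nu + 1)
    return float(np.sum(mu ** 2 / (2 * sigma ** 2)))
```

With $\theta = 1$, $\nu = 1$, the cumulative information is $n(n+1)/4$, so $I_\nu(\theta, n)/\log n \to \infty$ as required.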
Performance is measured by minimax delay metrics under false-alarm control:
- False alarm rate (FAR): $\mathrm{FAR}(\tau) = 1/\mathbb{E}_\infty[\tau]$, where $\mathbb{E}_\infty$ denotes expectation under the no-change regime.
- Detection delay:
- Lorden’s worst-case average delay (WADD): $\mathrm{WADD}(\tau) = \sup_{\nu \ge 1} \operatorname{ess\,sup}\, \mathbb{E}_\nu\!\left[(\tau - \nu + 1)^+ \mid X_1, \dots, X_{\nu-1}\right]$.
- Pollak’s SADD (supremum average delay over stopping after change): $\mathrm{SADD}(\tau) = \sup_{\nu \ge 1} \mathbb{E}_\nu[\tau - \nu \mid \tau \ge \nu]$.
2. Universal Lower Bounds and Asymptotic Optimality
A central result in parametric QCD is the existence of a universal first-order lower bound on the achievable detection delay for any stopping rule with a prescribed FAR. Suppose the cumulative KL divergence admits a growth profile: there exist an increasing function $g$ and a minimal per-sample information rate
$$I = \inf_{\theta \in \Theta}\, \liminf_{t \to \infty}\, \frac{1}{g(t)} \sum_{i=\nu}^{\nu + t - 1} D\!\left(p^{(\theta)}_{i,\nu} \,\middle\|\, p_0\right).$$
Then as $\alpha \to 0$, every stopping rule $\tau$ with $\mathrm{FAR}(\tau) \le \alpha$ satisfies
$$\mathrm{WADD}(\tau) \ge g^{-1}\!\left(\frac{|\log \alpha|}{I}\right)(1 + o(1)).$$
If the information grows linearly ($g(t) = t$), this bound specializes to
$$\mathrm{WADD}(\tau) \ge \frac{|\log \alpha|}{I}\,(1 + o(1)).$$
This lower bound is realized (asymptotically in the rare-false-alarm regime) by a family of likelihood-ratio-based procedures, both for known and for unknown post-change parameters (Liang et al., 2021).
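A small numeric sketch makes the bound concrete (the function name is ours; the default assumes linear information growth, and a `g_inverse` callable covers other growth profiles):

```python
import math

def delay_lower_bound(alpha, info_rate, g_inverse=None):
    """First-order universal lower bound on worst-case delay for FAR <= alpha:
    g^{-1}(|log alpha| / I).  With linear information growth (g(t) = t,
    the default), this reduces to |log alpha| / I."""
    arg = abs(math.log(alpha)) / info_rate
    return g_inverse(arg) if g_inverse is not None else arg
```

For example, with $\alpha = 10^{-4}$ and $I = 0.5$ nats per sample, no procedure can achieve worst-case delay below roughly $|\log 10^{-4}| / 0.5 \approx 18.4$ samples; faster (e.g., quadratic) information growth shrinks the bound through $g^{-1}$.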
3. Likelihood-Based Detection Procedures
3.1. Window-Limited CuSum (Known Post-Change)
When post-change densities are fully specified, the window-limited CuSum test operates on the statistic
$$W(n) = \max_{\,n - m_\alpha + 1 \le k \le n\,} \sum_{i=k}^{n} \log \frac{p_{i,k}(X_i)}{p_0(X_i)}.$$
The stopping time is
$$\tau_{\mathrm{WL}} = \inf\{\, n \ge 1 : W(n) \ge b_\alpha \,\},$$
with window length $m_\alpha \to \infty$ growing slower than $1/\alpha$ and threshold $b_\alpha = |\log \alpha|$. For such choices, this procedure achieves first-order asymptotic optimality: $\mathrm{WADD}(\tau_{\mathrm{WL}}) \le g^{-1}(|\log \alpha| / I)(1 + o(1))$ as $\alpha \to 0$ (Liang et al., 2021).
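A minimal sketch of the window-limited scan (hypothetical helper; `llr_fn(x_i, i, k)` supplies the per-sample LLR increment $\log p_{i,k}(x_i)/p_0(x_i)$, with the extra `(i, k)` arguments accommodating non-stationary post-change models):

```python
def window_limited_cusum(x, llr_fn, window, threshold):
    """Window-limited CuSum sketch: at each time n, maximize the cumulative
    LLR over hypothesized change points k within the last `window` samples;
    stop at the first threshold crossing.  Returns the 0-based stopping
    index, or None if no alarm is raised."""
    for n in range(len(x)):
        k_min = max(0, n - window + 1)
        best = float("-inf")
        for k in range(k_min, n + 1):
            s = sum(llr_fn(x[i], i, k) for i in range(k, n + 1))
            best = max(best, s)
        if best >= threshold:
            return n
    return None
```

For a unit-variance Gaussian shift from mean 0 to mean $\mu$, the increment is $\mu x - \mu^2/2$. This brute-force scan costs $O(m_\alpha^2)$ LLR evaluations per sample; exponential-family models admit recursive updates (see Section 4).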
3.2. Window-Limited Generalized Likelihood-Ratio (GLR) CuSum (Unknown Post-Change)
For parametric uncertainty, the window-limited GLR-CuSum test statistic is
$$\widehat{W}(n) = \max_{\,n - m_\alpha + 1 \le k \le n\,}\; \max_{\theta \in \Theta_\alpha}\; \sum_{i=k}^{n} \log \frac{p^{(\theta)}_{i,k}(X_i)}{p_0(X_i)},$$
where $\Theta_\alpha$ is a finite grid over the parameter space, refined as $\alpha \to 0$. The procedure stops at
$$\widehat{\tau} = \inf\{\, n \ge 1 : \widehat{W}(n) \ge b_\alpha \,\}.$$
With grid refinement and regularity (LLR Hessians bounded), setting the threshold to account for the grid cardinality (e.g., $b_\alpha = |\log \alpha| + \log |\Theta_\alpha|$) controls the FAR, and for each $\theta \in \Theta$,
$$\mathrm{SADD}_\theta(\widehat{\tau}) \le g^{-1}\!\left(\frac{|\log \alpha|}{I(\theta)}\right)(1 + o(1)) \quad \text{as } \alpha \to 0.$$
This achieves pointwise first-order optimality relative to the information bound (Liang et al., 2021).
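A minimal GLR-CuSum sketch for a Gaussian mean shift with unknown post-change mean, maximized over a finite grid as described above (function name and grid are ours):

```python
import numpy as np

def glr_cusum(x, window, threshold, theta_grid, sigma=1.0):
    """Window-limited GLR-CuSum sketch: pre-change N(0, sigma^2),
    post-change N(theta, sigma^2) with theta unknown and maximized over
    a finite grid.  Returns the 0-based stopping index, or None."""
    x = np.asarray(x, dtype=float)
    for n in range(len(x)):
        k_min = max(0, n - window + 1)
        best = float("-inf")
        for k in range(k_min, n + 1):
            seg = x[k:n + 1]
            for theta in theta_grid:
                # LLR of N(theta, sigma^2) vs N(0, sigma^2) on segment k..n
                s = float(np.sum(theta * seg / sigma**2
                                 - theta**2 / (2 * sigma**2)))
                best = max(best, s)
        if best >= threshold:
            return n
    return None
```

The inner maximization over `theta_grid` is what distinguishes the GLR variant from the known-parameter test; a finer grid improves the delay at each true $\theta$ at the cost of a larger threshold correction.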
3.3. Windowless CuSum and Classical Settings
In the classical i.i.d. parametric case, the well-known CuSum (Page's test) and Shiryaev procedures are rigorously optimal (under minimax and Bayesian criteria, respectively), with performance obeying
$$\mathrm{WADD}(\tau_{\mathrm{CuSum}}) = \frac{|\log \alpha|}{D(p_1 \,\|\, p_0)}\,(1 + o(1)) \quad \text{as } \alpha \to 0,$$
where $D(p_1 \,\|\, p_0)$ is the Kullback–Leibler divergence between the post-change and pre-change densities (Veeravalli et al., 2012).
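In the i.i.d. case the maximization over change points collapses to Page's recursion $W_n = \max(0, W_{n-1}) + \ell(X_n)$, requiring no window at all; a minimal sketch (hypothetical function name):

```python
def page_cusum(x, llr_fn, threshold):
    """Classical CuSum via Page's recursion W_n = max(0, W_{n-1}) + llr(x_n),
    raising an alarm at the first n with W_n >= threshold.  Equivalent to
    maximizing the cumulative LLR over all past change points, at O(1)
    cost per sample.  Returns the 0-based stopping index, or None."""
    w = 0.0
    for n, xi in enumerate(x):
        w = max(0.0, w) + llr_fn(xi)
        if w >= threshold:
            return n
    return None
```
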
4. Practical Aspects: Implementation and Computational Complexity
Window-limited CuSum and GLR-CuSum methods are designed for computational efficiency and memory tractability. The per-sample cost is dominated by the window size (typically taken slightly larger than the anticipated detection delay) and by the maximization over the parameter grid for unknown post-change parameters. For exponential family or Gaussian models, sufficient statistics enable recursive updating, yielding $O(1)$ cost per sample (Lau et al., 2019). The test stops when the statistic crosses a threshold set as a multiple of $|\log \alpha|$, with the FAR guarantee derived from renewal-theoretic or martingale bounds (e.g., Doob’s inequality) (Veeravalli et al., 2012, Liang et al., 2021).
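For instance, for unit-variance Gaussian data with an unknown mean shift, the segment GLR maximized over $\theta$ has the closed form $(S_n - S_{k-1})^2 / (2(n - k + 1))$ in terms of the running sum $S$, so each new sample requires only an $O(1)$ prefix-sum update plus an $O(\text{window})$ scan (a sketch under those assumptions; names are ours):

```python
def glr_gaussian_prefix(x, window, threshold):
    """Window-limited GLR for an unknown mean shift in unit-variance
    Gaussian data, computed from prefix sums: the theta-maximized segment
    LLR is (S_n - S_{k-1})^2 / (2 (n - k + 1)).  The sufficient statistic
    S updates in O(1); the scan over k is O(window) per sample.
    Returns the 0-based stopping index, or None."""
    prefix = [0.0]  # prefix[j] = x[0] + ... + x[j-1]
    for n, xi in enumerate(x):
        prefix.append(prefix[-1] + xi)
        k_min = max(0, n - window + 1)
        best = max(
            (prefix[n + 1] - prefix[k]) ** 2 / (2 * (n - k + 1))
            for k in range(k_min, n + 1)
        )
        if best >= threshold:
            return n
    return None
```

Note that the squared form is two-sided: it reacts to both positive and negative mean shifts.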
The use of a compact parameter grid and a window size that grows as $\alpha \to 0$ ensures both estimation accuracy and conservative false-alarm control, as the probability of missing the worst-case parameter decays. Window selection involves a tradeoff: too small a window impairs consistency, while too large a window increases computational cost (Liang et al., 2021).
5. Extensions: Non-Stationarity, Dependent Data, and Generalizations
The methodology robustly extends to:
- Non-stationary post-change regimes: Provided the cumulative expected LLR grows sufficiently fast, window-limited GLR-CuSum asymptotic optimality holds under the modified information bound.
- Dependent or non-i.i.d. data: Replace the LLR increments with conditional LLRs, and impose a strong law of large numbers over the normalized sum to maintain the universal bound. With suitable mixing assumptions, the same detection procedures retain first-order and even pointwise optimality (Liang et al., 2021).
- Other regularity conditions: Uniform convergence in probability of the normalized cumulative LLR and vanishing tail fluctuations are required for the validity of lower bounds and control of error probabilities. Very slow KL divergence growth (e.g., logarithmic) is excluded.
- Controlling local probabilities of false alarm (PFA): Local windowing ensures meaningful guarantees under more refined local PFA constraints.
6. Key Applications and Benchmarks
Parametric QCD frameworks described here underpin real-time surveillance and monitoring in diverse domains, including:
- Industrial and mechanical fault detection: (e.g., rotating machinery, critical infrastructure monitoring) (Lau et al., 2019).
- Epidemiological monitoring: (e.g., pandemic detection with shifting distributions over time) (Liang et al., 2021).
- Financial market surveillance: Detection of regime changes from vector-valued time series (e.g., in electricity markets) via parametric QCD-based methods (Hoseinpour et al., 20 Jan 2026).
- Sensor and network security: Including change propagation across hybrid sensors or in anonymous, heterogeneous networks (0812.3742, Sun et al., 2022).
Notably, window-limited GLR-based QCD reduces detection delay by 30–50% in simulation compared to finite-moment methods or naive two-stage approaches for the same ARL to false alarm, and enables earlier actionable response in real systems (Lau et al., 2019, Liang et al., 2021).
7. Current Trends and Ongoing Research
Recent developments address:
- Efficient and near-optimal schemes for both pre- and post-change parametric uncertainty, using online gradient-based estimation and recursively weighted statistics to avoid sliding windows with growing memory and computational overhead (Jarboui et al., 2021).
- First-order optimal procedures for controlled sensing, where action selection at each time adapts to maximize instantaneous KL divergence (the “windowed Chernoff–CuSum” approach) (Veeravalli et al., 2023).
- QCD in high-dimensional settings, with delay characterized by normalized high-dimensional KL divergences, and plug-in estimators (e.g., shrinkage-based covariance estimation) to minimize estimation-induced delay (Malinas et al., 7 Feb 2025).
- Structured scenarios such as nuisance or confusing changes, where advanced multi-statistic CuSum variants (e.g., S-CuSum, J-CuSum) are required to disambiguate critical from secondary or confounding changes (Lau et al., 2019, Chen et al., 2024).
- Data-efficient variants for resource-constrained or streaming environments, often using adaptive thresholding, data-skipping, or exploration-exploitation strategies.
The mathematical theory, especially the universality of information bounds and optimality of likelihood-based sequential tests, continues to inform practical algorithm design and benchmark analysis across statistical quality control, engineering diagnostics, and real-time system monitoring. For detailed proofs and technical development, see (Liang et al., 2021, Veeravalli et al., 2012), and references therein.