
Online Monotone Density Estimation

Updated 10 February 2026
  • Online monotone density estimation is a framework for sequentially predicting a nonincreasing probability density on $[0,1]$ under the log-loss.
  • The online Grenander estimator adapts the classical isotonic MLE to the sequential setting, achieving $O(n^{1/3})$ cumulative excess KL risk under well-specified i.i.d. conditions.
  • The expert aggregation approach discretizes the density space and applies exponential weighting to secure an $O(\sqrt{n \log n})$ adversarial regret bound while adapting quickly to change-points.

Online monotone density estimation addresses the sequential prediction of an unknown probability density that is known a priori to be monotone nonincreasing on $[0,1]$. At each time $t$, the estimator observes a real-valued data stream $x_1, x_2, \ldots$ and outputs a measurable function $\hat{f}_t$ of the past data, so that for every $t$, $\hat{f}_t$ estimates the underlying density $q \in \mathcal{D}$, where

$$\mathcal{D} = \left\{ f : [0,1] \to [0, \infty) \,\Big|\, \int_0^1 f(u)\,du = 1,\ f\ \text{nonincreasing} \right\}.$$

Performance is measured by the sequential log-loss $\ell_t(f) = -\log f(X_t)$ and its cumulative version $L(\hat{f}, n) = \sum_{t=1}^n \ell_t(\hat{f}_t) = -\sum_{t=1}^n \log \hat{f}_t(X_t)$. The theoretical framework considers two benchmarks: the well-specified stochastic setting (oracle risk under i.i.d. draws from $q$) and an adversarial setting (pathwise regret against the best monotone density in hindsight, subject to amplitude bounds).
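As a concrete illustration (a minimal sketch, not taken from the paper; the function name and example densities are ours), the cumulative log-loss of a predictable sequence of density estimates is just a sum of negative log-densities:

```python
import math

def cumulative_log_loss(estimates, xs):
    """L(f_hat, n) = -sum_t log f_hat_t(x_t).

    estimates[t] is the density fitted on xs[:t] only
    (predictable: it never sees xs[t]); xs are the observations.
    """
    return -sum(math.log(f(x)) for f, x in zip(estimates, xs))

# Example: the constant initial guess f_1 = 1 incurs zero log-loss;
# a decreasing density is rewarded on small x, penalized on large x.
uniform = lambda x: 1.0
steep = lambda x: 2.0 - 2.0 * x  # triangular monotone density on [0, 1]
xs = [0.1, 0.4, 0.7]
loss_uniform = cumulative_log_loss([uniform] * 3, xs)
loss_steep = cumulative_log_loss([steep] * 3, xs)
```

Lower cumulative log-loss means the predicted densities put more mass where the observations actually fell.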

1. Formal Problem Statement

The data consist of sequentially observed values $X_1, X_2, \ldots$ assumed to lie in $[0,1]$ (after linear rescaling). At each $t$, the algorithm outputs an estimator $\hat{f}_t$ based only on $X_1, \ldots, X_{t-1}$. The performance of a sequence of estimators $\{\hat{f}_t\}$ is evaluated in two principal regimes:

  • Stochastic benchmark: $X_1, \ldots, X_n$ are i.i.d. from some $q \in \mathcal{D}$. The cumulative excess Kullback-Leibler risk is

$$\mathrm{Risk}_Q(\hat{f}; n) = \mathbb{E}_Q\left[L(\hat{f}, n)\right] - \mathbb{E}_Q\left[L(q, n)\right] = \sum_{t=1}^n \mathbb{E}_Q\left[\mathrm{KL}(q \| \hat{f}_t)\right].$$

  • Adversarial benchmark (pathwise regret): For arbitrary (well-spaced) sequences, regret is measured against the best monotone density in hindsight within amplitude bounds $a \leq f \leq b$:

$$\mathrm{Regret}(\hat{f}; n, a, b) = L(\hat{f}, n) - \min_{f \in \mathcal{D}_{a,b}} L(f, n),$$

where $\mathcal{D}_{a,b} = \{f \in \mathcal{D} : a \leq f \leq b\}$.

The adversarial regime requires minimal regularity on the input: points are well-spaced (no two closer than $n^{-\beta}$) and bounded away from the upper boundary (all $\leq 1 - n^{-\gamma}$).

2. The Online Grenander (OG) Estimator

The classical Grenander estimator is the offline (batch) maximum likelihood estimator (MLE) over $\mathcal{D}$: a piecewise-constant histogram with breakpoints at the order statistics. The online analogue (OG) operates as follows at each $t$:

  • Input: amplitude bounds $0 \leq a \leq b \leq \infty$; initialize $\hat{f}_1 \equiv 1$.
  • For $t = 2, 3, \ldots$: solve

$$\hat{f}_t \leftarrow \underset{f \in \mathcal{D}_{a,b}}{\mathrm{argmax}} \sum_{i=1}^{t-1} \log f(X_i).$$

Theorem 2.1 (Excess KL risk of OG).

If $X_1, \ldots, X_n$ are i.i.d. $\sim Q$ with $q \in \mathcal{D}_{a,b}$, then

$$\mathrm{Risk}_Q(\hat{f}^{OG}_{a,b}; n) \leq \Gamma_{OG}(a, b)\, n^{1/3}.$$

The derivation utilizes classical entropy arguments: the (offline) Grenander estimator achieves $\mathbb{E}\left[\mathrm{KL}(q \| \tilde{f}^{MLE}_{t-1, a, b})\right] = O((t-1)^{-2/3})$; summing over $t$ gives an aggregate $O(n^{1/3})$ cumulative risk.
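The batch MLE that OG re-solves at each round admits the classical least-concave-majorant construction: the estimator is the left derivative of the least concave majorant (LCM) of the empirical CDF. A minimal self-contained sketch (ignoring the amplitude clipping to $[a,b]$, which the online version applies on top) is:

```python
def grenander(xs):
    """Batch Grenander MLE over nonincreasing densities on [0, 1].

    xs: sorted observations in (0, 1]. Returns (breaks, heights):
    the estimated density equals heights[i] on [breaks[i], breaks[i+1]),
    i.e. the left slope of the least concave majorant of the ECDF.
    Amplitude clipping to [a, b] is omitted in this sketch.
    """
    n = len(xs)
    pts = [(0.0, 0.0)] + [(x, (i + 1) / n) for i, x in enumerate(xs)]
    if pts[-1][0] < 1.0:
        pts.append((1.0, 1.0))  # estimator is 0 beyond the largest point
    hull = [pts[0]]
    for p in pts[1:]:
        # pop while slopes fail to strictly decrease, so the kept
        # vertices form the upper (concave) hull of the ECDF points
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (y2 - y1) * (p[0] - x2) <= (p[1] - y2) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append(p)
    breaks = [x for x, _ in hull]
    heights = [(hull[i + 1][1] - hull[i][1]) / (hull[i + 1][0] - hull[i][0])
               for i in range(len(hull) - 1)]
    return breaks, heights
```

The resulting heights are automatically nonincreasing and integrate to 1, so the output lies in $\mathcal{D}$; OG would re-run this (with clipping) on $X_1, \ldots, X_{t-1}$ at every round $t$.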

3. The Expert Aggregation (EA) Estimator

To bypass the computational burden of solving an isotonic MLE at each step, the Expert Aggregation (EA) estimator discretizes $\mathcal{D}_{a,b}$ into a finite net of monotone histograms and employs exponential weighting:

  • Expert set construction:
    • Grid breakpoints $B \approx \{0, \Delta, 2\Delta, \ldots, 1\}$ with $\Delta = n^{-(\beta+1)}$.
    • Log-height grid $\Lambda = \{\log a + j V \Delta : j = 0, 1, \ldots\}$ with $V = \log(b/a)$.
    • $\mathcal{E}_{k, n, \beta}$: all monotone histograms with at most $k$ bins, breakpoints from $B$, heights from $\exp(\Lambda)$, normalized to integrate to 1 and lying in $[a, b]$.
    • Cardinality: $|\mathcal{E}_{k, n, \beta}| \lesssim (n^{\beta+1})^{2k}$.
  • Exponential weighting over the $m = |\mathcal{E}_{k, n, \beta}|$ experts $\{g_j\}$: $w_1(j) = 1/m$, and for $t \geq 1$

$$w_{t+1}(j) \propto w_t(j)\, g_j(X_t), \qquad \hat{f}^{EA}_{t+1}(x) = \sum_{j=1}^m w_{t+1}(j)\, g_j(x).$$

  • Mixability lemma: for the log-loss, $\sum_{t=1}^n \log \hat{f}^{EA}_t(X_t) \geq \max_{j=1,\ldots,m} \sum_{t=1}^n \log g_j(X_t) - \log m$.
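The weighting scheme above can be sketched directly. This is a toy with two hand-made monotone experts standing in for the full net $\mathcal{E}_{k,n,\beta}$; `ea_forecast` and both example densities are illustrative names, not from the paper:

```python
import math

def ea_forecast(experts, xs):
    """Exponentially weighted mixture forecaster under log-loss.

    experts: list of density functions g_j on [0, 1];
    xs: observation sequence. Returns the cumulative log-loss of the
    mixture predictions f_t(x) = sum_j w_t(j) g_j(x).
    """
    m = len(experts)
    w = [1.0 / m] * m              # uniform prior weights w_1(j) = 1/m
    loss = 0.0
    for x in xs:
        g = [gj(x) for gj in experts]
        fx = sum(wj * gx for wj, gx in zip(w, g))
        loss += -math.log(fx)      # mixture's log-loss this round
        w = [wj * gx for wj, gx in zip(w, g)]
        z = sum(w)
        w = [wj / z for wj in w]   # multiplicative update, renormalized
    return loss

experts = [lambda x: 1.0, lambda x: 2.0 - 2.0 * x]  # two monotone densities
xs = [0.05, 0.1, 0.2, 0.15]
mix_loss = ea_forecast(experts, xs)
best = min(-sum(math.log(g(x)) for x in xs) for g in experts)
```

By the mixability lemma, `mix_loss` can exceed the best expert's cumulative log-loss by at most $\log m$, regardless of the data sequence.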

Theorem 2.2 (Pathwise regret of EA).

Under mild regularity (data in $S_n(\beta, \gamma)$), choosing $k \asymp \sqrt{n / \log n}$ for $\mathcal{E}_{k, n, \beta}$ yields

$$\mathrm{Regret}(\hat{f}^{EA}_{a,b}; n, a, b) \leq \Gamma(a, b, \beta) \sqrt{n \log n}.$$

The proof combines histogram compression (bin merging for a $k$-bin approximation, error $O(nV/k)$), breakpoint and height rounding (error $O(nV\Delta + kV)$), the cardinality bound $\log |\mathcal{E}| \lesssim k \log n$, and exponential-weights log-loss mixability. Selecting $k \asymp \sqrt{n / \log n}$ balances the approximation and aggregation terms and produces the displayed regret rate.

4. Log-Optimal Sequential $p$-to-$e$ Calibration

An application of online monotone density estimation appears in sequential hypothesis testing: calibrating $p$-values to $e$-values. Under the null hypothesis, the $p$-values $P_t$ are sequentially super-uniform: $\Pr(P_t \leq u \mid \text{past}) \leq u$. A $p$-to-$e$ calibrator is a nonincreasing function $h : [0,1] \to [0, \infty)$ with $\int_0^1 h(p)\,dp \leq 1$, so that $h(P_t)$ is a valid $e$-value. Admissible calibrators correspond exactly to monotone densities on $[0,1]$.

Under an alternative $Q$ with density $q \in \mathcal{D}$, the log-optimal calibrator is the true density: $h^{opt}(\cdot) = q(\cdot)$. Estimating the log-optimal $p$-to-$e$ calibrator therefore reduces to online monotone density estimation of $q$.

An empirical adaptive procedure: at each $t$, set $\hat{h}_t = \hat{f}_t$ (OG or EA) fitted to the observed $P_1, \ldots, P_{t-1}$, and define the $e$-process $M_t = \prod_{s=1}^t \hat{h}_s(P_s)$.
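The running product is most stably tracked on the log scale. A minimal sketch follows; for self-containedness it uses a classical fixed calibrator $h(p) = \kappa p^{\kappa-1}$ (a nonincreasing density on $(0,1]$ for $\kappa \in (0,1)$) as a stand-in for the paper's data-driven $\hat{h}_t$, and the names `e_process` and `h` are ours:

```python
import math

def e_process(p_values, calibrator):
    """Log-scale trajectory of the e-process M_t = prod_{s<=t} h(P_s).

    calibrator: a nonincreasing density h on (0, 1]; here a fixed
    choice stands in for the predictable, data-driven \hat h_t.
    Returns [log M_1, ..., log M_n].
    """
    log_m, traj = 0.0, []
    for p in p_values:
        log_m += math.log(calibrator(p))  # one multiplicative e-value step
        traj.append(log_m)
    return traj

kappa = 0.5
h = lambda p: kappa * p ** (kappa - 1.0)  # integrates to 1 on [0, 1]
traj = e_process([0.01, 0.04, 0.5], h)
```

Small $p$-values push $\log M_t$ up (evidence against $H_0$), and a level-$\alpha$ sequential test rejects once $M_t \geq 1/\alpha$, by Ville's inequality.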

Theorem 3.1 (Asymptotic log-optimality).

If $P_1, \ldots, P_n$ are i.i.d. $\sim Q$ with $q \in \mathcal{D}_{a,b}$, then for either the OG- or EA-based calibrator,

$$\frac{1}{n}\left[\log M_n(\hat{h}) - \log M_n(h^{opt})\right] \to 0 \quad \text{almost surely}.$$

Corollary (Consistency of the $e$-process).

If $Q \neq \mathrm{Uniform}$ and $\int q \log q > 0$, then $\log M_n(\hat{h})$ grows linearly at rate $\int q \log q > 0$, implying that the sequential test rejects $H_0$ in finite time with probability 1.

5. Empirical Investigation

Numerical assessments are performed in both well-specified and mis-specified regimes:

  • Stochastic setting ($n = 1000$, $B = 50$ replicates):
    • Densities: linear ($q(u) = 5/4 - u/2$), quadratic ($q(u) = 3(1-u)^2$), and a 4-bin piecewise-constant density.
    • Both OG and EA closely track the oracle log-likelihood $-L(q, t)$; EA demonstrates smaller finite-sample regret, particularly when $q$ is milder (e.g., linear).
  • Change-point (mis-specified) setting:
    • Data follow one linear monotone form for $t \leq 200$, then switch to another.
    • EA quickly adapts to the new regime via weight reallocation among experts—incurring substantially lower post-change regret—whereas OG, which relies on the full data history, adapts more slowly.

These experiments corroborate the theoretical $O(n^{1/3})$ excess-risk bound for OG and the $O(\sqrt{n \log n})$ pathwise regret bound for EA. A plausible implication is that expert aggregation offers practical advantages in nonstationary environments.

6. Summary Table: Algorithmic and Statistical Properties

| Estimator | Stochastic (i.i.d.) excess risk | Adversarial regret bound |
| --- | --- | --- |
| Online Grenander | $O(n^{1/3})$ (Theorem 2.1) | None established; recomputes the MLE each round |
| Expert Aggregation | $O(n^{1/3})$ (via expert approximation) | $O(\sqrt{n \log n})$ (Theorem 2.2) |

Both algorithms support online monotone density estimation, but EA provides tighter adversarial guarantees and superior adaptivity in nonstationary or change-point settings.

7. Context and Significance

Online monotone density estimation extends classical nonparametric MLEs to the sequential, potentially adversarial setting, providing both KL-risk guarantees and worst-case regret bounds. Its connection to sequential $p$-to-$e$ calibration highlights the foundational role of monotonicity constraints in adaptive hypothesis testing. The link to exponential weighting and mixture-of-experts methods places the problem at the intersection of shape-constrained inference, online learning, and sequential testing theory, synthesizing classical statistical theory (Grenander estimator, entropy arguments) with modern online learning methods (expert aggregation, mixability) (Hore et al., 9 Feb 2026).
