Innovation-Block Differential Entropy

Updated 19 January 2026
  • Innovation-block differential entropy is an extension of classical entropy that quantifies uncertainty in blocks of innovation processes using Doob decompositions and whitened projections.
  • It underpins practical applications in nonlinear filtering, reservoir computing, and compressibility analysis by linking the block structure and innovation capacity to sample complexity.
  • It also guides adaptive rate control and decision-based metrics by leveraging the geometric and statistical properties of innovations in dynamical learning systems.

Innovation-block differential entropy quantifies the information content or uncertainty associated with blocks (finite or infinite sequences) of innovation processes arising in filtered probability spaces, dynamical learning systems, signal processing, and statistical mechanics. It generalizes classical differential entropy to settings where innovations are defined as Doob components orthogonal to past filtrations, with applications ranging from nonlinear filtering and reservoir computing to the compressibility analysis of stochastic processes. The metric reflects not only the inherent randomness of the innovation process but also its block structure, rate dimension, and capacity constraints.

1. Formal Definitions and Doob Innovations

Innovation-block differential entropy is constructed by considering a process $X_t \in \mathbb{R}^d$, its input-generated filtration $\mathscr{F}^{\rm in}_t = \sigma(u_s : s \le t)$, and the one-step Doob decomposition
$$\langle X_t \rangle = E[X_t \mid \mathscr{F}^{\rm in}_t], \qquad \Delta X_t = X_t - \langle X_t \rangle,$$
where $\Delta X_t$ is the innovation, defined as the unpredictable component orthogonal to the history. When the covariance $\Sigma_{XX} = E[X_t X_t^\top]$ is invertible, the innovations are "whitened",
$$\Delta Z_t = \Sigma_{XX}^{+/2} \Delta X_t,$$
and further projected onto a trimmed innovation subspace $\mathcal{U}_\tau$ by an orthonormal projector $P_\tau$, yielding $Y_t = P_\tau \Delta Z_t$. The block of innovations,

$$Y_t^{(b)} = [Y_{t-b+1}^\top, \ldots, Y_t^\top]^\top \in \mathbb{R}^{L_\tau b},$$

serves as the fundamental object whose differential entropy is given by

$$h(Y_t^{(b)}) = - \int_{\mathbb{R}^m} f_{Y^{(b)}}(y) \log f_{Y^{(b)}}(y)\, dy,$$

with $m = L_\tau b$ (Polloreno, 12 Jan 2026).
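
As a concrete illustration, the following minimal Python sketch (not from the cited papers) estimates $h(Y_t^{(b)})$ for a toy linear system under a Gaussian approximation. The two-lag regression predictor, all constants, and the choice $P_\tau = I$ (no trimming, so $L_\tau = d$) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear system: X_t depends on the current and previous input, so
# E[X_t | F^in_t] is (approximately) a linear function of (u_t, u_{t-1}).
T, d = 5000, 3
u = rng.normal(size=(T, d))
X = 0.8 * u + 0.3 * np.roll(u, 1, axis=0) + 0.5 * rng.normal(size=(T, d))

# Proxy for the Doob predictable part: least-squares projection of X_t
# onto (u_t, u_{t-1}); the residual is the innovation Delta X_t.
H = np.hstack([u, np.roll(u, 1, axis=0)])
coef, *_ = np.linalg.lstsq(H, X, rcond=None)
dX = X - H @ coef

# Whitening with the pseudo-inverse square root Sigma_XX^{+/2}.
Sigma = X.T @ X / T
w, V = np.linalg.eigh(Sigma)
w_inv_sqrt = np.where(w > 1e-12, 1.0 / np.sqrt(np.maximum(w, 1e-12)), 0.0)
Z = dX @ (V @ np.diag(w_inv_sqrt) @ V.T)

# Stack b consecutive whitened innovations into blocks Y_t^(b).
b = 4
Yb = np.hstack([Z[i:T - b + 1 + i] for i in range(b)])

# Differential entropy under a Gaussian approximation:
# h = (1/2) * log det(2*pi*e*Cov).
C = np.cov(Yb, rowvar=False)
_, logdet = np.linalg.slogdet(2.0 * np.pi * np.e * C)
print(f"estimated h(Y^(b)) ≈ {0.5 * logdet:.2f} nats (m = {d * b})")
```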

In path-space nonlinear filtering, the innovation process $Z$ associated with an observed signal $U_t = B_t + \int_0^t \dot{u}_s\, ds$ (driven by Brownian motion $B_t$ and drift $\dot{u}_t$) is given by (Ustunel, 2013)

$$Z_t = U_t - \int_0^t \hat{u}_s\, ds, \qquad \hat{u}_s = E_P[\dot{u}_s \mid \mathcal{U}_s].$$

For block entropy on $[0,1]$, the relative entropy of the innovation law with respect to Wiener measure $\mu$ is

$$H(Z(\nu) \,\|\, \mu) = \frac{1}{2} E_\nu \left[ \int_0^1 |\hat{u}_s|^2\, ds \right],$$

which equals the block "kinetic energy" of the best predictable drift over the interval.
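
The kinetic-energy identity can be checked by direct Monte Carlo under simple assumptions; in the sketch below, the predictable drift $\hat{u}_s = \sin(2\pi s)\,\xi$ with $\xi \sim \mathcal{N}(0,1)$ is a hypothetical toy choice whose closed-form value is $1/4$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Monte Carlo check of H(Z(nu) || mu) = (1/2) E[ int_0^1 |u_hat_s|^2 ds ]
# for the hypothetical predictable drift u_hat_s = sin(2*pi*s) * xi,
# xi ~ N(0, 1). Closed form: (1/2) * E[xi^2] * int sin^2 = 1/4.
n_paths, n_steps = 2000, 500
s = (np.arange(n_steps) + 0.5) / n_steps          # midpoint time grid
xi = rng.normal(size=(n_paths, 1))
u_hat = np.sin(2.0 * np.pi * s) * xi               # (n_paths, n_steps)

H_rel = 0.5 * np.mean(np.sum(u_hat**2, axis=1) / n_steps)
print(f"Monte Carlo estimate: {H_rel:.4f}  (exact 0.25)")
```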

2. Block Entropy Rate, Quantization, and Entropy Dimension

For stationary innovation processes in continuous time, the block differential-entropy rate is formalized by quantizing time ($\Delta t$), amplitude ($\delta$), and block length ($T$) (Ghourchian et al., 2017):
$$X_i^{(\Delta t)} = \int_{(i-1)\Delta t}^{i\Delta t} X(t)\, dt, \qquad X_{i;\delta}^{(\Delta t)} = \delta \left\lfloor \frac{X_i^{(\Delta t)}}{\delta} + \frac{1}{2} \right\rfloor,$$
and the block entropy

$$H_{T,\Delta t, \delta}(X) = H\left[X_{1;\delta}^{(\Delta t)}, \ldots, X_{N;\delta}^{(\Delta t)}\right], \qquad N = \lfloor T/\Delta t \rfloor.$$

The block differential-entropy rate is obtained as

$$h(X) = \lim_{\Delta t \to 0}\, \lim_{\delta \to 0}\, \lim_{T \to \infty} \frac{1}{T}\, H_{T,\Delta t, \delta}(X).$$

In regimes where random variables are discrete-continuous, the block entropy contains a "rate dimension" $\kappa(n)$ analogous to Rényi's entropy dimension.
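
A one-dimensional toy check of the amplitude-quantization step (not the full space-time construction): by the standard discretization link, $H[X_\delta] + \log\delta \to h(X)$ as $\delta \to 0$, which the sketch below verifies for a standard Gaussian.

```python
import numpy as np

rng = np.random.default_rng(2)

# For X ~ N(0, 1) and the rounding quantizer
# X_delta = delta * floor(X/delta + 1/2), the discrete entropy satisfies
# H[X_delta] + log(delta) -> h(X) = 0.5*log(2*pi*e) as delta -> 0.
X = rng.normal(size=1_000_000)
for delta in (0.5, 0.2, 0.1):
    q = np.floor(X / delta + 0.5)              # index of the nearest bin
    _, counts = np.unique(q, return_counts=True)
    p = counts / counts.sum()
    H = -np.sum(p * np.log(p))                 # discrete entropy in nats
    print(f"delta = {delta}: H + log(delta) = {H + np.log(delta):.4f}")
print(f"exact h(X) = {0.5 * np.log(2 * np.pi * np.e):.4f}")
```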

Closed-form asymptotics for $\alpha$-stable innovation processes with stability $0 < \alpha \leq 2$ yield

$$h(X) = \frac{1}{\alpha} \log \Delta t + h(X_0),$$

while for impulsive Poisson innovations with rate $\lambda$ and jump law $A$,

$$h(X) = h(A) + \log \lambda - 1,$$

with lower entropy rate signifying higher compressibility (Ghourchian et al., 2017).
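
Plugging illustrative parameters into these closed forms makes the compressibility ranking explicit; the values of $\Delta t$, $\alpha$, $\lambda$, and the Gaussian jump law are hypothetical.

```python
import numpy as np

# Direct evaluation of the closed forms quoted above. Smaller alpha
# gives a lower (more negative) entropy at fixed resolution dt, i.e.
# a more compressible process, matching the discussion in Section 4.
dt = 1e-3
for alpha in (2.0, 1.5, 1.0, 0.5):
    print(f"alpha = {alpha}: h = {np.log(dt) / alpha:+.3f} + h(X_0)")

lam = 5.0
h_A = 0.5 * np.log(2 * np.pi * np.e)   # h(A) for a standard Gaussian jump law
print(f"impulsive Poisson (lambda = {lam}): h = {h_A + np.log(lam) - 1:.3f} nats")
```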

3. Capacity, Entropy Growth, and Geometric Structure

Innovation capacity $C_i$ is defined as the trace of the expected conditional covariance projected onto the active subspace,
$$C_i = \mathrm{Tr}[N \Sigma_{XX}^+], \qquad N = E[\mathrm{Cov}(X_t \mid \mathscr{F}^{\rm in}_t)],$$
partitioning the observable rank into predictable and innovation components. In linear-Gaussian (Johnson–Nyquist) regimes with $\Sigma_{XX}(T) = S + T N_0$,

$$\Delta X_t \sim \mathcal{N}(0, T N_0), \qquad h(\Delta X_{1:b}) = \frac{b}{2} \sum_{i=1}^r \log(2\pi e\, T \lambda_i),$$

where $\lambda_i$ are the nonzero eigenvalues of $N_0$.
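
A short sketch evaluating this Gaussian block-entropy formula for a hypothetical rank-deficient noise spectrum $N_0$:

```python
import numpy as np

# Johnson-Nyquist toy case: innovations are i.i.d. N(0, T*N0), so the
# b-step block entropy is (b/2) * sum_i log(2*pi*e*T*lambda_i) over the
# nonzero eigenvalues lambda_i of N0. Values here are illustrative.
T, b = 2.0, 8
N0 = np.diag([1.0, 0.4, 0.0])           # rank-deficient noise spectrum
lam = np.linalg.eigvalsh(N0)
lam = lam[lam > 1e-12]                   # keep the r nonzero eigenvalues
h_block = 0.5 * b * np.sum(np.log(2 * np.pi * np.e * T * lam))
print(f"h(Delta X_{{1:b}}) = {h_block:.3f} nats (r = {lam.size})")
```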

The entropy bound is extensive:
$$h(Y_{1:T}) \gtrsim \frac{C_i}{2\tau}\, T \log\!\left( \frac{\tau}{2 L_\star^2} \right),$$
so block entropy grows linearly in effective innovation dimension and block length (Polloreno, 12 Jan 2026).

Geometrically, in whitened coordinates, complementary ellipsoids represent the predictable and innovation directions,
$$\mathcal{E}_{\mathrm{pred}} = \{\Gamma^{1/2} y : y \in \mathbb{B}_r\}, \qquad \mathcal{E}_{\mathrm{innov}} = \{(\Pi_r - \Gamma)^{1/2} y : y \in \mathbb{B}_r\},$$
with innovation semi-axes $\sqrt{1-\gamma_k}$, where the $\gamma_k$ are the eigenvalues of $\Gamma$.
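
The following sketch (illustrative $\gamma_k$ values, with $N$ assumed diagonal in whitened coordinates so that $C_i = \sum_k (1-\gamma_k)$) computes the complementary ellipsoid axes and the resulting innovation capacity:

```python
import numpy as np

# Whitened-coordinate geometry: for predictability coefficients
# gamma_k in [0, 1], the predictable ellipsoid has axes sqrt(gamma_k)
# and the innovation ellipsoid the complementary axes sqrt(1 - gamma_k).
gamma = np.array([0.95, 0.7, 0.3, 0.05])     # illustrative spectrum
pred_axes = np.sqrt(gamma)
innov_axes = np.sqrt(1.0 - gamma)

# Assumed diagonal conditional covariance: C_i = Tr[N Sigma_XX^+]
# reduces to the sum of the innovation variances 1 - gamma_k.
C_i = np.sum(1.0 - gamma)
print("predictable axes:", np.round(pred_axes, 3))
print("innovation axes :", np.round(innov_axes, 3))
print(f"innovation capacity C_i = {C_i:.2f}")
```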

4. Filtering, Learning, and Application Domains

Innovation-block differential entropy provides operational control over nonlinear filtering and signal estimation tasks, particularly by quantifying the information content contributed by unpredictable innovations relative to a reference process (such as Wiener measure) (Ustunel, 2013). For practical filtering:

  • The block entropy per step approximates $(1/2) \sum_k |\hat{u}_{t_k}|^2 \Delta t$ (see the sketch after this list);
  • Low-entropy blocks signify low innovation-energy and correspond to high estimation quality.
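
A minimal discrete-time sketch of this energy proxy, using two hypothetical drift models to contrast a well-resolved and a poorly resolved filter:

```python
import numpy as np

rng = np.random.default_rng(3)

# Per-block innovation-energy proxy (1/2) * sum_k |u_hat_{t_k}|^2 * dt,
# contrasted for two hypothetical residual-drift models.
dt, n = 1e-2, 100                                  # block of n steps
t = np.arange(n) * dt
drifts = {
    "good filter": 0.1 * np.cos(2.0 * np.pi * t),  # small residual drift
    "poor filter": 2.0 * rng.normal(size=n),       # large unresolved drift
}
for name, u_hat in drifts.items():
    energy = 0.5 * np.sum(np.abs(u_hat) ** 2) * dt
    print(f"{name}: block innovation energy ≈ {energy:.4f}")
```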

Extensive innovation-block entropy also underpins sample complexity in generative modeling: learning the induced block law to $o(\alpha)$ total variation error requires $\Omega(C_i T / (\tau \alpha^2))$ samples, supporting generative reservoir learning (Polloreno, 12 Jan 2026).
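
For a sense of scale, evaluating the bound at hypothetical values:

```python
# Evaluating the lower bound Omega(C_i * T / (tau * alpha**2)) at
# hypothetical values of capacity, horizon, floor, and TV accuracy.
C_i, T, tau, alpha = 4.0, 1000, 0.1, 0.05
print(f"sample-complexity lower bound ~ {C_i * T / (tau * alpha**2):.2e}")
```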

In compressibility contexts, block differential entropy ranks innovation processes, with impulsive Poisson innovations exhibiting finite entropy rates and heavy-tailed $\alpha$-stable processes showing rates that diverge as the stability $\alpha$ decreases (i.e., becoming more compressible as $\alpha$ decreases) (Ghourchian et al., 2017).

5. Localization, Truncation, and Rate Control

For signals where Novikov's criterion for the change of measure is violated, block entropy can be localized using stopping times (Ustunel, 2013):
$$\tau_n = \inf\left\{ t : \int_0^t |\hat{u}_s|^2\, ds > n \right\} \wedge 1, \qquad \hat{u}_s^{(n)} = \hat{u}_s\, \mathbf{1}_{s \leq \tau_n},$$
yielding the localized entropy identity

$$H(Z(\nu_n) \,\|\, \mu) = \frac{1}{2} E_{\nu_n} \left[ \int_0^{\tau_n} |\hat{u}_s|^2\, ds \right].$$

Taking $n \to \infty$ recovers the full-interval result via monotone convergence.
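
A discretized sketch of the localization: the stopping time $\tau_n$ is found on a toy drift path (illustrative parameters), and the truncated drift yields a localized energy of about $n/2$ when the threshold is crossed.

```python
import numpy as np

rng = np.random.default_rng(4)

# Discretized toy localization: tau_n is the first time the cumulative
# drift energy exceeds n (capped at 1); u_hat^{(n)} vanishes afterwards.
dt = 1e-3
t = np.arange(0.0, 1.0, dt)
u_hat = 3.0 * rng.normal(size=t.size)           # possibly non-Novikov drift
energy = np.cumsum(np.abs(u_hat) ** 2) * dt

n = 5.0
crossed = np.nonzero(energy > n)[0]
tau_n = t[crossed[0]] if crossed.size else 1.0
u_hat_n = np.where(t <= tau_n, u_hat, 0.0)      # truncated drift u_hat^(n)

H_local = 0.5 * np.sum(np.abs(u_hat_n) ** 2) * dt
print(f"tau_n = {tau_n:.3f}, localized relative entropy ≈ {H_local:.3f}")
```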

In trimmed innovation subspaces, the variance floor $\tau$ bounds $L_\tau$:
$$L_\tau \leq \frac{C_i}{\tau}, \qquad L_\tau \geq \max\left\{ 0, \frac{C_i - \tau r}{1-\tau} \right\}.$$
These bounds allow fine control over block entropy growth and distinguishable history packing.
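
A quick numerical check of both bounds under an illustrative whitened variance spectrum, where $L_\tau$ counts the directions whose conditional variance exceeds the floor $\tau$:

```python
import numpy as np

# Checking the L_tau bounds on an illustrative whitened variance
# spectrum: conditional variances nu_k in [0, 1] over r directions,
# C_i = sum(nu_k), and L_tau counts directions above the floor tau.
tau = 0.2
nu = np.array([0.9, 0.5, 0.25, 0.1, 0.02])
r, C_i = nu.size, nu.sum()
L_tau = int(np.sum(nu >= tau))
lower = max(0.0, (C_i - tau * r) / (1.0 - tau))
print(f"L_tau = {L_tau}, bounds: [{lower:.2f}, {C_i / tau:.2f}]")
assert lower <= L_tau <= C_i / tau
```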

6. Comparative Perspectives: Shannon, Rényi, and Knowledge Measures

Classical Shannon entropy is nonselective: it is sensitive to all probability-mass rearrangements, regardless of their relevance to a reference challenge (Samid, 2010). Samid's MARK (Missing Acquirable Relevant Knowledge) localizes entropy measurement by incorporating intervals of interest (bounded by $\mathrm{IOI}$ and $\mathrm{IOF}$), quantifying only knowledge relevant to narrowing solution uncertainty. The continuous analogue implements block entropy via the averaged maximal window-coverage function,
$$ARK = \frac{1}{\mathrm{IOF} - \mathrm{IOI}} \int_{\mathrm{IOI}}^{\mathrm{IOF}} TT(I; p)\, dI, \qquad MARK = 1 - ARK,$$
with $TT(I; p)$ the maximal interval probability. MARK curves facilitate tracking knowledge acquisition in R&D, risk management, and opportunity exploitation, complementing block differential entropy by focusing on decision-relevant uncertainty.
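
A sketch of the MARK computation under simple assumptions: a discretized density $p$, a sliding-window implementation of the maximal interval probability $TT(I;p)$ (assuming $I$ denotes the window width), and widths averaged between $\mathrm{IOI}$ and $\mathrm{IOF}$. All grids and values are hypothetical.

```python
import numpy as np

# Discretized solution range and an illustrative unnormalized density.
x = np.linspace(0.0, 10.0, 1001)
dx = x[1] - x[0]
p = np.exp(-0.5 * (x - 4.0) ** 2)
p /= p.sum() * dx                        # normalize to a density

def TT(I):
    """Maximal probability mass covered by any window of width I."""
    w = max(1, int(round(I / dx)))
    mass = np.convolve(p * dx, np.ones(w), mode="valid")
    return mass.max()

IOI, IOF = 0.5, 3.0                      # range of window widths
widths = np.linspace(IOI, IOF, 200)
ARK = np.mean([TT(I) for I in widths])   # average of TT over [IOI, IOF]
MARK = 1.0 - ARK
print(f"ARK ≈ {ARK:.3f}, MARK ≈ {MARK:.3f}")
```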

Innovation-block differential entropy is now recognized as a central tool for quantifying information growth in blocks of innovation processes, closely tied to the innovation capacity, the geometric structure of the underlying reservoir or signal space, and the compressibility properties of stochastic models. The extensive scaling of entropy in block length and innovation dimension underpins the sample complexity of learning, distinguishable history enumeration, and the identification of compressible processes. The linkage with operational filtering, adaptive rate control, and decision-based entropy metrics (e.g., MARK) emphasizes its foundational role across information theory, statistical mechanics, and learning systems (Ustunel, 2013; Ghourchian et al., 2017; Polloreno, 12 Jan 2026; Samid, 2010).
