Pinsker-Marginal Bound for f-Divergence

Updated 4 January 2026
  • The Pinsker-Marginal Bound is a family of inequalities giving an optimal upper bound on $f$-divergences, notably KL divergence, in terms of total-variation distance and explicit density-ratio constraints.
  • It is established via convex analysis and extremal three-atom distributions, which yield tight control in regimes of high concentration or bounded densities.
  • The bound generalizes the classical and reverse Pinsker inequalities, enabling precise computations in applications such as hypothesis testing, quantization, and distribution synthesis.

The Pinsker-Marginal Bound is an optimal upper bound on $f$-divergences, most notably Kullback–Leibler (KL) divergence, in terms of the total-variation (TV) distance under additional pointwise constraints on the likelihood ratio. This family of inequalities generalizes and sharpens the classical Pinsker and “reverse Pinsker” bounds, incorporating information about the essential infimum and supremum of the Radon–Nikodým derivative between the pair of probability measures, leading to tight controls in high-concentration or bounded-density regimes. The Pinsker-Marginal bounds also facilitate the computation of the worst-case $f$-divergence for a given TV distance and density envelope, with explicit tightness and extremal constructions. They are central in information theory, hypothesis testing, distribution synthesis, and quantization theory.

1. Formal Definition and Principal Results

The general Pinsker-Marginal inequality, as established by Binette (Binette, 2018), upper bounds an arbitrary $f$-divergence between $P$ and $Q$ constrained by TV distance $\delta$ and essential bounds $m, M$ on the Radon–Nikodým derivative. The admissible class is

$$\mathcal{A}(\delta,m,M) = \Bigl\{(P,Q):\; P\ll Q,\; \operatorname{ess\,inf}\tfrac{dP}{dQ}=m,\; \operatorname{ess\,sup}\tfrac{dP}{dQ}=M,\; TV(P,Q)=\delta\Bigr\}.$$

Let $f:[0,\infty)\to(-\infty,\infty]$ be convex with $f(1)=0$, and define $D_f(P\|Q) = \mathbb{E}_Q\bigl[f\bigl(\tfrac{dP}{dQ}\bigr)\bigr]$. The optimal bound is

$$\sup_{(P,Q)\in\mathcal{A}(\delta,m,M)} D_f(P\|Q) = \delta\left(\frac{f(m)}{1-m} + \frac{f(M)}{M-1}\right).$$

If $m=1$ or $M=1$, both sides are set to zero; in that case the likelihood ratio is bounded on one side by $1$ while integrating to $1$ against $Q$, which forces $P=Q$. The result covers the full parametrized set of admissible $(P,Q)$ pairs, and equality is achieved by simple three-atom distributions.

In the $f(t) = t\log t$ case (KL divergence), it yields

$$\sup_{\mathcal{A}(\delta,m,M)} D(P\|Q) = \delta\left(\frac{\log a}{a-1} + \frac{\log b}{1-b}\right)$$

with $a = M^{-1}$, $b = m^{-1}$.
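
As a quick sanity check, the closed form is straightforward to evaluate numerically. The sketch below (function and variable names are illustrative, not taken from the cited papers) computes the bound for an arbitrary convex $f$:

```python
import math

def pinsker_marginal_bound(f, delta, m, M):
    """Optimal upper bound on D_f(P||Q) over A(delta, m, M):
    delta * (f(m)/(1-m) + f(M)/(M-1)).  If m = 1 or M = 1, the
    likelihood ratio is forced to equal 1, so the bound is zero."""
    if m == 1.0 or M == 1.0:
        return 0.0
    return delta * (f(m) / (1.0 - m) + f(M) / (M - 1.0))

# KL divergence: f(t) = t log t, extended by continuity with f(0) = 0.
kl = lambda t: t * math.log(t) if t > 0 else 0.0

# Example: TV distance 0.1, likelihood ratio confined to [0.5, 2].
print(pinsker_marginal_bound(kl, delta=0.1, m=0.5, M=2.0))  # ~0.0693
```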

2. Variational Method and Tightness

The underpinning argument is based on convex analysis: Jensen’s inequality is applied to the likelihood ratio $\kappa = \frac{dP}{dQ}$ on $[m,1]\cup[1,M]$. The decomposition partitions the domain into $A = \{\kappa\le1\}$ and $A^c = \{\kappa>1\}$, applies convexity on each piece, and uses the constraints on the TV distance and mean values to obtain sharp two-point upper bounds. The extremal distributions that attain these bounds are three-point measures

$$P=(tp,\; t(1-p),\; 1-t), \qquad Q=(tq,\; t(1-q),\; 1-t)$$

with $p/q=m$, $(1-p)/(1-q)=M$, and $t$ chosen so that $TV(P,Q)=\delta$. Direct computation verifies attainment of the supremum, as the sketch below illustrates.
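
Attainment can be checked numerically. The sketch below builds the three-atom pair, assuming $m<1<M$ and $\delta \le \frac{(M-1)(1-m)}{M-m}$ so that $t\le1$ (helper names are illustrative):

```python
import math

def extremal_pair(delta, m, M):
    """Three-atom pair with atom likelihood ratios m, M, and 1;
    t scales the first two atoms so that TV(P, Q) = delta."""
    q = (M - 1.0) / (M - m)   # solves p/q = m, (1-p)/(1-q) = M with p = m*q
    p = m * q
    t = delta * (M - m) / ((M - 1.0) * (1.0 - m))  # requires t <= 1
    return [t * p, t * (1 - p), 1 - t], [t * q, t * (1 - q), 1 - t]

def tv(P, Q):
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(P, Q))

def kl_div(P, Q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(P, Q) if pi > 0)

P, Q = extremal_pair(delta=0.1, m=0.5, M=2.0)
print(tv(P, Q))      # 0.1, the prescribed TV distance
print(kl_div(P, Q))  # ~0.0693, matching the Pinsker-Marginal bound
```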

Global (unconstrained-$\delta$) Pinsker-Marginal bounds are maximized at the endpoint $\delta = \frac{(M-1)(1-m)}{M-m}$. At this point, the supremum coincides with the enveloped Jensen functional (Simic, 2009), with the Pinsker-Marginal bound strictly sharper (Binette, 2018).
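
The endpoint value is exactly the largest TV distance the three-atom construction can realize: at $t=1$, with $q = \frac{M-1}{M-m}$ and $p = mq$,

$$TV(P,Q) = q - p = q(1-m) = \frac{(M-1)(1-m)}{M-m}.$$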

3. Classical Reverse Pinsker and Pinsker Relations

The Pinsker-Marginal bounds generalize the best-known reverse Pinsker inequalities. For unconstrained measures,

$$TV(P,Q)\le\sqrt{\tfrac{1}{2}\,D(P\|Q)}$$

is the classical Pinsker inequality (with TV normalized as in $f(t)=|t-1|/2$ below). Reverse Pinsker inequalities provide a (typically non-sharp) upper bound on $D(P\|Q)$ in terms of $TV(P,Q)$. In the binary case, refined reverse Pinsker inequalities scale as $D(P\|Q)\le C\,TV(P,Q)^2$, with $C$ controlled by the minimal mass of $Q$ (Sason, 2015).

Pinsker-Marginal bounds sharpen these classical relationships by adding density-ratio constraints; for finite alphabets, they yield quadratic improvements over prior reverse Pinsker results (such as those of Verdú and Csiszár–Talata) in both coefficient and asymptotic behavior (Sason, 2015; Binette, 2018). For $f(t) = |t-1|/2$ (TV), they reproduce the identity $TV(P,Q)=\delta$.
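
The two regimes can be contrasted numerically. A sketch with illustrative parameters, comparing the classical Pinsker lower bound $D \ge 2\,TV^2$ (valid for every pair) against the linear worst-case Pinsker-Marginal upper bound over $\mathcal{A}(\delta,m,M)$:

```python
import math

m, M = 0.5, 2.0  # illustrative likelihood-ratio envelope
coeff = m * math.log(m) / (1 - m) + M * math.log(M) / (M - 1)

for delta in (0.01, 0.05, 0.1, 0.2):
    lower = 2 * delta**2   # Pinsker: D >= 2*TV^2 for any (P, Q)
    upper = delta * coeff  # worst case over A(delta, m, M)
    print(f"delta={delta:.2f}  pinsker_lower={lower:.5f}  marginal_upper={upper:.5f}")
```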

4. Extensions: Rényi Divergences and $\chi^2$-Divergence

Pinsker-Marginal inequalities specialize cleanly to Rényi divergences of arbitrary order (including $\infty$), with explicit functions of $\delta$, $m$, $M$, and minimal atom probabilities. For $\chi^2$-divergence ($f(t) = (t-1)^2$), the Pinsker-Marginal bound becomes

$$\sup_{\mathcal{A}(\delta, m, M)} \chi^2(P\|Q) = \delta(M-m),$$

outperforming prior two-sided bounds and providing a sharp characterization of outlier-weighted divergences.
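
The specialization follows by direct substitution into the general bound: for $f(t) = (t-1)^2$,

$$\frac{f(m)}{1-m} = \frac{(m-1)^2}{1-m} = 1-m, \qquad \frac{f(M)}{M-1} = \frac{(M-1)^2}{M-1} = M-1,$$

and the coefficient sums to $(1-m) + (M-1) = M-m$.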

Sason (2015) presents further upper bounds for general $f$-divergence and KL that account for the infimum and supremum of the density ratio, minimizing the gap in worst-case scenarios, with precise scaling in the finite-alphabet and small-$\delta$ regimes.

5. Special Cases and Applications

A table summarizes key specializations of the Pinsker-Marginal inequality:

| $f$ | Pinsker-Marginal Bound | Interpretation |
|---|---|---|
| $t \log t$ | $\delta\left(\frac{\log a}{a-1} + \frac{\log b}{1-b}\right)$ | KL divergence; optimal under bounded likelihood ratio |
| $\lvert t-1\rvert/2$ | $\delta$ | TV; recovers the identity |
| $(t-1)^2$ | $\delta(M-m)$ | $\chi^2$; linear in the density bounds |
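
The TV row doubles as a consistency check on the whole family: substituting $f(t) = \lvert t-1\rvert/2$ gives $\frac{f(m)}{1-m} = \frac{f(M)}{M-1} = \tfrac{1}{2}$, so the bound collapses to $\delta\bigl(\tfrac12 + \tfrac12\bigr) = \delta$, recovering the identity.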

Applications include:

  • Quantization and source coding, where Pinsker-Marginal bounds give KL–TV tradeoffs for design of codes under bounded density hypotheses.
  • Statistical estimation and hypothesis testing, for non-asymptotic control over error probabilities and minimax risk analysis.
  • Distribution synthesis and quantization error control, particularly in Bayesian nonparametric settings where worst-case densities are bounded (Binette, 2018).

6. Optimality and Relations to Prior Results

Pinsker-Marginal bounds strictly improve and generalize results due to Simic (2009) for global $D(P\|Q)$ bounds, and specialize to limiting cases articulated by Verdú (2014), Sason–Verdú (2016), and Vajda (1972) by taking $m\to0$ or $M\to\infty$. Binette (2018) establishes optimality in the context of $f$-divergence, encompassing both unbounded and bounded regimes, as well as multi-atom constructions for tightness.

A comparison with the reverse Pinsker inequalities of Sason (2015) reveals that the Pinsker-Marginal bound is exact when the likelihood ratio takes at most two values, and that for finite alphabets the improvement over prior reverse Pinsker coefficients is a factor of two or more in the quadratic regime. The extension to Rényi divergence orders clarifies the regimes in which the TV–KL or TV–Rényi relationships are linear versus quadratic in the distance parameter.

7. Illustrative Extremal Construction and Interpretive Remarks

The extremal example central to the Pinsker-Marginal theory uses a three-atom distribution where probability mass is concentrated optimally in two “worst-case” intervals, realizing saturation of Jensen's inequality. These extremal constructions model worst-case (maximal-divergence) situations for quantization and channel synthesis scenarios. The tightness and universality of the Pinsker-Marginal bounds underscore their role as canonical sharp inequalities in both theoretical analysis and practical metric control across information-theoretic disciplines (Binette, 2018).

References

  • Binette, O. (2018). A note on reverse Pinsker inequalities.
  • Sason, I. (2015). On reverse Pinsker inequalities.
