
Logarithmic Relative Entropy

Updated 31 January 2026
  • Logarithmic relative entropy is a divergence measure defined via a logarithmic function that quantifies differences between probability distributions, uniquely determined by axioms like additivity, convexity, and the data-processing inequality.
  • Extensions such as q-deformations and quantum generalizations provide robust tools for statistical inference, hypothesis testing, and quantum information protocols.
  • Its operational significance spans model selection, maximum-entropy inference, and recovery bounds in both classical and quantum settings, influencing a broad range of research areas.

Logarithmic relative entropy quantifies the divergence of one probability distribution from another using a logarithmic measure, and is most commonly formalized as the Kullback–Leibler (KL) divergence. Its axiomatic foundation fully singles out the logarithmic form from broader families of divergences, and its operational role spans the core of information theory, inference, and statistical modeling. Logarithmic relative entropy is intrinsically linked to deep principles such as additivity, data-processing, and convexity, and admits categorical characterizations, quantum generalizations, and robust statistical deformations.

1. Axiomatic and Functional Characterizations

The logarithmic relative entropy $D(P\|Q)$ between two finite probability distributions $P$ and $Q$ on the same alphabet is defined as

$$D(P\|Q) = \sum_{i} P_i \log\frac{P_i}{Q_i}.$$

This form is uniquely specified, up to scale and choice of logarithm base, by three axioms (Gour et al., 2020):

  • Monotonicity under mixing (convexity in the first argument):

$$D\bigl(\lambda P_{1}+(1-\lambda)P_{2}\;\big\|\;Q\bigr)\leq \lambda D(P_1\|Q)+(1-\lambda) D(P_2\|Q).$$

  • Data-processing inequality (DPI):

$$D(PW\|QW)\leq D(P\|Q)$$

for any stochastic channel (right-stochastic matrix) $W$.

  • Additivity on product distributions:

$$D(P_1\otimes P_2\|Q_1\otimes Q_2) = D(P_1\|Q_1) + D(P_2\|Q_2).$$

Normalization by $D((1,0)\|(1/2,1/2)) = \log 2$ fixes the scale. Together with continuity and lower semicontinuity, these axioms ensure non-negativity, faithfulness ($D(P\|Q)=0$ iff $P=Q$), and sandwich bounds between the Rényi-$0$ and Rényi-$\infty$ divergences, $D_0(P\|Q)\leq D(P\|Q)\leq D_\infty(P\|Q)$, where $D_0(P\|Q) = -\log \sum_{i:P_i>0} Q_i$ and $D_\infty(P\|Q) = \log\max_i P_i/Q_i$. This establishes the unique structure of the logarithmic relative entropy among information-theoretically meaningful divergences (Gour et al., 2020; Leinster, 2017).
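As a quick numerical illustration of these properties, the following minimal NumPy sketch (function and variable names are illustrative, not from the cited papers) computes $D(P\|Q)$ for random distributions and checks both the data-processing inequality under a random right-stochastic matrix and the Rényi-$0$/Rényi-$\infty$ sandwich bounds:

```python
import numpy as np

def kl(p, q):
    """D(P||Q) = sum_i p_i log(p_i / q_i), in nats; terms with p_i = 0 contribute 0."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def renyi0(p, q):
    """D_0(P||Q) = -log sum_{i: p_i > 0} q_i."""
    return float(-np.log(np.sum(np.asarray(q, float)[np.asarray(p, float) > 0])))

def renyi_inf(p, q):
    """D_inf(P||Q) = log max_i p_i / q_i."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.log(np.max(p[p > 0] / q[p > 0])))

rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(5))
q = rng.dirichlet(np.ones(5))

# Sandwich bound: D_0 <= D <= D_inf.
assert renyi0(p, q) <= kl(p, q) <= renyi_inf(p, q)

# Data-processing inequality: D(PW || QW) <= D(P || Q)
# for a random right-stochastic W (each row sums to 1).
W = rng.dirichlet(np.ones(4), size=5)
assert kl(p @ W, q @ W) <= kl(p, q) + 1e-12
```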

2. Extensions: Generalized and Parameterized Logarithmic Relative Entropies

The logarithmic structure can be deformed in a controlled fashion. A fundamental one-parameter generalization is the family of $q$-logarithmic relative entropies (Tsallis or Rényi-type divergences), which retain symmetry and a deformed chain rule:
$$D_q(P\|Q) = \frac{1}{q-1}\left(\sum_i p_i^q q_i^{1-q} - 1\right),$$
where the $q$-logarithm is defined by $\ln_q(x) = (x^{1-q} - 1)/(1-q)$, reducing to $\ln x$ as $q \to 1$, so that $D_q \to D$ (the ordinary KL divergence) (Leinster, 2017). This $q$-deformation conforms to an axiomatic basis of symmetry and $q$-multiplicativity, which replaces additivity.
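The convergence $D_q \to D$ as $q \to 1$ is easy to verify numerically. A minimal sketch (the helper name `tsallis_div` is illustrative):

```python
import numpy as np

def tsallis_div(p, q, s):
    """q-deformed relative entropy D_q = (sum p^s q^(1-s) - 1)/(s - 1).

    The deformation parameter is called s here to avoid clashing with the
    distribution q; at s = 1 the ordinary KL divergence is returned.
    """
    p, q = np.asarray(p, float), np.asarray(q, float)
    if abs(s - 1.0) < 1e-12:
        return float(np.sum(p * np.log(p / q)))
    return float((np.sum(p**s * q**(1.0 - s)) - 1.0) / (s - 1.0))

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.5, 0.3])

kl = tsallis_div(p, q, 1.0)
# D_q approaches the KL divergence as q -> 1:
assert abs(tsallis_div(p, q, 0.999) - kl) < 1e-2
# Non-negativity for a deformed parameter value in (0, 1):
assert tsallis_div(p, q, 0.5) >= 0
```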

Further, more robust generalizations appear in robust statistics and inference, such as the Logarithmic Norm Relative Entropy (LNRE), parametrized by $(\alpha,\beta)$, with the KL divergence recovered in the limit $\alpha,\beta\to 1$:
$$\mathcal{RE}^{\mathcal{LN}_{\alpha,\beta}}(g,f) = \frac{\alpha}{\beta(\beta-\alpha)}\log\int g^\beta f^{\alpha-\beta} - \frac{1}{\beta-\alpha}\log\int g^\alpha + \frac{1}{\beta}\log\int f^\alpha.$$
The suitability of such deformations is evidenced by their interpolation between classical and robust estimation in contaminated settings (Singh et al., 15 Oct 2025).
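Assuming the $(\alpha,\beta)$ formula as stated, a discrete analogue (sums in place of the integrals) can be checked numerically against the KL limit; the function name `lnre` and the parameter choices are illustrative:

```python
import numpy as np

def lnre(g, f, alpha, beta):
    """Discrete sketch of the Logarithmic Norm Relative Entropy,
    with sums standing in for the integrals in the (alpha, beta) formula.
    Requires alpha != beta."""
    g, f = np.asarray(g, float), np.asarray(f, float)
    t1 = (alpha / (beta * (beta - alpha))) * np.log(np.sum(g**beta * f**(alpha - beta)))
    t2 = (1.0 / (beta - alpha)) * np.log(np.sum(g**alpha))
    t3 = (1.0 / beta) * np.log(np.sum(f**alpha))
    return float(t1 - t2 + t3)

g = np.array([0.5, 0.3, 0.2])
f = np.array([0.25, 0.35, 0.4])
kl = float(np.sum(g * np.log(g / f)))

# (alpha, beta) -> (1, 1) recovers the KL divergence:
approx = lnre(g, f, alpha=1.001, beta=0.999)
assert abs(approx - kl) < 1e-2
```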

3. Quantum Logarithmic Relative Entropy

In the quantum context, the Umegaki–Araki relative entropy for density operators $\rho,\sigma$ on a finite-dimensional Hilbert space is given by

$$D(\rho\|\sigma) = \mathrm{Tr}[\rho(\log\rho - \log\sigma)],$$

and for general von Neumann algebras via the relative modular operator and Haagerup $L^1$-densities (Wirth, 12 May 2025). Key properties—monotonicity (quantum DPI), joint convexity, and additivity—mirror the classical axiomatic picture.

Specialized quantum generalizations such as the Belavkin–Staszewski relative entropy and quantum Tsallis relative entropy arise in contexts demanding different operational or geometric properties. For instance, the quantum Tsallis relative entropy for $q\in[0,1)$ is defined as

$$D_{q}(\rho\|\sigma) := \frac{\mathrm{Tr}[\rho - \rho^q \sigma^{1-q}]}{1-q} = \mathrm{Tr}[\rho^{q}(\ln_q\rho - \ln_q\sigma)],$$

utilizing the operator calculus for $q$-logarithms (Shi et al., 2019).
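For finite-dimensional density matrices, both quantities can be evaluated by eigendecomposition. A minimal NumPy sketch (helper names are illustrative) that checks faithfulness, non-negativity, and the $q \to 1$ recovery of the Umegaki relative entropy:

```python
import numpy as np

def mat_fn(rho, fn):
    """Apply a scalar function to a Hermitian matrix via eigendecomposition."""
    w, v = np.linalg.eigh(rho)
    return (v * fn(w)) @ v.conj().T

def umegaki(rho, sigma):
    """D(rho||sigma) = Tr[rho (log rho - log sigma)], for full-rank states."""
    return float(np.trace(rho @ (mat_fn(rho, np.log) - mat_fn(sigma, np.log))).real)

def quantum_tsallis(rho, sigma, q):
    """D_q(rho||sigma) = Tr[rho - rho^q sigma^(1-q)] / (1 - q), q in [0, 1)."""
    rq = mat_fn(rho, lambda w: w**q)
    s1q = mat_fn(sigma, lambda w: w**(1.0 - q))
    return float(np.trace(rho - rq @ s1q).real / (1.0 - q))

rng = np.random.default_rng(1)

def random_state(d):
    """Random full-rank density matrix (normalized Wishart)."""
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = A @ A.conj().T
    return rho / np.trace(rho).real

rho, sigma = random_state(3), random_state(3)

assert umegaki(rho, rho) < 1e-10        # faithfulness: D(rho||rho) = 0
assert umegaki(rho, sigma) >= 0         # non-negativity (Klein's inequality)
# q -> 1 recovers the Umegaki relative entropy:
assert abs(quantum_tsallis(rho, sigma, 0.999) - umegaki(rho, sigma)) < 1e-2
```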

4. Operational and Statistical Significance

Logarithmic relative entropy underpins the mathematical formulation of model selection, hypothesis testing, and Bayesian inference. The Csiszár–Sanov theorem equates large-deviation rates with KL divergence: for empirical distribution $f^D$ and model $f^M$, $P(f^D|f^M) \asymp \exp[-n\,D(f^D\|f^M)]$, and maximum likelihood estimation corresponds to minimization of $D(f^D\|f^M)$ (0808.4111).
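The large-deviation statement can be illustrated exactly for multinomial sampling: the log-probability of observing a given type is available in closed form via log-gamma functions, and its normalized negative converges to $D(f^D\|f^M)$ up to a Stirling correction of order $\log n / n$. A sketch assuming SciPy (variable names are illustrative):

```python
import numpy as np
from scipy.special import gammaln

def log_multinomial_prob(counts, q):
    """Exact log-probability of observing the given counts under model q."""
    counts = np.asarray(counts)
    n = counts.sum()
    log_coef = gammaln(n + 1) - gammaln(counts + 1).sum()
    return float(log_coef + np.sum(counts * np.log(q)))

q = np.array([0.25, 0.25, 0.5])        # model f^M
n = 10_000
counts = np.rint(n * np.array([0.5, 0.3, 0.2])).astype(int)
f = counts / n                          # empirical type f^D

rate = -log_multinomial_prob(counts, q) / n
kl = float(np.sum(f * np.log(f / q)))

# -(1/n) log P(f^D | f^M) ~ D(f^D || f^M), up to O(log n / n):
assert abs(rate - kl) < 0.01
```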

Dual roles include:

  • Maximum-entropy inference: $D(P\|Q)$ yields the minimum discrimination information principle and, for $Q$ uniform, recovers Shannon entropy maximization.
  • Alternating minimization: The EM algorithm is framed as alternating divergence minimization in missing data problems.
  • Statistical bounds: Pinsker's inequality and Cramér–Rao-type results are generalized to robust divergences, with classical behavior recovered as the deformation parameters tend to unity (Singh et al., 15 Oct 2025).
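The first and third points can be checked numerically. A short sketch verifying Pinsker's inequality, $\|P-Q\|_1^2 \le 2\,D(P\|Q)$ in nats, and the identity $D(P\|U) = \log k - H(P)$ for the uniform distribution $U$ on $k$ outcomes:

```python
import numpy as np

rng = np.random.default_rng(2)

def kl(p, q):
    """KL divergence in nats, for strictly positive distributions."""
    return float(np.sum(p * np.log(p / q)))

k = 6
u = np.full(k, 1.0 / k)
for _ in range(100):
    p, q = rng.dirichlet(np.ones(k)), rng.dirichlet(np.ones(k))
    # Pinsker: ||P - Q||_1^2 <= 2 D(P||Q)
    tv1 = np.sum(np.abs(p - q))
    assert tv1**2 <= 2 * kl(p, q) + 1e-12
    # D(P||U) = log k - H(P); it vanishes exactly at P = U,
    # so minimizing D(P||U) over P is Shannon entropy maximization.
    H = -np.sum(p * np.log(p))
    assert abs(kl(p, u) - (np.log(k) - H)) < 1e-12
assert kl(u, u) < 1e-15
```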

5. Category-Theoretic and Bayesian Perspectives

Relative entropy admits a category-theoretic characterization as a unique (up to scale) functor from the category of finite probability spaces and measure-preserving maps with stochastic right-inverses (FinStat) to $[0,\infty]$, additive under composition, vanishing on optimal hypotheses, convex-linear under probabilistic choice, and lower semicontinuous. Explicitly, for objects $(X,q)$ and morphisms $(f,s)$, the relative entropy is

$$RE(f,s) = \sum_{x\in X} q_x \ln\frac{q_x}{p_x},$$

where $p_x$ is the prior induced via $s$ (Baez et al., 2014).

This functoriality encodes additivity (sequential measurements), convexity (randomization), and lower semicontinuity (robustness under approximation). The same approach yields the classical information gain and offers a categorical analog of quantum Petz-type characterizations.

6. Stability, Recovery, and Quantum Information Inequalities

Recent work sharpens the core inequalities (data-processing, joint convexity, strong subadditivity) for the logarithmic relative entropy by quantifying the "defect" through norms of Petz recovery maps, providing tight remainder terms:
$$N(\beta)\left\|\sigma_2^{\beta}\rho_{12}^{1-\beta} - \sigma_{12}^{\beta}\rho_{12}^{1-\beta}\sigma_2^{\beta}\right\|_2^{1/\alpha(\beta)} \leq S(\rho_{12}\|\sigma_{12})-S(\rho_1\|\sigma_1),$$
with explicit dependence on the interpolation parameter $\beta$ (Vershynina, 2018). These remainder bounds are saturated exactly in recovery (equality) situations and extend to strong subadditivity and its operator versions, giving operational meaning to near-equality in monotonicity and additivity.

7. Applications, Generalizations, and Future Directions

Logarithmic relative entropy remains a central quantity in both theoretical and applied domains. It drives exponential decay results for quantum Markov semigroups, is pivotal in deriving modified logarithmic Sobolev inequalities, and its robust extensions offer improved performance in contaminated or adversarial settings (Wirth, 12 May 2025, Singh et al., 15 Oct 2025). Generalized information-geometric frameworks extend the reach of relative entropy and underline its unique status as the logarithmic measure of statistical divergence.

The continual extension to nonextensive, escort, and robust forms—as well as deep connections to convex and information geometry, variational principles, and operator theory—suggest that logarithmic relative entropy will retain its foundational role in the theory and practice of information.
