Petz–Augustin Information Overview
- Petz–Augustin Information is a generalization of mutual information defined through a variational problem involving Rényi divergence, extending to both classical and quantum contexts.
- It admits a unique minimizer, the Augustin mean, which can be computed via fixed-point and other iterative methods that converge geometrically, enabling efficient optimization of channel coding error exponents.
- Its analytic structure—with continuity, concavity, and real-analytic behavior—underpins strong-converse results and efficient error exponent analysis in information theory.
The Petz–Augustin information, also referred to as Augustin information or Augustin–Csiszár mutual information in classical contexts, is a generalization of the classical mutual information parametrized by a Rényi order $\alpha$. It is defined as a solution to a variational problem involving the expected Rényi divergence between families of output distributions (or states in the quantum context) and an auxiliary measure or density. This quantity arises naturally in error exponent analysis, optimal channel coding, and as the core of several generalizations of information-theoretic quantities in both classical and quantum settings.
1. Definitions and Formulations
Let $W : \mathcal{X} \to \mathcal{P}(\mathcal{Y})$ denote a channel with finite input and output alphabets and $p$ a probability mass function on $\mathcal{X}$. The order-$\alpha$ Petz–Augustin information is defined by the infimum
$$I_\alpha(p, W) = \inf_{q \in \mathcal{P}(\mathcal{Y})} \sum_{x \in \mathcal{X}} p(x)\, D_\alpha\big(W(\cdot \mid x) \,\|\, q\big),$$
where $D_\alpha$ is the order-$\alpha$ Rényi divergence. In the quantum setting, replacing $W(\cdot \mid x)$ with density matrices $\rho_x$, $q$ with a density operator $\sigma$, and the sum over $\mathcal{Y}$ with the trace, this generalizes to
$$I_\alpha(p, \{\rho_x\}) = \inf_{\sigma} \sum_{x \in \mathcal{X}} p(x)\, D_\alpha(\rho_x \,\|\, \sigma), \qquad D_\alpha(\rho \,\|\, \sigma) = \frac{1}{\alpha - 1} \log \operatorname{Tr}\big[\rho^\alpha \sigma^{1-\alpha}\big].$$
For $\alpha \to 1$, this reduces to the usual mutual information $I(p; W)$. The unique minimizer in the infimum, when it exists, is called the Augustin mean $q_{\alpha,p}$ (Nakiboglu, 2018, Cheng et al., 2021).
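As a quick numerical sanity check of the classical definition, the sketch below (plain NumPy; the channel $W$ and prior $p$ are made-up toy values) evaluates the objective at the output marginal $q = pW$, which is the minimizer at $\alpha = 1$, recovering the mutual information; nearby orders give nearby values.

```python
import numpy as np

def renyi_div(P, Q, alpha):
    """Order-alpha Renyi divergence D_alpha(P||Q) for discrete distributions."""
    if abs(alpha - 1.0) < 1e-12:
        return float(np.sum(P * np.log(P / Q)))  # KL divergence (alpha -> 1 limit)
    return float(np.log(np.sum(P**alpha * Q**(1 - alpha))) / (alpha - 1))

def augustin_objective(p, W, q, alpha):
    """Expected Renyi divergence sum_x p(x) D_alpha(W(.|x) || q)."""
    return sum(p[x] * renyi_div(W[x], q, alpha) for x in range(len(p)))

W = np.array([[0.9, 0.1], [0.2, 0.8]])  # row-stochastic toy channel
p = np.array([0.5, 0.5])                # toy input distribution
q_out = p @ W                           # output marginal: the alpha = 1 minimizer

mi = augustin_objective(p, W, q_out, 1.0)          # equals I(p; W)
near_one = augustin_objective(p, W, q_out, 0.999)  # continuity in alpha
```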
2. Variational Representations and Properties
Several equivalent variational representations exist:
- Classical case: $I_\alpha(p, W) = \min_{q \in \mathcal{P}(\mathcal{Y})} \sum_x p(x)\, D_\alpha\big(W(\cdot \mid x) \,\|\, q\big)$.
- Quantum case: $I_\alpha(p, \{\rho_x\}) = \min_{\sigma} \sum_x p(x)\, D_\alpha(\rho_x \,\|\, \sigma)$, where $D_\alpha$ is the Petz–Rényi divergence (You et al., 2021, Chu et al., 10 Jan 2026, Chu et al., 10 Feb 2025).
Key properties:
- The Augustin mean is the unique minimizer and also the unique fixed-point of a nonlinear operator ("Augustin operator") defined as the $p$-weighted average of order-$\alpha$ tilted versions of the output distributions $W(\cdot \mid x)$ with respect to $q$ (Nakiboglu, 2018, Cheng et al., 2021).
- The family of mappings $p \mapsto I_\alpha(p, W)$ is uniformly equicontinuous and jointly continuous in $(\alpha, p)$, and concave in $p$ for $\alpha \in (0, 1]$ (Cheng et al., 2018).
- The function $\alpha \mapsto I_\alpha(p, W)$ is real-analytic in $\alpha$ for fixed $p$ and $W$ (Nakiboglu, 2018).
- The scaled auxiliary function $E_0(s, p) = s\, I_{1/(1+s)}(p, W)$, with $s = (1-\alpha)/\alpha$, is concave in $s$ and jointly continuous in $(s, p)$.
3. Fixed-Point and Iterative Computation Methods
3.1 Classical and Quantum Fixed-Point Iteration
For discrete alphabets, the minimizer $q_{\alpha,p}$ satisfies the fixed-point equation $q_{\alpha,p} = T_\alpha(q_{\alpha,p})$, where
$$T_\alpha(q)(y) = \sum_{x} p(x)\, \frac{W(y \mid x)^\alpha\, q(y)^{1-\alpha}}{\sum_{y'} W(y' \mid x)^\alpha\, q(y')^{1-\alpha}}.$$
The iteration $q_{k+1} = T_\alpha(q_k)$ converges to the unique minimizer for $\alpha \in (0, 1)$ at a geometric rate in Hilbert's projective metric (Tsai et al., 2024).
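A minimal NumPy sketch of this fixed-point iteration (the channel, prior, and $\alpha = 1/2$ below are toy values chosen for illustration):

```python
import numpy as np

def augustin_operator(q, p, W, alpha):
    """T_alpha(q): p-average of order-alpha tilted rows of W with respect to q."""
    tilted = W**alpha * q[None, :]**(1 - alpha)        # unnormalized tilts, shape (|X|, |Y|)
    tilted = tilted / tilted.sum(axis=1, keepdims=True)  # normalize each row
    return p @ tilted

def augustin_mean(p, W, alpha, tol=1e-12, max_iter=1000):
    """Fixed-point iteration q <- T_alpha(q) for the Augustin mean, 0 < alpha < 1."""
    q = p @ W  # output marginal as a natural starting point
    for _ in range(max_iter):
        q_next = augustin_operator(q, p, W, alpha)
        if np.max(np.abs(q_next - q)) < tol:
            return q_next
        q = q_next
    return q

W = np.array([[0.9, 0.1], [0.2, 0.8]])  # toy channel
p = np.array([0.5, 0.5])                # toy prior
q_star = augustin_mean(p, W, alpha=0.5)
```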
3.2 Quantum Iterative Methods
For quantum channels, algorithms follow a similar fixed-point structure: the iterate $\sigma_{k+1}$ is obtained by applying a quantum Augustin operator that averages order-$\alpha$ tilted versions of the states $\rho_x$ with respect to $\sigma_k$. Per-iteration and initialization costs are dominated by eigendecompositions of $d \times d$ operators across the inputs, where $n$ is the number of inputs and $d$ the state dimension (Chu et al., 10 Feb 2025). Convergence is guaranteed at a linear rate in the Thompson metric.
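The cited algorithm's exact iteration map is given in (Chu et al., 10 Feb 2025); the sketch below only illustrates its basic computational primitive, evaluating the Petz–Rényi divergence through eigendecompositions (matrix powers and traces), and checks it against the classical Rényi divergence on commuting (diagonal) states.

```python
import numpy as np

def mpow(A, t):
    """Fractional power A**t of a PSD Hermitian matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    w = np.clip(w, 0.0, None)  # guard tiny negative eigenvalues from round-off
    return (V * w**t) @ V.conj().T

def petz_renyi_div(rho, sigma, alpha):
    """Petz-Renyi divergence D_alpha(rho||sigma) = log Tr[rho^a sigma^(1-a)] / (a - 1)."""
    val = np.trace(mpow(rho, alpha) @ mpow(sigma, 1 - alpha)).real
    return float(np.log(val) / (alpha - 1))

# On commuting states this must agree with the classical Renyi divergence
rho = np.diag([0.7, 0.3])
sigma = np.diag([0.5, 0.5])
d = petz_renyi_div(rho, sigma, 0.5)
P, Q = np.array([0.7, 0.3]), np.array([0.5, 0.5])
classical = float(np.log(np.sum(P**0.5 * Q**0.5)) / (0.5 - 1))
```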
3.3 Alternating Optimization for Augustin–Csiszár Mutual Information
An Alternating Optimization (AO) algorithm alternates between optimizing candidate output distributions and reverse channels, generalizing the Arimoto–Blahut algorithm. Each AO step involves basic renormalizations, and convergence is global under mild conditions (Kamatsuka et al., 2024).
3.4 Mirror Descent and Polyak Step
Entropic mirror descent using negative entropy as mirror map provides convergence guarantees under weak conditions even when conventional Lipschitz or Hessian bounds do not apply. A Polyak-style adaptive step size is crucial for practical efficiency and provable convergence to -optimality, extending to quantum settings (You et al., 2021, Chu et al., 10 Jan 2026).
4. Analytic Structure and Error Exponents
Analytic results show that $I_\alpha(p, W)$ and its scaled auxiliary function possess continuity and concavity properties in the order $\alpha$ and the prior $p$. These properties are instrumental for establishing minimax identities, entropic dualities, and Fenchel duality principles in error-exponent analysis:
- The strong-converse exponent in quantum channel coding obeys Sion’s minimax identity, permitting it to be attained by constant-composition codes (Cheng et al., 2018).
- Duality theorems connect source-channel tradeoffs with side information in operational exponents.
- The Augustin mean and center are essential for the van Erven–Harremoës lower bounds and cost-constrained capacities (Nakiboglu, 2018).
5. Convergence Rates and Complexity
The convergence rate and complexity of iterative algorithms for Petz–Augustin information are now rigorously established.
| Algorithm | Convergence Rate | Per-Iteration Cost | Applicable Regime |
|---|---|---|---|
| Classical fixed-point (Tsai et al., 2024) | geometric, in Hilbert's projective metric | $O(\lvert\mathcal{X}\rvert\,\lvert\mathcal{Y}\rvert)$ | $\alpha \in (0, 1)$, discrete alphabets |
| Quantum fixed-point (Chu et al., 10 Feb 2025) | linear, in the Thompson metric | eigendecompositions of $d \times d$ operators per input | Petz–Rényi divergence |
| Mirror descent, quantum (You et al., 2021, Chu et al., 10 Jan 2026) | provable convergence to $\varepsilon$-optimality (Polyak step) | matrix powers and traces | classical and quantum |
The fixed-point iterations are shown to be contractive in natural projective metrics (Hilbert’s and Thompson’s), thus enabling geometric convergence and scalability to high dimensions.
6. Operational and Coding-Theoretic Relevance
Petz–Augustin information governs the optimal error exponents in channel coding both below capacity (sphere-packing exponents) and above capacity (strong-converse exponents). The “Augustin capacity”—the Augustin information maximized over all input distributions—characterizes the random coding exponent and meets strong-converse benchmarks for classical-quantum channels (Cheng et al., 2018, Chu et al., 10 Jan 2026).
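As a concrete instance of this role, in the classical case the sphere-packing exponent admits the standard parametric form in terms of the order-$\alpha$ Augustin capacity $C_\alpha = \max_p I_\alpha(p, W)$ (see (Nakiboglu, 2018)):
$$E_{\mathrm{sp}}(R) \;=\; \sup_{\alpha \in (0,1)} \frac{1-\alpha}{\alpha}\,\big(C_\alpha - R\big),$$
and an analogous supremum over $\alpha > 1$ characterizes the strong-converse exponent at rates above capacity.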
The unique Augustin mean serves as the operational center in information radius inequalities, cost-constrained capacities, and in dual formulations of rate-distortion and channel coding with side information. The variational and interpolation methods underpinning analytic results provide the functional foundations for modern error exponent analysis in both classical and quantum information theory.
7. Practical Implementation and Extensions
Modern first-order algorithms—including fixed-point iteration, alternating minimization, and mirror descent with Polyak step size—are implementable using only matrix powers, traces, and elementary normalizations, making them well suited to large-scale and high-dimensional quantum channels. The methods are robust to initialization and avoid the stalling near singularities that projected-gradient approaches can exhibit (You et al., 2021, Chu et al., 10 Feb 2025, Chu et al., 10 Jan 2026, Kamatsuka et al., 2024).
Extensions include resource-theoretical measures, hypothesis testing, rate-distortion functions, and other quantum-information convex programs, provided local gradient boundedness is maintained. Recent work generalizes the Blahut–Arimoto mirror-descent interpretation to the full order- regime, establishing non-asymptotic and explicit complexity bounds for error-envelope computation (Chu et al., 10 Jan 2026).
References:
- (Nakiboglu, 2018) B. Nakiboğlu, "The Augustin Capacity and Center"
- (Cheng et al., 2018) H. Cheng, M. Mosonyi, "Properties of Noncommutative Renyi and Augustin Information"
- (Cheng et al., 2021) Y. Nakiboglu, H. Cheng, "On the Existence of the Augustin Mean"
- (You et al., 2021) K. Li et al., "Minimizing Quantum Renyi Divergences via Mirror Descent with Polyak Step Size"
- (Kamatsuka et al., 2024) D. Kamatsuka, M. Kazama, T. Yoshida, "Alternating Optimization Approach for Computing α-Mutual Information and α-Capacity"
- (Tsai et al., 2024) K. Kato et al., "Linear Convergence in Hilbert's Projective Metric for Computing Augustin Information and a Rényi Information Measure"
- (Chu et al., 10 Feb 2025) D. Kamatsuka et al., "A Linearly Convergent Algorithm for Computing the Petz-Augustin Information"
- (Chu et al., 10 Jan 2026) D. Kamatsuka et al., "Algorithms for Computing the Petz-Augustin Capacity"