Back-Flow of Distinguishability: Non-Markovian Insights

Updated 30 January 2026

Back-flow of distinguishability is defined as the increase in the ability to differentiate system states during open-system evolution, signaling information flow from the environment back to the system.
It is quantified using trace distance, Holevo divergence, and other contractive metrics that rigorously measure revivals in state distinguishability and correlations.
This concept underpins the study of CP-divisibility and quantum memory, and finds applications in models ranging from spin-star systems to process tensor tomography in machine learning.

Back-flow of distinguishability refers to the increase in the ability to discriminate between different system states during their evolution under an open dynamical process, interpreted as information returning from the environment to the system. It forms a foundational operational paradigm in the quantitative study of non-Markovian dynamics in quantum and classical systems, and is intimately linked to concepts of divisibility, quantum memory, and various contractive divergences. Rigorous treatments have established that revivals of distinguishability, as measured by trace distance, Holevo divergence, or other contractive metrics, serve as direct witnesses of non-Markovianity, subject to nuanced classical versus quantum distinctions.

1. Mathematical Definition and Operational Interpretation

For a finite-dimensional system evolving under a family of CPTP maps $\{\Lambda_t\}_{t\ge0}$, the trace distance between two evolved states $\rho_i(t) = \Lambda_t[\rho_i(0)]$, $i = 1,2$, at time $t$ is

$D[\rho_1(t),\rho_2(t)] = \frac{1}{2}\|\rho_1(t)-\rho_2(t)\|_1.$

The instantaneous information flux is

$\sigma(t;\rho_1,\rho_2) = \frac{d}{dt} D[\rho_1(t),\rho_2(t)].$

A positive $\sigma(t; \rho_1, \rho_2)$ for some interval signals a back-flow of distinguishability, interpreted as information flowing from the environment back to the system. The Breuer–Laine–Piilo (BLP) measure of non-Markovianity quantifies this as

$\mathcal{N}_{\rm BLP}[\Lambda] = \sup_{\rho_1(0),\rho_2(0)} \int_{\sigma>0} dt\,\sigma(t; \rho_1, \rho_2).$

A dynamical map is termed BLP-Markovian if $D[\rho_1(t),\rho_2(t)]$ never increases in time for any pair of initial states $\rho_1$, $\rho_2$; otherwise, it exhibits information backflow (Chruściński et al., 2011).
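As an illustration of the definitions above, the following minimal sketch evaluates the trace distance on a time grid for a qubit under pure dephasing and sums its positive increments. The dephasing model, the decoherence functions `oscillating` and `monotone`, and the helper names are assumptions chosen for illustration; they do not come from the cited works. Because the state pair is fixed rather than optimized, the result is only a lower bound on $\mathcal{N}_{\rm BLP}$.

```python
import numpy as np

def trace_distance(rho1, rho2):
    """D = (1/2) * ||rho1 - rho2||_1 for Hermitian inputs."""
    eigvals = np.linalg.eigvalsh(rho1 - rho2)
    return 0.5 * np.sum(np.abs(eigvals))

def dephase(rho, f):
    """Pure dephasing (assumed model): populations fixed, coherences scaled by f."""
    out = rho.astype(complex).copy()
    out[0, 1] *= f
    out[1, 0] *= np.conj(f)
    return out

def backflow(rho1, rho2, times, decoherence_fn):
    """Return D(t) on the grid and the sum of its positive increments."""
    d = np.array([
        trace_distance(dephase(rho1, decoherence_fn(t)),
                       dephase(rho2, decoherence_fn(t)))
        for t in times
    ])
    increments = np.diff(d)
    return d, float(increments[increments > 0].sum())

# Initial pair |+><+| and |-><-|, for which D(t) = |f(t)| under dephasing.
plus = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)
minus = 0.5 * np.array([[1, -1], [-1, 1]], dtype=complex)
times = np.linspace(0.0, 10.0, 2001)

# Hypothetical decoherence functions: one with revivals, one monotone.
oscillating = lambda t: np.exp(-0.2 * t) * np.cos(t)
monotone = lambda t: np.exp(-0.2 * t)

d_osc, n_osc = backflow(plus, minus, times, oscillating)
d_mono, n_mono = backflow(plus, minus, times, monotone)
print(f"back-flow with revivals: {n_osc:.4f}, monotone decay: {n_mono:.4f}")
```

With the oscillating function the trace distance vanishes at the zeros of the cosine and then partially recovers, so the summed increments are positive; for the monotone decay they are zero.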

2. Distinguishability Quantifiers and Contractivity

The operational foundation of back-flow analysis rests on contractive divergences: measures $\mathfrak{S}(\rho,\sigma)$ satisfying

Boundedness: $0 \leq \mathfrak{S}(\rho,\sigma) \leq 1$, zero if $\rho = \sigma$, unity if $\rho$ and $\sigma$ are orthogonal.
Monotonicity: $\mathfrak{S}(\Lambda[\rho], \Lambda[\sigma]) \leq \mathfrak{S}(\rho, \sigma)$ for all CPTP $\Lambda$.
Weak triangle inequalities.

Distinguishability may be quantified by trace distance, weighted Helstrom norms, Jensen–Shannon or Holevo skew divergences; each yields operationally equivalent but sometimes technically distinct variants of information back-flow.

The Jensen–Shannon divergence, for instance,

$D_{\mathrm{JS}}(p,q) = \frac{1}{2}D_{\mathrm{KL}}(p\|m) + \frac{1}{2}D_{\mathrm{KL}}(q\|m),\quad m = \frac{1}{2}(p+q),$

is contractive under stochastic maps and is bounded by 1 when logarithms are taken in base 2. The Holevo skew divergence $K_\mu$ is constructed to be normalized and contractive, capturing information in binary ensembles:

$K_\mu(\rho,\sigma) = \frac{S(\mu\rho + (1-\mu)\sigma) - \mu S(\rho) - (1-\mu)S(\sigma)}{h(\mu)},$

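where $S$ is the von Neumann entropy, $\mu \in (0,1)$ is the weight of $\rho$ in the binary ensemble, and $h(\mu)$ is the binary Shannon entropy (Smirne et al., 2022).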
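The two formulas above can be evaluated numerically. The sketch below is a minimal illustration, assuming base-2 logarithms and a qubit depolarizing channel as the test map; the function names and the example states are hypothetical choices, not taken from the cited paper. It also checks that, for commuting (diagonal) states and $\mu = 1/2$, the Holevo skew divergence coincides with the Jensen–Shannon divergence.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr[rho log2 rho], in bits; zero eigenvalues are skipped."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log2(p)))

def binary_entropy(mu):
    """h(mu) in bits, for 0 < mu < 1."""
    return float(-mu * np.log2(mu) - (1 - mu) * np.log2(1 - mu))

def holevo_skew(rho, sigma, mu=0.5):
    """K_mu(rho, sigma): Holevo quantity of the binary ensemble over h(mu)."""
    mix = mu * rho + (1 - mu) * sigma
    chi = (von_neumann_entropy(mix)
           - mu * von_neumann_entropy(rho)
           - (1 - mu) * von_neumann_entropy(sigma))
    return chi / binary_entropy(mu)

def js_divergence(p, q):
    """Classical Jensen-Shannon divergence in bits."""
    m = 0.5 * (p + q)
    def kl(a, b):
        mask = a > 0
        return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def depolarize(rho, p):
    """Qubit depolarizing channel (assumed test map): mixes rho with I/2."""
    return (1 - p) * rho + p * np.eye(2) / 2

zero = np.array([[1.0, 0.0], [0.0, 0.0]])
one = np.array([[0.0, 0.0], [0.0, 1.0]])
plus = 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]])

k_same = holevo_skew(zero, zero)             # identical states
k_orth = holevo_skew(zero, one, mu=0.3)      # orthogonal states
k_before = holevo_skew(zero, plus)
k_after = holevo_skew(depolarize(zero, 0.4), depolarize(plus, 0.4))

# Diagonal states: K_{1/2} reduces to the Jensen-Shannon divergence.
p, q = np.array([0.9, 0.1]), np.array([0.2, 0.8])
k_classical = holevo_skew(np.diag(p), np.diag(q))
js_classical = js_divergence(p, q)
```

In this example `k_same` vanishes, `k_orth` equals one, and `k_after` does not exceed `k_before`, matching the boundedness and monotonicity properties listed above.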

3. Divisibility and Back-Flow: Classical, Quantum, and Constructive Links

A central mathematical insight is the relationship between CP-divisibility (the existence of CPTP intermediate propagators $V_{t,s}$ such that $\Lambda_t = V_{t,s} \circ \Lambda_s$ for all $t \ge s \ge 0$) and the monotonicity of distinguishability.
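Whether an intermediate propagator is completely positive can be tested through the positivity of its Choi matrix. The sketch below does this for qubit pure dephasing, where $V_{t,s}$ scales the coherences by the ratio $f(t)/f(s)$ of a decoherence function. The dephasing model, the function `f`, and the helper names are illustrative assumptions rather than constructions from the cited works.

```python
import numpy as np

def choi_matrix(channel, dim):
    """Unnormalized Choi matrix: sum_ij |i><j| (x) channel(|i><j|)."""
    choi = np.zeros((dim * dim, dim * dim), dtype=complex)
    for i in range(dim):
        for j in range(dim):
            e_ij = np.zeros((dim, dim), dtype=complex)
            e_ij[i, j] = 1.0
            choi += np.kron(e_ij, channel(e_ij))
    return choi

def dephasing_propagator(ratio):
    """Linear map that keeps populations and scales coherences by `ratio`."""
    def channel(x):
        out = x.copy()
        out[0, 1] *= ratio
        out[1, 0] *= np.conj(ratio)
        return out
    return channel

def is_cp(channel, dim=2, tol=1e-10):
    """A map is completely positive iff its Choi matrix is positive semidefinite."""
    return bool(np.linalg.eigvalsh(choi_matrix(channel, dim)).min() >= -tol)

# Hypothetical decoherence function with revivals.
f = lambda t: np.exp(-0.2 * t) * np.cos(t)

# V_{t,s} for (s, t) = (0.5, 1.0): |f| shrinks, so the propagator is CP.
cp_early = is_cp(dephasing_propagator(f(1.0) / f(0.5)))
# V_{t,s} for (s, t) = (1.5, 2.5): |f| grows, so the propagator is not CP.
cp_revival = is_cp(dephasing_propagator(f(2.5) / f(1.5)))
print(cp_early, cp_revival)
```

For this model the Choi eigenvalues are $1 \pm |f(t)/f(s)|$ and $0$, so complete positivity of $V_{t,s}$ fails exactly when $|f|$ grows between $s$ and $t$.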

For bijective maps, Bylicka–Johansson–Acín established:

A dynamical map is CP-divisible if and only if, for all pairs of equiprobable states on the $d$-dimensional system plus an ancilla of dimension $d+1$, the trace distance never increases,

$\Lambda_t \text{ CP-divisible} \iff t \mapsto D[(\mathcal{I}\otimes\Lambda_t)\rho_1, (\mathcal{I}\otimes\Lambda_t)\rho_2] \text{ is non-increasing} \ \forall \rho_1,\rho_2.$

Their constructive proof shows that whenever divisibility fails, it is possible to explicitly construct states that exhibit back-flow without even requiring entanglement between system and ancilla (Bylicka et al., 2016).

"Bound non-Markovianity" denotes indivisible maps for which no pair of system states ever increases in distinguishability, so that the Rivas–Huelga–Plenio (RHP) measure of CP-indivisibility is positive while the BLP measure vanishes: $\mathcal{N}_{\rm RHP} > 0$, $\mathcal{N}_{\rm BLP} = 0$ (Chruściński et al., 2011).

4. Quasiprobabilistic and State-Independent Witnesses

Traditional BLP-type measures require optimization over initial states and are thus computationally demanding. The quasi-stochastic framework circumvents this by representing states as real vectors $\mathbf{q}$ and CPTP maps as quasi-stochastic matrices $S^{\Lambda}$.

Monotonicity of the collision entropy,

$H_2(\mathbf{q}) = -\log(\mathbf{q}^T\mathbf{q}),$

under the map is guaranteed if

$(S^{\Lambda_t})^T S^{\Lambda_t} \leq \mathbb{I}.$

Violations directly certify information backflow without state-pair optimization. The rate

$\zeta(t) = \frac{d}{dt}[(S^{\Lambda_t})^T S^{\Lambda_t}],$

integrated over the times at which $\zeta > 0$, defines a measure $\mathcal{N}$ applicable in arbitrary dimensions. For random unitary channels on qutrits it yields necessary and sufficient criteria, expressed as explicit inequalities in the rates $\gamma_\alpha$ (Onggadinata et al., 16 Jan 2025).
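The linear-algebra content of the criterion above can be sketched in a few lines: if $S^T S \leq \mathbb{I}$, then $\mathbf{q}^T S^T S \mathbf{q} \leq \mathbf{q}^T\mathbf{q}$ for every $\mathbf{q}$, so the collision entropy cannot decrease. In the sketch below, ordinary stochastic matrices stand in for the quasi-stochastic matrices $S^{\Lambda}$; this is an assumption made for illustration and not the frame representation used in the cited work, and the matrices `S_mix` and `S_inv` are hypothetical.

```python
import numpy as np

def collision_entropy(q):
    """H_2(q) = -log(q^T q)."""
    return float(-np.log(q @ q))

def is_contractive(S, tol=1e-12):
    """True if S^T S <= I, i.e. the largest eigenvalue of S^T S is at most 1."""
    return bool(np.linalg.eigvalsh(S.T @ S).max() <= 1.0 + tol)

# Hypothetical stand-in: a doubly stochastic matrix that mixes towards uniform.
S_mix = 0.7 * np.eye(3) + 0.3 * np.full((3, 3), 1.0 / 3.0)
# Its inverse undoes the mixing and violates the criterion.
S_inv = np.linalg.inv(S_mix)

q = np.array([0.6, 0.3, 0.1])
q_mixed = S_mix @ q
q_restored = S_inv @ q_mixed

h_start = collision_entropy(q)
h_mixed = collision_entropy(q_mixed)
h_restored = collision_entropy(q_restored)
print(is_contractive(S_mix), is_contractive(S_inv))
```

Here `S_mix` satisfies the criterion and raises the collision entropy, while `S_inv` violates it and lowers the entropy again, which is the signature the state-independent witness looks for.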

5. Relation to Quantum Memory and Classical Mixtures

Not every observed back-flow of distinguishability implies genuinely "quantum memory." Classical mixtures of elementary processes, defined as those not increasing the trace distance for pairs indistinguishable in a given measurement basis, can generate transient increases of distinguishability.

Hierarchy of process classes:

Type 0: Convex mixtures of classical maps.
Type I: Block-diagonal elementary.
Type II: Diagonal elementary.

Quantum memory is detected only if the process cannot be simulated by such convex combinations, which can be witnessed operationally by the violation of suitable inequalities (e.g., $X(\rho)>1$ for the Choi state) (Banacki et al., 2020).

Examples include depolarizing channels with transient backflow due to classical memory and Pauli dynamics exhibiting strong quantum memory not decomposable into elementary maps.

6. Information Back-Flow in Machine Learning: Process Tensor Tomography

Recent work generalizes the operational paradigm to neural training processes, treating sequential optimizer interventions, batch choices, and augmentations as instruments in a multi-time process tensor. Distinguishability is measured on output distributions (e.g., softmax responses) using total variation (TV), Jensen–Shannon (JS), or Hellinger metrics.

A two-step protocol quantifies back-flow:

The distinguishability is recorded as $D_1$ after one intervention and as $D_2$ after two; if $\Delta_{\mathrm{BF}} = D_2 - D_1 > 0$, non-Markovianity in training memory is certified.

Empirical studies show positive back-flow in SGD with carried momentum and batch overlap, which collapses under a causal break (reset optimizer state). The witness is robust and model-agnostic, permitting diagnostic comparison of optimizers, schedules, and curricula (Sevetlidis et al., 23 Jan 2026).
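The arithmetic of the two-step witness can be sketched as follows, assuming total variation distance on softmax outputs and two training branches that differ in the first intervention. All probability vectors are made-up illustrative values, not results from the cited study, and the function names are hypothetical.

```python
import numpy as np

def total_variation(p, q):
    """TV distance between two discrete distributions."""
    return 0.5 * float(np.abs(p - q).sum())

def backflow_witness(p1, q1, p2, q2):
    """Return (D_1, D_2, Delta_BF) for two branches after one and two steps."""
    d1 = total_variation(p1, q1)
    d2 = total_variation(p2, q2)
    return d1, d2, d2 - d1

# Hypothetical softmax outputs of branch A (p) and branch B (q).
p_step1 = np.array([0.50, 0.30, 0.20])
q_step1 = np.array([0.45, 0.35, 0.20])

# After the second intervention, with optimizer state carried over.
p_step2 = np.array([0.60, 0.25, 0.15])
q_step2 = np.array([0.40, 0.40, 0.20])

# After the second intervention, with a causal break (optimizer state reset).
p_reset = np.array([0.50, 0.30, 0.20])
q_reset = np.array([0.48, 0.32, 0.20])

d1, d2, delta_bf = backflow_witness(p_step1, q_step1, p_step2, q_step2)
_, d2_reset, delta_reset = backflow_witness(p_step1, q_step1, p_reset, q_reset)
print(f"Delta_BF carried: {delta_bf:+.3f}, after reset: {delta_reset:+.3f}")
```

In these illustrative numbers the carried-state branch gives a positive $\Delta_{\mathrm{BF}}$ and the reset branch a negative one, mirroring the qualitative pattern described above.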

7. Bounds and Physical Interpretation

A general quantitative bound for revivals of distinguishability asserts

$\Delta_S\mathfrak{S}(t,s) \leq \phi(\mathfrak{S}(\rho_{SE}(s), \rho_S(s) \otimes \rho_E(s))) + \phi(\mathfrak{S}(\sigma_{SE}(s), \sigma_S(s) \otimes \sigma_E(s))) + \phi \circ \phi (\mathfrak{S}(\rho_E(s), \sigma_E(s))),$

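where $\Delta_S\mathfrak{S}(t,s) = \mathfrak{S}(\rho_S(t),\sigma_S(t)) - \mathfrak{S}(\rho_S(s),\sigma_S(s))$ is the revival between times $s$ and $t \ge s$. In other words, any revival must be "paid for in advance" by information stored outside the system, in the environment or in system-environment correlations (Smirne et al., 2022).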

Exactly solvable models (spin-star, Jaynes–Cummings) demonstrate this quantitatively: every measure of non-Markovianity based on back-flow of distinguishability is upper-bounded by the information held outside the system.

This unifies the operational interpretation with rigorous mathematical structure, making the equivalence between non-Markovianity and information backflow precise.

Key References:

Breuer–Laine–Piilo measure and divisibility relation (Chruściński et al., 2011).
Constructive equivalence for bijective CP-divisible maps (Bylicka et al., 2016).
Quasiprobability-based state-independent witness (Onggadinata et al., 16 Jan 2025).
Classical versus quantum memory structures (Banacki et al., 2020).
Holevo skew and generalized divergence bounds (Smirne et al., 2022).
Process tensor tomography in machine learning (Sevetlidis et al., 23 Jan 2026).