Conditional Similarity-Sensitive Entropy
- Conditional similarity-sensitive entropy is an extension of classical entropy that integrates a user-specified similarity kernel to capture fuzzy similarities in the state space.
- The framework recovers standard Shannon entropy for trivial kernels while unveiling new behaviors, such as the failure of conventional conditional monotonicity with fuzzy kernels.
- It underpins rigorous coarse-graining and data-processing inequalities, offering practical insights into the impact of kernel choices on entropy measures.
Conditional similarity-sensitive entropy generalizes classical conditional entropy to situations where a user-specified similarity structure is imposed on the state space. Formally, it is defined within the framework of a kernelled probability space $(X, \Sigma, \mu, K)$, which incorporates a symmetric, measurable similarity kernel $K$ on a base space, yielding an entropy functional that encodes not only uncertainty but also the similarity geometry prescribed by $K$. The conditional form, $H_K(X \mid Y)$, measures the average unpredictability of $X$ given $Y$, as perceived through $K$. This framework recovers standard Shannon entropy and mutual information for trivial (identity or partition) kernels, while enabling new behaviors when $K$ admits partial similarities (“fuzzy” kernels), including possible failure of classical inequalities such as conditional monotonicity. This construction supports rigorous data-processing inequalities and coarse-graining rules via the law-induced kernel formalism, as developed in the measure-theoretic setting by Leinster, Roff, and Miller (Miller, 6 Jan 2026).
1. Kernelled Probability Spaces and Similarity-Sensitive Entropy
Let $(X, \Sigma, \mu)$ be a probability space. A similarity kernel is a symmetric, measurable function $K : X \times X \to [0, 1]$ satisfying $K(x, x) = 1$ for all $x$ and the symmetry $K(x, y) = K(y, x)$. The “typicality” function is
$$(K\mu)(x) = \int_X K(x, y) \, d\mu(y),$$
which must satisfy $(K\mu)(x) > 0$ for $\mu$-almost every $x$. The quadruple $(X, \Sigma, \mu, K)$ is called a kernelled probability space.
Similarity-sensitive entropy is then defined as
$$H_K(\mu) = -\int_X \log (K\mu)(x) \, d\mu(x),$$
which, in the finite-state case (with pmf $p$), reduces to
$$H_K(p) = -\sum_i p_i \log (Kp)_i,$$
where $(Kp)_i = \sum_j K_{ij} p_j$.
If $K$ dominates another kernel $K'$ pointwise ($K \ge K'$ $\mu \otimes \mu$-almost everywhere), then $H_K(\mu) \le H_{K'}(\mu)$. This kernel monotonicity reflects the effect of similarity softening: increasing perceived similarity can only decrease entropy.
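In the finite-state case these quantities are directly computable. The following minimal sketch (NumPy; the fuzzy kernel entries are an illustrative choice, not taken from the source) computes $H_K(p)$, checks the Shannon reduction for the identity kernel, and illustrates kernel monotonicity:

```python
import numpy as np

def ss_entropy(K, p):
    """Similarity-sensitive entropy H_K(p) = -sum_i p_i log (Kp)_i,
    where (Kp)_i = sum_j K[i, j] p[j] is the typicality of state i."""
    p = np.asarray(p, dtype=float)
    kp = K @ p
    mask = p > 0                      # states with zero mass contribute nothing
    return -np.sum(p[mask] * np.log(kp[mask]))

p = np.array([0.5, 0.25, 0.25])

# Identity kernel: typicality (Kp)_i = p_i, so H_K is Shannon entropy.
H_identity = ss_entropy(np.eye(3), p)
H_shannon = -np.sum(p * np.log(p))
assert np.isclose(H_identity, H_shannon)

# A fuzzy kernel dominating the identity entrywise (illustrative values):
K_fuzzy = np.array([[1.0, 0.6, 0.1],
                    [0.6, 1.0, 0.1],
                    [0.1, 0.1, 1.0]])

# Kernel monotonicity: K >= K' pointwise implies H_K <= H_{K'}.
assert ss_entropy(K_fuzzy, p) <= H_identity
```

Since `K_fuzzy` dominates the identity kernel entrywise, its entropy is bounded above by the Shannon value, as the final assertion checks.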
2. Conditional Similarity-Sensitive Entropy and Mutual Information
Given a joint law $\pi$ on $X \times Y$, define the conditional law of $X$ given $Y = y$ as $\mu_{X \mid Y = y}$. For each $y$, the conditional typicality is
$$(K \mu_{X \mid Y = y})(x) = \int_X K(x, x') \, d\mu_{X \mid Y = y}(x').$$
The pointwise conditional entropy is
$$H_K(X \mid Y = y) = -\int_X \log (K \mu_{X \mid Y = y})(x) \, d\mu_{X \mid Y = y}(x).$$
The averaged conditional entropy is
$$H_K(X \mid Y) = \int_Y H_K(X \mid Y = y) \, d\mu_Y(y),$$
while the associated mutual information is
$$I_K(X; Y) = H_K(X) - H_K(X \mid Y).$$
For finite $X$, $Y$, and kernel $K$, these definitions reduce to sums as in standard discrete information theory.
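In that finite setting, the conditional quantities reduce to sums over the columns of the joint pmf. A minimal sketch, with an illustrative kernel and joint law (hypothetical values, not from the source):

```python
import numpy as np

def ss_entropy(K, p):
    """H_K(p) = -sum_i p_i log (Kp)_i."""
    kp = K @ p
    mask = p > 0
    return -np.sum(p[mask] * np.log(kp[mask]))

def conditional_ss_entropy(K, joint):
    """H_K(X|Y) = sum_y P(Y=y) H_K(mu_{X|Y=y}); joint[i, j] = P(X=i, Y=j).
    The kernel K lives on the X-alphabet only."""
    p_y = joint.sum(axis=0)
    return sum(p_y[j] * ss_entropy(K, joint[:, j] / p_y[j])
               for j in range(joint.shape[1]) if p_y[j] > 0)

def ss_mutual_information(K, joint):
    """I_K(X;Y) = H_K(X) - H_K(X|Y)."""
    return ss_entropy(K, joint.sum(axis=1)) - conditional_ss_entropy(K, joint)

K = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

# Sanity check: if X and Y are independent, every conditional law equals
# the marginal, so I_K(X;Y) = 0 for any kernel K.
joint_indep = np.outer([0.2, 0.3, 0.5], [0.4, 0.6])
assert np.isclose(ss_mutual_information(K, joint_indep), 0.0)
```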
3. Coarse-Graining, Induced Kernels, and Data-Processing
Given a measurable map $f : X \to Y$ with pushforward law $\nu = f_* \mu$, define the law-induced kernel $K_f$ on $Y$ by
$$K_f(y, y') = \int_X \int_X K(x, x') \, d\mu_y(x) \, d\mu_{y'}(x'),$$
where $(\mu_y)_{y \in Y}$ is a disintegration of $\mu$ along $f$. The pullback $(f \times f)^* K_f$ is the conditional expectation of $K$ given $f \times f$, and comparing $H_K(\mu)$ with the entropy of the pullback (via Jensen's inequality applied to the logarithm) yields the coarse-graining inequality below.
The similarity-sensitive coarse-graining inequality is
$$H_{K_f}(f_* \mu) \le H_K(\mu),$$
and more generally, a data-processing inequality holds for channels realized via Markov kernels: passing $X$ through a channel, the entropy computed with the induced law and induced kernel cannot exceed $H_K(\mu)$. For the conditional case, if $Z = f(X)$ for a measurable map $f$, then
$$H_{K_f}(Z \mid Y) \le H_K(X \mid Y),$$
with $K_f$ the induced kernel on the codomain of $f$.
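For finite spaces, the disintegration is just the family of conditional pmfs over the fibers of $f$, and the induced kernel is a fiber-averaged matrix, so the coarse-graining inequality can be verified directly. A sketch under illustrative (hypothetical) choices of $f$, $p$, and $K$:

```python
import numpy as np

def ss_entropy(K, p):
    """H_K(p) = -sum_i p_i log (Kp)_i."""
    kp = K @ p
    mask = p > 0
    return -np.sum(p[mask] * np.log(kp[mask]))

# Illustrative coarse-graining f: {0,1,2,3} -> {0,1}, collapsing {0,1} and {2,3}.
f = np.array([0, 0, 1, 1])
p = np.array([0.4, 0.1, 0.3, 0.2])
K = np.array([[1.0, 0.8, 0.2, 0.1],
              [0.8, 1.0, 0.3, 0.2],
              [0.2, 0.3, 1.0, 0.7],
              [0.1, 0.2, 0.7, 1.0]])

# Pushforward law nu = f_* p on {0, 1}.
nu = np.array([p[f == b].sum() for b in range(2)])

# Disintegration: mu_b = conditional law of x given f(x) = b, stored as rows.
mu = np.zeros((2, 4))
for b in range(2):
    mu[b, f == b] = p[f == b] / nu[b]

# Law-induced kernel K_f(b, b') = sum_{x, x'} mu_b[x] K[x, x'] mu_{b'}[x'].
K_f = mu @ K @ mu.T

# Coarse-graining inequality: H_{K_f}(f_* p) <= H_K(p).
assert ss_entropy(K_f, nu) <= ss_entropy(K, p) + 1e-12
```

Note that `K_f` need not have unit diagonal: the self-similarity of a coarse state is the average similarity within its fiber.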
4. Reduction to Classical Entropy for Trivial Kernels
For the identity (delta) kernel, $K(x, x') = \mathbf{1}\{x = x'\}$, one recovers Shannon entropy: $H_K(X) = H(X)$, $H_K(X \mid Y) = H(X \mid Y)$, and $I_K(X; Y) = I(X; Y)$. For partition kernels, where $K(x, x') = \mathbf{1}\{x \sim x'\}$ encodes block structure via an equivalence relation $\sim$, the entropy and mutual information coincide with those of the induced coarse variable $[X]$, the equivalence class of $X$. Formally, $H_K(X) = H([X])$ and $I_K(X; Y) = I([X]; Y)$. Monotonicity is recovered: $H_K(X \mid Y) \le H_K(X)$ and $I_K(X; Y) \ge 0$.
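The partition-kernel reduction can be checked numerically: under a block kernel the typicality of a state is the total mass of its block, so $H_K$ coincides with the Shannon entropy of the block variable. A minimal sketch with an illustrative two-block partition:

```python
import numpy as np

# Partition kernel on {0,1,2} with blocks {0,1} and {2}: K[x,x'] = 1 iff x ~ x'.
K_part = np.array([[1.0, 1.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
p = np.array([0.2, 0.3, 0.5])

kp = K_part @ p                       # typicality = mass of the block containing x
H_K = -np.sum(p * np.log(kp))         # similarity-sensitive entropy

q = np.array([p[0] + p[1], p[2]])     # law of the coarse block variable [X]
H_blocks = -np.sum(q * np.log(q))     # Shannon entropy of the blocks

assert np.isclose(H_K, H_blocks)      # H_K(X) = H([X])
```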
5. Phenomena Unique to Fuzzy Kernels
When $K$ is fuzzy (e.g., non-block-diagonal with entries in $(0, 1)$), classical monotonicity properties can fail. For suitable spaces, kernels with intermediate similarity values, and suitable joint distributions, it is possible that
$$H_K(X \mid Y) > H_K(X).$$
This marks a key departure from the classical framework and demonstrates that, for genuinely fuzzy kernels, conditional similarity-sensitive entropy is not necessarily monotone under conditioning.
By contrast, in the strictly binary-state case, with $|X| = 2$ and off-diagonal similarity $K_{12} = K_{21} \in [0, 1)$, the entropy functional $p \mapsto H_K(p)$ is strictly concave, ensuring $H_K(X \mid Y) \le H_K(X)$ for all joint laws.
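This failure can be exhibited concretely. The three-state kernel and conditional laws below are an illustrative construction (not the source's example): states 0 and 1 are mutually dissimilar but both strongly resemble state 2, and for a fair binary $Y$ the averaged conditional entropy exceeds the marginal one:

```python
import numpy as np

def ss_entropy(K, p):
    """H_K(p) = -sum_i p_i log (Kp)_i."""
    kp = K @ p
    mask = p > 0
    return -np.sum(p[mask] * np.log(kp[mask]))

# Illustrative fuzzy kernel: 0 and 1 dissimilar, both 90% similar to 2.
K = np.array([[1.0, 0.0, 0.9],
              [0.0, 1.0, 0.9],
              [0.9, 0.9, 1.0]])

# Y is uniform on {0, 1}; the two conditional laws of X:
mu_plus  = np.array([0.49, 0.49, 0.02])   # X | Y = 0
mu_minus = np.array([0.41, 0.41, 0.18])   # X | Y = 1

marginal = 0.5 * mu_plus + 0.5 * mu_minus          # law of X: (0.45, 0.45, 0.10)
H_X = ss_entropy(K, marginal)
H_X_given_Y = 0.5 * ss_entropy(K, mu_plus) + 0.5 * ss_entropy(K, mu_minus)

# Conditioning INCREASES the similarity-sensitive entropy here,
# witnessing the non-concavity of H_K for this kernel.
assert H_X_given_Y > H_X
```

Numerically, $H_K(X) \approx 0.564$ while $H_K(X \mid Y) \approx 0.570$ (natural log), so the classical inequality fails by a small but definite margin.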
6. Structural Invariants and Isomorphism Properties
The law of the typicality random variable $(K\mu)(X)$ under $\mu$ is an isomorphism invariant of the kernelled probability space $(X, \Sigma, \mu, K)$. For example, if this law is non-atomic or admits infinitely many values, $K$ cannot be a partition kernel corresponding to finitely many equivalence classes, since a partition kernel's typicality takes only the finitely many block masses as values. This invariant plays a role in distinguishing the structural complexity of different kernelled spaces.
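The invariant is easy to inspect in the finite case: under a partition kernel with $k$ blocks, the typicality takes at most $k$ distinct values (the block masses), whereas a generic fuzzy kernel typically spreads typicality over as many values as there are states. An illustrative check (kernels chosen for the example):

```python
import numpy as np

p = np.array([0.1, 0.2, 0.3, 0.4])

# Partition kernel with blocks {0,1} and {2,3}: typicality is the block mass.
K_part = np.array([[1, 1, 0, 0],
                   [1, 1, 0, 0],
                   [0, 0, 1, 1],
                   [0, 0, 1, 1]], dtype=float)
typ_part = K_part @ p
assert len(np.unique(np.round(typ_part, 12))) == 2   # only the two block masses

# Generic fuzzy kernel: typicality takes a distinct value per state here.
K_fuzzy = np.array([[1.0, 0.5, 0.2, 0.1],
                    [0.5, 1.0, 0.4, 0.3],
                    [0.2, 0.4, 1.0, 0.6],
                    [0.1, 0.3, 0.6, 1.0]])
typ_fuzzy = K_fuzzy @ p
assert len(np.unique(np.round(typ_fuzzy, 12))) == 4
```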
7. Summary and Connections
Conditional similarity-sensitive entropy is defined by disintegrating the law of $X$ given $Y$, computing a pointwise entropy relative to the original similarity kernel $K$, and averaging over $Y$. The framework supports universal coarse-graining and data-processing inequalities via the law-induced kernel. For identity or partition kernels, classical Shannon theory is recovered exactly. For general fuzzy $K$, new phenomena such as the potential failure of conditional monotonicity arise, underscoring the subtleties introduced by non-trivial similarity structures (Miller, 6 Jan 2026).