Heterogeneous Side-Information Cardinalities in DPIC

Updated 11 February 2026

Heterogeneous side-information cardinalities arise when clients hold side-information sets of different sizes, which calls for recursion-based DPIC methods and an exact target-size security guarantee.
The protocol uses a linearly progressive set model with fixed overlaps, ensuring each client decodes exactly the required number of new messages without overshoot.
Distinct XOR-based coding strategies and increased communication overhead differentiate this approach from homogeneous DPIC, opening avenues for advanced secure coding research.

Heterogeneous side-information cardinalities arise in decentralized pliable index coding (DPIC) when clients hold side-information subsets whose sizes are not uniform across the network. This regime contrasts with the extensively studied homogeneous case and better models many real network conditions, such as time-varying data acquisition and non-uniform knowledge distributions among clients. Heterogeneity in both side-information sizes and demands leads to fundamentally different coding strategies and communication costs, especially when combined with stringent security objectives.

1. Formal Problem Formulation: Heterogeneous Cardinalities

In the secure decentralized pliable index coding framework with heterogeneous side-information, there is a library of $M$ distinct messages $\mathcal{X} = \{x_1, \dots, x_M\}$ shared among $C$ clients $\mathcal{C}_1, \dots, \mathcal{C}_C$. The $i$-th client holds a side-information set $\mathcal{I}_i \subseteq \mathcal{X}$ of cardinality $|\mathcal{I}_i| = K + (i-1)$. Under the linearly progressive sets with fixed overlap (LPS–FO) model, each $\mathcal{I}_i$ consists of consecutive messages, and adjacent clients have a fixed overlap of $P$ messages (the last $P$ of $\mathcal{I}_i$ coincide with the first $P$ of $\mathcal{I}_{i+1}$).
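The LPS–FO layout described above can be sketched as follows. This is a minimal illustration, not the authors' construction: the function name `lps_fo_sets`, the 1-based message indexing, and the choice to start the first set at message 1 are all assumptions made for the example.

```python
def lps_fo_sets(K, C, P):
    """Build C runs of consecutive message indices (1-based, an assumption).

    Client i holds K + (i - 1) messages, and the last P messages of one
    client's set are the first P messages of the next client's set.
    """
    if not 0 <= P <= K:
        raise ValueError("need 0 <= P <= K")
    sets = []
    start = 1
    for i in range(1, C + 1):
        size = K + (i - 1)
        sets.append(list(range(start, start + size)))
        # The next set begins P messages before the current one ends.
        start += size - P
    return sets


if __name__ == "__main__":
    for i, s in enumerate(lps_fo_sets(K=4, C=3, P=2), start=1):
        print(f"I_{i} = {s}")
```

With these hypothetical parameters the three sets have sizes 4, 5, and 6, and the library must contain at least as many messages as the largest index produced.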

Clients operate under a pliable demand regime: a client is satisfied not by a predetermined message but by decoding any messages previously unknown, provided the final cardinality of its knowledge meets a prescribed target $T$. The common target is set to the maximum initial side-information size plus one,

$T = |\mathcal{I}_C| + 1 = K + C$

so every client ends with $T$ messages. The decoding requirement is strict: client $\mathcal{C}_i$ must decode exactly $T - |\mathcal{I}_i| = C - (i-1)$ new messages, no more and no fewer. The exact count serves both a functional purpose (network efficiency) and an individual security objective.
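As a quick check of the arithmetic, the target and the per-client demands can be computed directly from $K$ and $C$. The helper name `target_and_demands` is hypothetical and used only for this sketch.

```python
def target_and_demands(K, C):
    """Return the common target T and the number of new messages each client needs."""
    sizes = [K + (i - 1) for i in range(1, C + 1)]  # |I_i| = K + (i - 1)
    T = sizes[-1] + 1                               # T = |I_C| + 1 = K + C
    demands = [T - size for size in sizes]          # C - (i - 1) for client i
    return T, demands


if __name__ == "__main__":
    T, demands = target_and_demands(K=4, C=3)
    print("T =", T)
    print("demands =", demands)
```

The client with the smallest side-information set has the largest demand, and the last client always needs exactly one new message.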

2. Security Constraint and Target-Size Guarantee

Security in this heterogeneous DPIC context mandates that no client acquires more than the target $T$ messages; equivalently, no client may decode more than its assigned $C-(i-1)$ new messages. This so-called “target-size” individual security constraint is enforced throughout the protocol so that by the conclusion of the protocol, each client has precisely $T$ messages and cannot infer any additional information beyond those messages. Unlike classical information-theoretic secrecy (which might aim to obscure knowledge from external or colluding adversaries), the constraint here is per-client and deterministic.

This security criterion critically interacts with the heterogeneity. For instance, a client with a smaller initial side-information set must acquire more new messages, whereas those whose initial side-information is larger need fewer. The protocol must ensure that no client learns even a single extra message beyond these prescribed increments, across all recursion levels.

3. Decentralized Transmission Scheme for Heterogeneous Side-Information

In the absence of a central server, clients collaboratively broadcast coded packets using a recursive algorithm. At each recursion level $\ell$, a block of $C^\ell$ active clients attempts to augment their side-information to achieve $T$; the number of clients served in the current round, $r_{\max}^\ell$, is selected as the unique integer satisfying

$\frac{(r_{\max}^\ell-2)(r_{\max}^\ell-1)}{2} < C^\ell \leq \frac{r_{\max}^\ell(r_{\max}^\ell-1)}{2}.$

The first $r_{\max}^\ell$ clients (in a canonical ordering) are served with $C^\ell$ XOR-coded transmissions (possibly three-wise XORs in special sub-cases); the remaining $C^\ell-r_{\max}^\ell$ clients neither contribute nor gain additional information in this round. The procedure recurses, renumbering the non-served clients for the subsequent level.
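The selection rule for $r_{\max}^\ell$ amounts to finding the smallest integer $r$ whose triangular-type bound $r(r-1)/2$ reaches the block size. A minimal sketch, assuming a simple linear search and a hypothetical function name `r_max`:

```python
def r_max(block_size):
    """Smallest r with (r-2)(r-1)/2 < block_size <= r(r-1)/2."""
    if block_size < 1:
        raise ValueError("block size must be at least 1")
    r = 2
    # r(r-1)/2 grows with r, so the first r that reaches block_size is the
    # unique integer satisfying both inequalities.
    while r * (r - 1) // 2 < block_size:
        r += 1
    return r


if __name__ == "__main__":
    for c in range(1, 12):
        print(f"block size {c:2d} -> r_max = {r_max(c)}")
```

For block sizes 1 and 2 the inequality returns a value larger than the block itself, which appears consistent with those sizes being handled as separate base cases in the cost recursion.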

Each coded transmission is an XOR of two or (occasionally) three messages, strategically chosen from overlapping and unique portions of client side-information sets. For example, a packet may be of the form

$W = I_{F_1}^k \oplus I_{L_1}^k \quad \text{or} \quad W = I_{F_1}^k \oplus I_{U_j}^k,$

where $I_{F_1}^k$ (first in leading overlap), $I_{L_1}^k$ (first in trailing overlap), and $I_{U_j}^k$ (the $j$-th unique message for $\mathcal{C}_k$) are message indices determined by the LPS–FO structure.

Only targeted clients—those with the exact required overlap—can resolve these XORs to recover a new message. The partitioning and coding design at every recursion level guarantee that security is maintained, strictly preventing non-targeted clients from leveraging any transmission for additional knowledge.
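The decoding mechanism behind these XOR packets can be illustrated in isolation. The sketch below is not a packet from the actual scheme: the message indices, payloads, client sets, and helper names (`xor_bytes`, `encode`, `try_decode`) are assumptions chosen only to show that a client recovers a new message exactly when it already knows all but one of the XORed messages.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length payloads."""
    if len(a) != len(b):
        raise ValueError("payloads must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def encode(parts, library):
    """XOR together the payloads of the messages listed in parts."""
    packet = bytes(len(library[parts[0]]))  # all-zero payload
    for m in parts:
        packet = xor_bytes(packet, library[m])
    return packet


def try_decode(packet, parts, known):
    """Return (index, payload) of the one unknown message, or None.

    With zero unknowns the packet carries nothing new; with two or more,
    a single XOR reveals only their combination, not any single message.
    """
    missing = [m for m in parts if m not in known]
    if len(missing) != 1:
        return None
    for m in parts:
        if m in known:
            packet = xor_bytes(packet, known[m])
    return missing[0], packet


# Hypothetical library of 11 four-byte messages and three client sets.
library = {i: bytes([i]) * 4 for i in range(1, 12)}
side_info = {
    "client_1": {i: library[i] for i in range(1, 5)},   # messages 1..4
    "client_2": {i: library[i] for i in range(3, 8)},   # messages 3..7
    "client_3": {i: library[i] for i in range(6, 12)},  # messages 6..11
}
parts = (1, 5)
packet = encode(parts, library)

if __name__ == "__main__":
    for name, known in side_info.items():
        print(name, try_decode(packet, parts, known))
```

In this toy instance the third client holds neither message in the packet and therefore learns nothing from it, which is the property a secure design relies on for non-targeted clients.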

4. Communication Cost: Bounds, Recursion, and Overhead

The communication cost for this secure DPIC with heterogeneously sized side-information is fundamentally higher than the homogeneous scenario, driven by both the recursive protocol and the need to enforce exact per-client targets. Specifically, if $N(C)$ denotes the number of transmissions for $C$ clients, then

$N(C) = C + N(C - r_{\max}),$

with seeds $N(0) = 0$, $N(1) = 1$, $N(2) = 3$. The lower bound (even absent the security constraint) is $C$ transmissions, with this minimum realized only for $C \in \{3,4\}$. In general, the recursive overhead $N(C) - C$ quantifies the penalty paid for maintaining strict "no-more-than-$T$" security.

When the overlap and initial side-information cardinalities match LPS–FO model assumptions and security is not enforced, a single round suffices. Under the exact target-size constraint, however, the communication cost unrolls as the sum of steps needed to serve successively smaller blocks of clients. The total is therefore given explicitly by the recursion, in terms of $C$ and the triangular-number structure that determines $r_{\max}$.
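The recursion $N(C) = C + N(C - r_{\max})$ with the stated seeds can be evaluated numerically. The sketch below assumes that $r_{\max}$ at each level is obtained from the triangular-number inequality applied to the current block size; the function names are hypothetical.

```python
from functools import lru_cache


def r_max(block_size):
    """Smallest r with (r-2)(r-1)/2 < block_size <= r(r-1)/2."""
    r = 2
    while r * (r - 1) // 2 < block_size:
        r += 1
    return r


@lru_cache(maxsize=None)
def num_transmissions(c):
    """Evaluate N(C) = C + N(C - r_max) with seeds N(0)=0, N(1)=1, N(2)=3."""
    if c < 0:
        raise ValueError("number of clients must be non-negative")
    seeds = {0: 0, 1: 1, 2: 3}
    if c in seeds:
        return seeds[c]
    # For c >= 3 we have 3 <= r_max(c) <= c, so the argument strictly shrinks.
    return c + num_transmissions(c - r_max(c))


if __name__ == "__main__":
    print(" C  N(C)  overhead")
    for c in range(1, 11):
        n = num_transmissions(c)
        print(f"{c:2d}  {n:4d}  {n - c:8d}")
```

Under this reading, the overhead $N(C) - C$ vanishes only when the whole block is served in one round, that is, when $r_{\max} = C$, which matches the statement that the lower bound is met only for $C \in \{3, 4\}$.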

5. Comparison to Homogeneous Decentralized Pliable Index Coding

In the homogeneous DPIC model, where every client holds a side-information set of identical cardinality and seeks only a single new message, coding strategies are considerably simpler. One round of $C$ XOR-coded packets, each satisfying one client, achieves optimality both in the presence and absence of (individual) security constraints. The additive communication overhead of the heterogeneous regime has no analog in the homogeneous case: in the heterogeneous regime, flexibility (supporting variable demands matching heterogeneity) and “no-overshoot” privacy strictly increase communication complexity.

This gap emphasizes that decentralized systems with significant heterogeneity cannot inherit optimal homogeneous DPIC coding directly. Instead, new code constructions must explicitly track which clients have been served and ensure no overshoot—fundamentally changing the protocol’s design philosophy and efficiency.

6. Model Assumptions, Limitations, and Prospects for Extension

The presented protocol (Padmanabhan et al., 3 Feb 2026) and its analysis presume:

An LPS–FO side-information structure: side-information sizes increase linearly, and each pair of consecutive clients has a fixed overlap $P$.
$K \geq 2P$ to ensure each client has sufficient unique messages for the prescribed XOR coding steps.
$P \geq r_{\max} - 2$ when entering special subcases of the recursion.
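The parameter conditions listed above can be screened before running a simulation. The sketch below is deliberately conservative and rests on an assumption: it tests $P \geq r_{\max} - 2$ against the top-level $r_{\max}$ for all $C \geq 3$, even though the source requires it only in special subcases of the recursion. The function names are hypothetical.

```python
def r_max(block_size):
    """Smallest r with (r-2)(r-1)/2 < block_size <= r(r-1)/2."""
    r = 2
    while r * (r - 1) // 2 < block_size:
        r += 1
    return r


def check_assumptions(K, P, C):
    """Return a list of human-readable warnings; an empty list means no issue found."""
    warnings = []
    if K < 2 * P:
        warnings.append(f"K >= 2P violated: K={K}, P={P}")
    # Conservative assumption: r_max is largest at the top level, so
    # passing this check there covers every later, smaller block.
    if C >= 3 and P < r_max(C) - 2:
        warnings.append(
            f"P >= r_max - 2 may fail in special subcases: "
            f"P={P}, r_max={r_max(C)}"
        )
    return warnings


if __name__ == "__main__":
    print(check_assumptions(K=4, P=2, C=3))
    print(check_assumptions(K=2, P=1, C=7))
```

A parameter set that triggers the second warning is not necessarily invalid, since the condition matters only if the recursion actually enters one of the special subcases.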

Optimality is achieved only for $C \in \{3,4\}$; for larger $C$, the protocol is non-optimal but remains provably correct and secure. Security is per-client (“individual”) but does not provide information-theoretic secrecy against external or colluding adversaries.

Potential extensions include tightening the excess communication gap $N(C) - C$, designing for alternative side-information overlap patterns (e.g., arbitrary graphs, probabilistic or random set structures), and introducing block or strong security (protecting against colluding subsets or eavesdroppers). Applicability in federated learning, where heterogeneity is induced by network topology or geographical distribution rather than strict LPS–FO structure, suggests practical relevance and the need for further generalization.

7. Relation to Broader Secure DPIC Literature

In classical secure decentralized pliable index coding with homogeneous side-information (notably, circular shift models (Liu et al., 2020, Liu et al., 2020)), various results establish cases of infeasibility, tight lower and upper bounds under linear coding, and multiplicative performance gaps between centralized and decentralized models (up to a factor of three for secure regimes). However, the introduction of heterogeneous cardinalities, especially with incremental side-information sizes and strict per-client security, necessitates recursive, level-wise allocation and coding decisions not present in the homogeneous setting.

The unique challenges of heterogeneity motivate new research directions in both code design and analytical characterization for secure DPIC, especially as real distributed systems increasingly diverge from homogeneous idealizations.