Partition Method for BlockRR
- The paper introduces a principled, weight-based partition of label sets to optimize the trade-off between accuracy and privacy in label-differentially private randomized response algorithms.
- The methodology employs a weight matrix derived from private prior estimates and a tunable sharpness parameter to distinguish between majority and minority label blocks.
- Empirical results indicate improved per-class accuracy and effective noise regulation, particularly under conditions of imbalanced class distributions.
The Partition Method for BlockRR is a framework for partitioning a label set—based on prior information about label frequencies—to optimize the trade-off between accuracy and privacy in label-differentially private randomized response algorithms. It introduces a principled, weight-based partition of possible labels into “majority” and “minority” blocks, enabling distinct randomized response mechanisms on each and generalizing many previous approaches under one unified scheme (Liu et al., 3 Feb 2026). BlockRR’s partition method is crucial for balancing the injected noise across classes, especially when class distributions are non-uniform or imbalanced, and is mathematically constructed via a weight matrix that encodes prior probabilities and controls the degree of block separation.
1. Purpose and Integration with BlockRR
The central purpose of the partition method in BlockRR is to divide the label set into two subsets, $B_{\text{maj}}$ (majority) and $B_{\text{min}}$ (minority), such that high-prior labels receive standard (“diagonal”) randomized response (RR), while low-prior labels are handled using a more noise-uniformized mechanism. The partitioning ensures that the randomization preserves utility for common classes without sacrificing privacy, and prevents excessive performance degradation due to label imbalance. After partitioning, BlockRR applies block-specific randomization rules to the four possible regions of the label–privatized label Cartesian product, adapting noise to class support (Liu et al., 3 Feb 2026).
2. Construction of the Weight Matrix
The method starts from a private estimate $\hat{p}$ of the prior distribution over labels, typically obtained with an $\varepsilon_{\text{prior}}$-differentially private mechanism (e.g., the Laplace mechanism). The partition relies on a weight matrix $W$ defined entrywise as

$$W_{y y'} = \begin{cases} \hat{p}(y) & y' = y, \\ e^{-\gamma}\,\hat{p}(y') & y' \neq y, \end{cases}$$

where $\gamma > 0$ is a tunable sharpness parameter. This assignment ensures that the diagonal element $W_{yy}$ represents the direct support for label $y$, while the off-diagonal entries are exponentially downweighted versions of the prior mass of the other labels. The matrix captures both the global class balance and label locality, making it suitable for discriminating between well-supported and rare classes (Liu et al., 3 Feb 2026).
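As a concrete illustration, the private prior estimate and the weight matrix described above can be computed as follows. This is a minimal sketch: the function names and the sensitivity constant (2, for a one-label-per-user histogram) are assumptions, not taken from the paper.

```python
import numpy as np

def private_prior(counts, eps_prior, rng):
    """Laplace-mechanism estimate of the label prior (illustrative).

    Changing one example's label moves two histogram counts by 1 each,
    so the L1 sensitivity of `counts` is taken as 2 here.
    """
    noisy = counts + rng.laplace(scale=2.0 / eps_prior, size=len(counts))
    noisy = np.clip(noisy, 1e-12, None)   # keep every class strictly positive
    return noisy / noisy.sum()            # renormalize to a distribution

def weight_matrix(p_hat, gamma):
    """W[y, y'] = p_hat[y] on the diagonal, exp(-gamma) * p_hat[y'] off it."""
    K = len(p_hat)
    W = np.exp(-gamma) * np.tile(p_hat, (K, 1))  # off-diagonal template
    np.fill_diagonal(W, p_hat)                   # direct support on the diagonal
    return W
```

Larger `gamma` suppresses the off-diagonal entries more strongly, so more rows become diagonal-dominant in the partition step that follows.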
3. Mathematical Formulation of the Partition
The partition method selects $B_{\text{maj}}$ as the set of labels whose diagonal weight dominates all off-diagonal entries in its row,

$$B_{\text{maj}} = \{\, y \in \mathcal{Y} : W_{yy} > W_{yy'} \ \text{for all } y' \neq y \,\},$$

and $B_{\text{min}} = \mathcal{Y} \setminus B_{\text{maj}}$. This formalizes the notion of “majority” labels without requiring arbitrary thresholds; the split is dictated entirely by the estimated prior $\hat{p}$ and the sharpness parameter $\gamma$. A block-ID function encodes this mapping. The resulting split determines which labels receive more protective noise injection and which are granted more accurate privatization (Liu et al., 3 Feb 2026).
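The diagonal-dominance rule can be implemented directly, since a label is majority exactly when its prior exceeds the downweighted largest competing prior. A minimal sketch, with illustrative names:

```python
import numpy as np

def partition_labels(p_hat, gamma):
    """Split labels into (majority, minority) by row-wise diagonal dominance.

    Label y is 'majority' iff p_hat[y] exceeds exp(-gamma) times every
    competitor's prior, i.e. exp(-gamma) * max over the other labels.
    """
    p_hat = np.asarray(p_hat, dtype=float)
    B_maj, B_min = [], []
    for y in range(len(p_hat)):
        competitors = np.delete(p_hat, y)
        (B_maj if p_hat[y] > np.exp(-gamma) * competitors.max() else B_min).append(y)
    return B_maj, B_min
```

With `p_hat = [0.5, 0.3, 0.1, 0.1]` and `gamma = 1.0`, labels 0 and 1 are diagonal-dominant; lowering `gamma` to 0.1 leaves only label 0 in the majority block, showing how a lower sharpness parameter broadens the minority block.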
4. Blockwise Randomized Response Mechanism
Once $B_{\text{maj}}$ and the corresponding privatized-label blocks are determined, BlockRR partitions the response mechanism over the four blocks of the label–privatized-label product (majority or minority on each axis). Transition probabilities differ across these regions:
- For the (majority, majority) block, a diagonal RR is used, concentrating probability mass on the matching privatized label and spreading a uniform residual mass otherwise.
- For the other blocks, partially uniformized transition probabilities, parameterized by block-specific constants, are assigned and computed via normalization constraints ensuring a proper probability distribution.
- A distinguished subset $S$ (of size $m$) is used for minority labels to prevent oversmoothing and class collapse; the interpolation parameter $\lambda$ interpolates between conventional RR and prior-weighted RR.
These block constants emerge from normalization equations specific to each block, yielding closed-form solutions (Liu et al., 3 Feb 2026).
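The blockwise structure can be sketched as follows. This is a deliberate simplification: majority rows use standard K-ary RR, and minority rows interpolate (parameter `lam`) toward the uniform distribution rather than the paper's prior-weighted target with closed-form block constants; every name and constant here is an assumption.

```python
import numpy as np

def blockwise_transition_matrix(K, B_maj, eps, lam):
    """Row-stochastic matrix T[y, y_tilde] = Pr[output y_tilde | true y].

    Majority rows: standard K-ary randomized response.
    Minority rows: lam * (RR row) + (1 - lam) * uniform, a partially
    uniformized row.  Both endpoints satisfy eps-label DP, and so does
    every mixture: each column's max/min ratio stays <= exp(eps).
    """
    c = 1.0 / (np.exp(eps) + K - 1)
    rr = np.full((K, K), c)
    np.fill_diagonal(rr, np.exp(eps) * c)
    uniform = np.full(K, 1.0 / K)
    return np.array([rr[y] if y in B_maj
                     else lam * rr[y] + (1 - lam) * uniform
                     for y in range(K)])

def privatize(y, T, rng):
    """Draw a privatized label from the row of the true label."""
    return int(rng.choice(len(T), p=T[y]))
```

Setting `lam = 1` recovers plain RR on every row, while `lam = 0` fully uniformizes the minority rows; intermediate values give the partially uniformized regime.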
5. Algorithmic Recipe and Complexity
The partition method is implemented as follows:
- Compute the weight matrix $W$ from $\hat{p}$ and $\gamma$.
- For each label $y$, declare $y \in B_{\text{maj}}$ if $W_{yy} > W_{yy'}$ for all $y' \neq y$; otherwise assign $y$ to $B_{\text{min}}$.
- Split $\hat{\mathcal{Y}}$ (the space of output labels) into majority and minority blocks using the block-ID mapping (e.g., top-$m$ prior labels for each $y$).
- Compute the distinguished subset $S$ as the set of the $m$ highest-prior labels.
- Solve for the block transition constants by inverting a small linear system determined by the block support sizes and the privacy parameter $\varepsilon$.
- For each privatization operation, draw the privatized label $\tilde{y}$ according to the transition probabilities determined by the block containing the true label $y$.
The dominant cost is the $O(K^2)$ construction of $W$ over a label space of size $K$, plus $O(K)$ work per privatization. Solving the small linear system and sampling are negligible in comparison (Liu et al., 3 Feb 2026).
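The steps above can be combined into one end-to-end sketch. The function name, the Laplace sensitivity constant, and the fully uniformized minority rows are simplifying assumptions; the paper instead solves for partially uniformized block constants.

```python
import numpy as np

def blockrr_privatize(labels, counts, eps, eps_prior, gamma, rng):
    """End-to-end sketch of the recipe: prior -> partition -> blockwise RR."""
    K = len(counts)
    # 1. eps_prior-DP prior estimate (label histogram, L1 sensitivity 2).
    noisy = np.clip(counts + rng.laplace(scale=2.0 / eps_prior, size=K),
                    1e-12, None)
    p_hat = noisy / noisy.sum()
    # 2. Diagonal-dominance partition of the label set.
    B_maj = {y for y in range(K)
             if p_hat[y] > np.exp(-gamma) * np.delete(p_hat, y).max()}
    # 3. Blockwise transition matrix: RR rows for majority labels,
    #    fully uniformized rows for minority labels (a simplification).
    c = 1.0 / (np.exp(eps) + K - 1)
    T = np.full((K, K), 1.0 / K)
    for y in B_maj:
        T[y] = c
        T[y, y] = np.exp(eps) * c
    # 4. O(K) sampling per privatized label.
    return [int(rng.choice(K, p=T[y])) for y in labels], sorted(B_maj)
```

Building `T` costs $O(K^2)$, matching the dominant cost noted above, and each privatized draw is $O(K)$.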
6. Theoretical Guarantees and Privacy-Utility Trade-offs
BlockRR’s partitioned mechanism rigorously satisfies $\varepsilon$-label differential privacy by construction. The composition of partitioned mechanisms is also $\varepsilon$-label DP under standard parallel composition principles, provided that the data splits are disjoint (Liu et al., 3 Feb 2026).
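For a randomized-response mechanism given as a row-stochastic matrix, label DP has a simple column-wise characterization: every column's largest entry must be at most $e^{\varepsilon}$ times its smallest. A small checker (the helper name is ours), demonstrated on standard K-ary RR, which saturates the bound:

```python
import numpy as np

def satisfies_label_dp(T, eps, tol=1e-9):
    """Check eps-label DP for a randomized-response matrix T.

    Rows index true labels, columns privatized labels.  The mechanism is
    eps-label DP iff Pr[out = j | y1] <= exp(eps) * Pr[out = j | y2]
    for every column j and every pair of true labels (y1, y2).
    """
    T = np.asarray(T, dtype=float)
    return bool((T.max(axis=0) <= np.exp(eps) * T.min(axis=0) + tol).all())
```

Standard RR calibrated to a budget of 1.0 passes at that budget but fails any strictly smaller one, since its column ratios equal $e^{\varepsilon}$ exactly.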
Utility and privacy are controlled primarily by the sharpness parameter $\gamma$ and the interpolation parameter $\lambda$:
- Lower $\gamma$ broadens $B_{\text{min}}$, increasing noise on rare classes at the expense of utility.
- Varying $\lambda$ interpolates between standard RR and prior-weighted RR, allowing practitioners to tune the privacy-utility trade-off externally.
- Empirical evaluation demonstrates that in the high- and moderate-privacy regimes (small to moderate $\varepsilon$), the partition method yields strictly better test and per-class accuracy than unpartitioned methods, especially under class imbalance. In the low-privacy regime (large $\varepsilon$), the method reduces to standard RR with no further performance loss (Liu et al., 3 Feb 2026).
7. Implications and Applicability
The partition method for BlockRR unifies a wide range of label-differentially-private randomized response mechanisms within a single parameterized framework. It provides systematic control over the partitioning of label sets, adapting flexibly to the empirical distribution of labels and allowing blockwise customization of noise. This is significant in settings with heavy class imbalance or when fine-grained control over per-class accuracy is required. The partition step is efficient for moderate to large label spaces, since its quadratic cost in the number of labels is incurred only once at setup. The approach is readily extensible to structured output domains, as the only requirement is the ability to define a block-ID mapping and candidate sets for privatization.
A plausible implication is that further refinement of the weight matrix (e.g., nonlinear weighting, dependency on other statistics) could yield even more flexible or utility-preserving variants of the BlockRR partition, potentially generalizing beyond label DP to other forms of privatization or fairness constraints (Liu et al., 3 Feb 2026).