Adaptively Robust Sketches
- Adaptively robust sketches are streaming algorithms for the resettable model that use polylogarithmic space and differential privacy to resist adaptive adversaries.
- They replace traditional linear sketches with dedicated Bernoulli sampling and the Binary Tree Mechanism to ensure strong prefix-max accuracy under dynamic updates and resets.
- They enable robust and efficient approximation of various statistics such as cardinality, sum, and Bernstein statistics, supporting advanced data monitoring and unlearning applications.
Adaptively robust sketches are streaming algorithms designed for the resettable streaming model, providing polylogarithmic-space, adversary-resilient data summaries for dynamic systems with both increment and reset capabilities. These sketches address fundamental vulnerabilities of standard linear and composable sketching techniques under adaptive adversaries and enable robust approximation of a wide class of statistical queries, including (sub)linear moments and Bernstein statistics, while offering strong prefix-max accuracy guarantees and efficient memory usage (Cohen et al., 29 Jan 2026).
1. Resettable Streaming Model: Formalism and Scope
The resettable streaming model is characterized by a universe of keys $U$, each key $x \in U$ associated with a nonnegative value $v_x$, and an update stream with two operations: $\mathrm{Inc}(x, \Delta)$, incrementing $v_x$ by $\Delta \ge 0$, and $\mathrm{Reset}(x)$, setting $v_x$ to zero. While the reset operation can be generalized to predicates over the key set, single-key resets are sufficient to establish information-theoretic lower and upper bounds.
At time $t$, the vector $v^{(t)} = (v^{(t)}_x)_{x \in U}$ naturally defines a broad family of streaming statistics:
$$F_f\bigl(v^{(t)}\bigr) = \sum_{x \in U} f\bigl(v^{(t)}_x\bigr),$$
where $f$ can represent the indicator for nonzero entries (cardinality, $f(v) = \mathbf{1}[v > 0]$), the identity (sum, $f(v) = v$), or any sublinear, soft-concave “Bernstein” function.
This model is especially relevant for applications requiring fine-grained reset or deletion support, such as resource monitoring with deletions and machine unlearning.
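To make the model concrete, the following is a minimal exact (non-sketched, linear-space) reference implementation of a resettable stream. The class name `ResettableStream`, the method names, and the particular soft-capping function are illustrative assumptions, not notation from the source; it is only meant to show what a sketch is trying to approximate.

```python
import math
from collections import defaultdict


class ResettableStream:
    """Exact reference state for a resettable stream (uses linear space)."""

    def __init__(self):
        self.values = defaultdict(float)  # key -> current nonnegative value

    def inc(self, key, delta):
        """Increment the value of `key` by a nonnegative amount."""
        if delta < 0:
            raise ValueError("increments must be nonnegative")
        self.values[key] += delta

    def reset(self, key):
        """Set the value of `key` back to zero."""
        self.values.pop(key, None)

    def statistic(self, f):
        """Sum of f(value) over all keys with a positive value."""
        return sum(f(v) for v in self.values.values() if v > 0)


def cardinality(v):
    return 1.0 if v > 0 else 0.0


def identity(v):
    return v


def soft_cap(v, cap=10.0):
    # One common concave "soft cap"; the choice of cap=10 is arbitrary.
    return cap * (1.0 - math.exp(-v / cap))


stream = ResettableStream()
stream.inc("a", 3)
stream.inc("b", 5)
stream.inc("a", 2)
stream.reset("b")      # "b" no longer contributes to any statistic
stream.inc("c", 0.5)

print(stream.statistic(cardinality))  # number of active keys
print(stream.statistic(identity))     # sum of active values
print(stream.statistic(soft_cap))     # a sublinear statistic
```

After the reset, only keys `"a"` (value 5) and `"c"` (value 0.5) remain active, so the cardinality is 2 and the sum is 5.5.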
2. Vulnerabilities of Classical Sketches under Adaptive Attacks
Classical streaming sketches, including sampling-based and linear sketches, provide low-variance, unbiased estimates for these statistics in the non-adaptive (oblivious) setting but are vulnerable to adaptive adversaries. In the adaptive scenario, the adversary can issue updates based on intermediate estimates, exploiting the deterministic or information-leaking properties of the sketch’s internal randomness.
Key attacks include:
- Re-insertion Attack (for Insertion-Only and Bernoulli Sampling): The adversary inserts a key and, if the sample size increases, immediately re-inserts it, reducing the effective probability of being retained and yielding a multiplicative underestimate of cardinality.
- Sample-and-Delete Attack (with Resets): For each new key, the adversary inserts it, checks the sample, and resets if it is present. The adversary empties the sample even as the underlying set grows, leading to unbounded relative error.
- Lower Bounds for Linear and Composable Sketches: All known union-composable or linear sketches for these statistics are subject to -query universal attacks for sketches of size , and thus require -size to resist adaptive streams of length .
These vulnerabilities render oblivious or linear sketches fundamentally unsuitable for robust streaming in the resettable model.
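The sample-and-delete attack is easy to reproduce against an unprotected sampler. The sketch below is a toy simulation under stated assumptions: `NaiveBernoulliSketch` is a hypothetical, deliberately unprotected sketch that releases its exact scaled sample size, and it flips a fresh coin per insertion (adequate here because the adversary inserts each key only once).

```python
import random


class NaiveBernoulliSketch:
    """Unprotected Bernoulli-sample cardinality sketch (illustrative only)."""

    def __init__(self, p, seed=0):
        self.p = p
        self.rng = random.Random(seed)
        self.sample = set()

    def insert(self, key):
        if key not in self.sample and self.rng.random() < self.p:
            self.sample.add(key)

    def reset(self, key):
        self.sample.discard(key)

    def estimate(self):
        # Exact release of the scaled sample size leaks membership.
        return len(self.sample) / self.p


def sample_and_delete_attack(sketch, n_keys):
    """Insert fresh keys; reset exactly those that the estimate reveals as sampled."""
    active = set()
    for key in range(n_keys):
        before = sketch.estimate()
        sketch.insert(key)
        active.add(key)
        if sketch.estimate() > before:  # the estimate moved, so the key was sampled
            sketch.reset(key)
            active.discard(key)
    return len(active), sketch.estimate()


true_count, est = sample_and_delete_attack(NaiveBernoulliSketch(p=0.1, seed=1), 10_000)
print(true_count, est)  # the estimate is 0.0 while thousands of keys stay active
```

Every sampled key is reset immediately, so the sample stays empty and the estimate stays at zero, while roughly a `1 - p` fraction of the inserted keys remains active.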
3. Adaptively Robust Sketching Framework: Differential Privacy and Binary Tree Mechanism
The adaptation to adversarial streaming hinges on two core design choices: abandoning composability and linearity in favor of dedicated sampling sketches, and shielding their internal randomness with differential privacy (DP). The key privacy tool is the Binary Tree Mechanism (BTM) for continual observation.
Fixed-Rate Robust Cardinality Sketch
- Sampling paradigm: Maintain a Bernoulli sample $S_t$ in which each active key is included independently with probability $p$.
- Increment Logging: Instead of directly releasing the sample size $|S_t|$, release the increments $|S_t| - |S_{t-1}|$.
- Noisy Aggregation: Feed the increments into the BTM, releasing
with sensitivity and DP parameter .
- Estimate: the noisy sample count released by the BTM, divided by the sampling probability.
Intuitively, the protection comes from the Laplace noise: even with adaptive access to the released estimates, an adversary can shift its belief about any key’s sampled status by at most a small multiplicative factor.
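The pipeline above (sample, log increments, aggregate through the BTM, rescale) can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the source: the class names, the hash-based sampling decision, the `sensitivity` parameter (meant as a bound on how much a single key's sampling bit can influence the increment stream), and the even split of the privacy budget across tree levels. It is not a verified privacy implementation.

```python
import hashlib
import random


class BinaryTreeMechanism:
    """Noisy running sum of a stream of increments (continual release)."""

    def __init__(self, horizon, epsilon, sensitivity=1.0, seed=0):
        self.horizon = horizon
        self.levels = max(1, horizon.bit_length())  # tree height
        # Each increment lands in at most `levels` tree nodes, so the privacy
        # budget is split evenly across levels (assumed calibration).
        self.scale = sensitivity * self.levels / epsilon
        self.rng = random.Random(seed)
        self.exact = [0.0] * self.levels  # exact partial sum per level
        self.noisy = [0.0] * self.levels  # noisy counterpart per level
        self.t = 0

    def _laplace(self):
        # The difference of two Exp(1) draws is a standard Laplace variate.
        return self.scale * (self.rng.expovariate(1.0) - self.rng.expovariate(1.0))

    def add(self, x):
        """Consume one increment and return the noisy prefix sum."""
        self.t += 1
        if self.t > self.horizon:
            raise ValueError("stream longer than declared horizon")
        i = (self.t & -self.t).bit_length() - 1   # index of lowest set bit of t
        self.exact[i] = x + sum(self.exact[:i])   # merge completed subtrees
        for j in range(i):
            self.exact[j] = self.noisy[j] = 0.0
        self.noisy[i] = self.exact[i] + self._laplace()
        return sum(self.noisy[j] for j in range(self.levels) if (self.t >> j) & 1)


class FixedRateCardinalitySketch:
    """Bernoulli sample whose size is only released through the BTM."""

    def __init__(self, p, horizon, epsilon, sensitivity=1.0, salt="demo", seed=0):
        self.p = p
        self.salt = salt  # hypothetical secret salt for the sampling hash
        self.sample = set()
        self.btm = BinaryTreeMechanism(horizon, epsilon, sensitivity, seed)
        self.noisy_count = 0.0

    def _selected(self, key):
        digest = hashlib.sha256(f"{self.salt}:{key}".encode()).digest()
        return int.from_bytes(digest[:6], "big") / 2**48 < self.p

    def inc(self, key):
        delta = 0
        if key not in self.sample and self._selected(key):
            self.sample.add(key)
            delta = 1
        self.noisy_count = self.btm.add(delta)  # log the sample-size increment

    def reset(self, key):
        delta = 0
        if key in self.sample:
            self.sample.remove(key)
            delta = -1
        self.noisy_count = self.btm.add(delta)

    def estimate(self):
        return self.noisy_count / self.p


sketch = FixedRateCardinalitySketch(p=0.25, horizon=10_000, epsilon=1.0)
for key in range(4_000):
    sketch.inc(key)
for key in range(1_000):
    sketch.reset(key)
print(round(sketch.estimate()))  # noisy estimate of the 3,000 active keys
```

Every update, including those that do not change the sample, feeds an increment (possibly zero) to the tree, so the timing of releases does not itself reveal which keys were sampled.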
Error Analysis and Robustness via DP-Generalization
Standard BTM analyses yield, uniformly for all , with probability :
DP-generalization theorems ensure, even under adaptive querying, the difference between and remains tightly bounded (up to additive for all ).
By judicious parameter selection:
yields for all :
where .
Adjustable-Rate Prefix-Max Accuracy
A fixed- scheme requires foreknowledge of . To circumvent this and guarantee
at each time step, the sketch adaptively halves the sampling rate so that the sample size never exceeds a fixed budget. Subsampling and corresponding BTM updates ensure continued DP guarantees and error bounds, while maintaining polylogarithmic total space.
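The rate-halving step alone can be illustrated as below. This sketch shows only the sampling layer: the noisy BTM release is omitted, and the names `AdjustableRateSampler`, `budget`, and `salt`, as well as the use of a hashed uniform value per key, are assumptions made for the example rather than details from the source.

```python
import hashlib


def unit_hash(key, salt):
    """Map a key to a reproducible pseudo-uniform value in [0, 1)."""
    digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
    return int.from_bytes(digest[:6], "big") / 2**48


class AdjustableRateSampler:
    """Bernoulli sample whose rate is halved whenever it outgrows a budget."""

    def __init__(self, budget, salt="demo"):
        self.budget = budget
        self.salt = salt
        self.p = 1.0
        self.sample = {}  # key -> hashed uniform value

    def insert(self, key):
        u = unit_hash(key, self.salt)
        if u < self.p:
            self.sample[key] = u
        # Halve the rate and subsample until the sample fits the budget again.
        while len(self.sample) > self.budget:
            self.p /= 2.0
            self.sample = {k: v for k, v in self.sample.items() if v < self.p}

    def reset(self, key):
        self.sample.pop(key, None)

    def estimate(self):
        return len(self.sample) / self.p


sampler = AdjustableRateSampler(budget=64)
for key in range(10_000):
    sampler.insert(key)
print(len(sampler.sample), sampler.p, sampler.estimate())
```

Because a key survives a halving exactly when its hashed value falls below the new rate, the retained set is always a valid sample at the current rate, and no advance knowledge of the stream's scale is needed to keep memory bounded.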
4. Robustness for Sum and Bernstein Statistics
The framework generalizes to both sum and Bernstein statistics.
- Resettable Sum Sketch: Uses a related sampler with clipping, deterministically includes large-value keys, and applies BTM to normalized updates. The resulting sketch achieves
space for prefix-max error
with probability .
- Bernstein Statistics: For any function admitting a Lévy–Khintchine representation,
e.g., for , soft-capping, etc., a known reduction expresses as a sum plus Max-Distinct over randomized mappings. Parallel application of cardinality samplers and robust sum sketches yields overall prefix-max accuracy and space.
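For reference, the standard Lévy–Khintchine form of a Bernstein function with $f(0) = 0$, together with two textbook instances, is written out below. The symbols $a$, $\mu$, $p$, and $\tau$ are chosen for illustration and may differ from the authors' notation and normalization; the soft-capping function shown is one common choice and is assumed, not confirmed, to be the one intended.

```latex
% General form: linear drift plus a mixture of exponential saturations
f(v) \;=\; a\,v \;+\; \int_0^\infty \bigl(1 - e^{-v t}\bigr)\,\mu(dt),
\qquad a \ge 0,\quad \int_0^\infty \min(1, t)\,\mu(dt) < \infty .

% Example 1: fractional powers, 0 < p < 1
v^{p} \;=\; \frac{p}{\Gamma(1-p)} \int_0^\infty \bigl(1 - e^{-v t}\bigr)\, t^{-1-p}\,dt .

% Example 2: soft capping at threshold \tau > 0, i.e. a point mass \mu = \tau\,\delta_{1/\tau}
\tau\bigl(1 - e^{-v/\tau}\bigr) \;=\; \int_0^\infty \bigl(1 - e^{-v t}\bigr)\,\mu(dt) .
```

The first example follows from integration by parts, which gives $\int_0^\infty (1 - e^{-vt})\,t^{-1-p}\,dt = v^{p}\,\Gamma(1-p)/p$.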
5. Error Guarantees and Lower Bounds
An information-theoretic lower bound via set-disjointness precludes pure relative error with sublinear space, even in the resettable model. However, the “prefix-max” error
is simultaneously achievable in polylogarithmic space and often operationally sufficient, provided the target statistic seldom shrinks by more than a constant factor from prior maxima.
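The difference between relative error and prefix-max error can be checked numerically. The helper below assumes that the prefix-max guarantee means the absolute error at time $t$ is at most a factor `alpha` times the largest true value of the statistic seen up to time $t$; the function name and the parameter `alpha` are illustrative.

```python
def prefix_max_violations(truth, estimates, alpha):
    """Return the time steps where |estimate - truth| exceeds alpha * (max truth so far)."""
    violations = []
    running_max = 0.0
    for t, (f, est) in enumerate(zip(truth, estimates)):
        running_max = max(running_max, f)
        if abs(est - f) > alpha * running_max:
            violations.append(t)
    return violations


# The statistic rises to 100 and then collapses to 5 after many resets.
truth = [10, 100, 100, 5]
estimates = [10.5, 95, 104, 12]

print(prefix_max_violations(truth, estimates, alpha=0.1))  # → []
print(abs(estimates[-1] - truth[-1]) / truth[-1])           # relative error at the last step
```

At the last step the estimate is off by 7 against a true value of 5, a relative error of 140%, yet the prefix-max bound with `alpha = 0.1` still holds because the earlier maximum was 100. This is why the guarantee is most useful when the statistic rarely falls far below its past maximum.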
6. Technical Lemmas and Concentration under Adaptivity
Three essential technical tools underpin adaptively robust sketches:
- Binary Tree Mechanism Accuracy ([Chan–Shi–Song’11]): For all , with counters and Laplace noise per node, prefix-noisy sums satisfy
- DP-Generalization for Bernoulli Sampling: For any DP mechanism applied to Bernoulli samples, the posterior probability that a specific key is in the sample remains within of the base rate . The absolute bias in mean sample size remains .
- Concentration under Adaptivity: By expressing the sample size as ( i.i.d. Bernoulli), Freedman-style martingale analysis augmented with DP-posterior stability establishes that for all
with probability .
7. Summary of Contributions and Impact
Adaptively robust sketches provide the first polylog-space, provably adversary-resistant streaming algorithms for the resettable model, with strong prefix-max error guarantees for a large class of statistics. The blend of non-composable sampling frameworks and the binary tree mechanism for continual, differentially private release enables these robust properties, sidestepping the impossibility results for conventional sketching. These advances contribute foundational tools for adversarially robust, memory-efficient data processing in streaming applications with support for deletions and unlearning, with immediate implications for areas such as active monitoring and privacy-preserving analytics (Cohen et al., 29 Jan 2026).