Resettable Streaming Model

Updated 6 February 2026
  • Resettable Streaming Model is a framework that supports non-monotonic updates by allowing both incremental increases and resets to zero, essential for handling deletions.
  • It is vital for applications like active resource monitoring and machine unlearning, addressing challenges from adversarial updates and privacy constraints.
  • Advanced algorithmic techniques, such as the binary tree mechanism and the SAFE unlearning algorithm, yield accurate, robust statistics with provable error bounds; the former leverages differential privacy to withstand adaptive adversaries.

The resettable streaming model is a computational framework for streaming algorithms in which the value of each key in a universe can be increased (by increments) or reset to zero at arbitrary points in the input stream. This model, motivated by applications requiring support for deletion, such as active resource monitoring and machine unlearning, generalizes the standard streaming paradigm by enabling non-monotonic updates. Recent research investigates efficient, robust, and theoretically sound algorithms for estimating statistics of interest (e.g., cardinality, moments, soft-sublinear functionals) in the presence of adversarial update sequences, adaptive attacks, and privacy constraints (Cohen et al., 29 Jan 2026, Shen et al., 21 Jul 2025).

1. Formal Foundations of the Resettable Streaming Model

The resettable streaming model operates over a universe of keys $U = [n]$ (potentially infinite) and maintains at each time step $t$ a nonnegative counter vector $v^{(t)} \in \mathbb{R}_{\ge 0}^n$ representing the state of each key. The stream consists of $T$ updates, each of which is either:

  • $\mathrm{Inc}(x,\Delta)$: increment the counter of key $x$ by $\Delta \geq 0$,
  • $\mathrm{Reset}(x)$ (or, more generally, $\mathrm{Reset}_P(\cdot)$): reset the counter of key $x$, or of all keys satisfying predicate $P$, to zero.

For any function $f: \mathbb{R}_{\ge 0} \rightarrow \mathbb{R}_{\ge 0}$, the statistic of interest at time $t$ is $F_t = \sum_{x \in U} f(v^{(t)}_x)$. Examples include the cardinality ($f(v) = \mathbf{1}_{v > 0}$), sum ($f(v) = v$), sublinear moments ($f(v) = v^p$, $p \in (0,1)$), and soft-capped statistics ($f(v) = T(1-e^{-v/T})$) (Cohen et al., 29 Jan 2026). The resettable streaming model abstracts the online "forgetting" (unlearning) of data points by resetting their contributions to model updates (Shen et al., 21 Jul 2025).
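As a concrete reference point, the model above can be implemented exactly, with no sketching and no sublinear-space guarantees, in a few lines. The class and method names below are illustrative, not taken from the cited papers:

```python
from collections import defaultdict

class ResettableStream:
    """Exact (non-sketch) reference for the resettable streaming model:
    counters support Inc(x, delta) and Reset(x); F_t = sum_x f(v_x)."""

    def __init__(self, f):
        self.f = f                      # statistic f: R>=0 -> R>=0
        self.v = defaultdict(float)     # counter vector v^{(t)}

    def inc(self, x, delta):
        assert delta >= 0
        self.v[x] += delta

    def reset(self, x):
        self.v.pop(x, None)             # set the counter of x to zero

    def reset_where(self, predicate):   # Reset_P: reset all keys with P(x)
        for x in [k for k in self.v if predicate(k)]:
            self.v.pop(x)

    def statistic(self):                # F_t = sum over active keys of f(v_x)
        return sum(self.f(val) for val in self.v.values())

# Cardinality: f(v) = 1[v > 0]
s = ResettableStream(lambda v: 1.0 if v > 0 else 0.0)
s.inc("a", 2.0); s.inc("b", 1.0); s.inc("a", 3.0)
s.reset("b")
print(s.statistic())   # 1.0 -- only key "a" remains active
```

The same class computes the sum or any sublinear moment by swapping in a different `f`; the algorithmic difficulty in the papers is achieving this in small space under adversarial updates.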

2. Adversarial Robustness and Streaming Unlearning

The model's semantics make it susceptible to adaptive adversarial attacks—scenarios where adversaries exploit knowledge of intermediate outputs to bias sketch-based estimators. Two key attack paradigms are:

  • Re-insertion attack (insertion-only): The adversary inserts a key $x$, observes whether it is in the sample, and then re-inserts it to further bias sample selection, degrading accuracy.
  • Sample-and-delete attack (resettable): The adversary inserts a key, queries to see whether it was sampled, and if so immediately deletes it, thereby manipulating the statistical properties of the estimator. In the cardinality case, such manipulation can drive the estimated count to zero while the true number of active keys is $\Omega(T)$ (Cohen et al., 29 Jan 2026).
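To see why naive sampling fails, the sample-and-delete attack can be simulated against a toy Bernoulli sampler; the sampler and its estimator here are illustrative stand-ins, not the sketches from the cited work:

```python
import random

def sample_and_delete_attack(T, p=0.5, seed=0):
    """Toy demo of the sample-and-delete attack on a naive Bernoulli
    sampler: each inserted key is kept with probability p, and the
    cardinality is estimated as |sample| / p. The adversary queries
    membership after each insert and immediately resets every key
    that it observes in the sample."""
    rng = random.Random(seed)
    sample, live = set(), set()
    for x in range(T):
        live.add(x)
        if rng.random() < p:
            sample.add(x)
        if x in sample:          # adversary observes the sample...
            sample.discard(x)    # ...and resets exactly the sampled keys
            live.discard(x)
    estimate = len(sample) / p   # driven to 0 by the attack
    return estimate, len(live)

est, true_card = sample_and_delete_attack(10_000)
print(est, true_card)  # estimate 0.0 while true cardinality is ~T/2
```

The sample ends up empty regardless of `T`, so the estimator reports zero even though roughly a $(1-p)$ fraction of keys remain active, matching the $\Omega(T)$ gap described above.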

To guarantee correctness under all adaptive sequences, an algorithm is termed adaptively robust if, for all $t$,

$|\hat{F}_t - F_t| \leq \varepsilon \max_{t' \leq t} F_{t'}$

where each update may depend on previous outputs.

Unlearning methods in this streaming model treat a sequence of deletion ("forgetting") requests as inducing a nonstationary process; each deletion request changes the effective empirical distribution, challenging both statistical estimation and model update procedures. The streaming-unlearning setting formalized in (Shen et al., 21 Jul 2025) requires models to closely approximate the ideal retrained model at each timestep without ever re-accessing the full original dataset.

3. Algorithmic Frameworks: Privacy and Robustness

Recent advances leverage differential privacy (DP) and continual observation mechanisms to construct adaptively robust sketches:

  • Binary Tree Mechanism: Each update (insertion or deletion/reset) produces a $\pm 1$ unit, which is aggregated into a prefix sum via a binary tree structure; each node in the tree adds Laplace noise calibrated to the global sensitivity. This mechanism guarantees $\varepsilon_{\mathrm{dp}}$-DP under unit-level change, providing privacy and shielding the sketch's internal randomness against adaptive attacks (Cohen et al., 29 Jan 2026).
  • The sketch's output (e.g., for cardinality, sum, or more general $L_p$ moments) is derived from the tree's noisy aggregate, with accuracy guarantees that hold uniformly for all $t$ ("prefix-max error") with high probability:

$|\hat{F}_t - F_t| \leq \varepsilon \max_{t' \leq t} F_{t'}$

Total space for cardinality estimation is $O(\varepsilon^{-2} \log^{3/2} T \log(T/\delta))$; for sum estimation, $O(\varepsilon^{-2} \log^{11/2} T \log^2(1/\delta))$.
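The prefix-sum side of the mechanism can be sketched as follows. This is a simplified illustration of the classical binary tree (continual counting) mechanism, not the exact construction of (Cohen et al., 29 Jan 2026); the noise calibration shown is the textbook choice where each leaf touches $\log T$ nodes:

```python
import math, random

class BinaryTreeMechanism:
    """Continual-counting sketch: each +/-1 update lands in one leaf;
    every dyadic node stores its subtree sum plus Laplace noise, so any
    prefix sum is the sum of at most log2(T) noisy nodes."""

    def __init__(self, T, eps, seed=0):
        self.T = 1 << (T - 1).bit_length()      # round up to a power of two
        self.eps = eps
        self.rng = random.Random(seed)
        self.levels = int(math.log2(self.T)) + 1
        self.exact = {}                         # true dyadic subtree sums
        self.noisy = {}                         # noised node values (cached)
        self.t = 0

    def _laplace(self, scale):
        # inverse-CDF Laplace sampling; u is bounded away from +/-0.5
        u = self.rng.uniform(-0.4999999, 0.4999999)
        return -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

    def update(self, delta):                    # delta in {+1, -1}
        i = self.t
        self.t += 1
        for level in range(self.levels):        # all nodes covering leaf i
            node = (level, i >> level)
            self.exact[node] = self.exact.get(node, 0) + delta

    def prefix_sum(self):
        """Noisy count of all updates so far, via dyadic decomposition."""
        total, i, level = 0.0, self.t, 0
        scale = self.levels / self.eps          # each leaf touches `levels` nodes
        while i > 0:
            if i & 1:                           # node covers leaves [(i-1)<<level, i<<level)
                node = (level, i - 1)
                if node not in self.noisy:      # noise each node's value once
                    self.noisy[node] = (self.exact.get(node, 0)
                                        + self._laplace(scale))
                total += self.noisy[node]
            i >>= 1
            level += 1
        return total
```

With a very large `eps` the noise vanishes and the mechanism returns the exact running sum, which makes the dyadic bookkeeping easy to sanity-check.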

  • Streaming Unlearning as Distribution Shift (SAFE): In the context of machine unlearning, the SAFE algorithm formalizes unlearning as adapting to the distribution shift induced by removing points. Distributional ratios are tracked via incrementally updated Gaussian statistics in a random projection space and label marginal counts, maintained efficiently in the streaming setting (Shen et al., 21 Jul 2025).
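The incremental Gaussian bookkeeping behind this idea can be sketched as follows. The projection matrix and update rules are a simplified stand-in for SAFE's actual statistics, assuming (as in the streaming-unlearning setting) that the deleted minibatch itself is available at deletion time:

```python
import numpy as np

class StreamingGaussianStats:
    """Running mean/covariance of randomly projected features supporting
    both insertion of new minibatches and deletion (downdating) of
    forgotten points -- a simplified sketch of SAFE-style sufficient
    statistics. The Gaussian projection matrix P is an assumption."""

    def __init__(self, d, k, seed=0):
        rng = np.random.default_rng(seed)
        self.P = rng.standard_normal((d, k)) / np.sqrt(k)  # random projection
        self.n = 0
        self.s1 = np.zeros(k)          # running sum of projected features
        self.s2 = np.zeros((k, k))     # running sum of outer products

    def insert(self, X):
        Z = X @ self.P
        self.n += len(Z)
        self.s1 += Z.sum(axis=0)
        self.s2 += Z.T @ Z

    def delete(self, X):
        Z = X @ self.P                 # downdate using the deleted batch
        self.n -= len(Z)
        self.s1 -= Z.sum(axis=0)
        self.s2 -= Z.T @ Z

    def mean_cov(self):
        mu = self.s1 / self.n
        cov = self.s2 / self.n - np.outer(mu, mu)
        return mu, cov
```

Because mean and covariance are linear in the sums `s1` and `s2`, a deletion is an exact downdate: after deleting a batch, the statistics equal those of the remaining data, with no access to the full original dataset.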

The following table summarizes principal algorithmic components:

Component | Role | Source
--- | --- | ---
Binary Tree Mechanism | Prefix-sum aggregation with DP noise | (Cohen et al., 29 Jan 2026)
Streaming Adjustable-Rate | Cardinality/sublinear-moment sketching | (Cohen et al., 29 Jan 2026)
SAFE | Efficient streaming unlearning with regret bounds | (Shen et al., 21 Jul 2025)

4. Theoretical Guarantees and Metrics

Robust algorithms for the resettable streaming model are analyzed under adaptive adversaries and nonstationary data. The principal metrics include:

  • Prefix-max error: For all tt simultaneously, the estimation error is proportional to the largest true statistic so far.
  • Space complexity: Polylogarithmic in stream length $T$ and inverse error; e.g., $O(\varepsilon^{-2} \log^{3/2} T \log(T/\delta))$ for cardinality.
  • Dynamic regret: For streaming unlearning, the cumulative discrepancy between the actual and ideal unlearned models, quantified as

$\mathbb{E}\left[\sum_{t=1}^T R_t(w_t) - R_t(w_t^*)\right] \leq O(\sqrt{T} + V_T)$

where $V_T = \sum_{t=1}^T \|w_t^* - w_{t-1}^*\|_2$ is the cumulative variation in the optimal solutions (Shen et al., 21 Jul 2025). This rate matches the best-known nonstationary online optimization bounds even absent convexity.
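The two quantities in this bound are straightforward to compute for a concrete run; the helper names below are illustrative:

```python
import numpy as np

def path_length(w_star):
    """Cumulative variation V_T = sum_t ||w*_t - w*_{t-1}||_2 of the
    per-step optima (the sum starts from the first optimum)."""
    diffs = np.diff(w_star, axis=0)
    return np.linalg.norm(diffs, axis=1).sum()

def dynamic_regret(losses, w, w_star):
    """sum_t R_t(w_t) - R_t(w*_t) for per-step loss functions R_t."""
    return sum(R(wt) - R(ws) for R, wt, ws in zip(losses, w, w_star))

# Toy example: quadratic losses R_t(w) = ||w - c_t||^2 with drifting
# optima c_t, and a learner that never moves from the origin.
c = np.array([[0., 0.], [1., 0.], [1., 1.]])
losses = [lambda w, ct=ct: float(np.sum((w - ct) ** 2)) for ct in c]
w = [np.zeros(2)] * 3
print(path_length(c), dynamic_regret(losses, w, c))  # 2.0 3.0
```

A learner that tracked the drifting optima would pay regret proportional to $V_T$; the static learner above illustrates how regret accumulates when it does not.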

Empirical evaluation (e.g., on MNIST, CIFAR-10, TinyImageNet) confirms that adaptively robust algorithms match retrain-based gold standards in accuracy and deletion effectiveness while affording 2–9× speedups (Shen et al., 21 Jul 2025).

5. Supported Statistics and Extensions

The resettable model supports computation of a wide class of statistics:

  • Cardinality ($\ell_0$): Approximate the number of active keys with $\varepsilon$-relative accuracy, resisting adaptive attacks via DP-noised sketching and rate control.
  • Sum ($\ell_1$): Employs an entry-threshold scheme using exponential random variables, partitioning the estimate into deterministic (revealed) and bounded-error (uncertain) parts aggregated with the tree mechanism.
  • Bernstein/soft-sublinear statistics: Functions of the form $f(w) = \int_0^\infty a(t)(1-e^{-wt})\,dt$ can be handled via Laplace-transform decomposition into sum and "maxdistinct" estimators, both robustified using the framework above (Cohen et al., 29 Jan 2026).
  • Machine unlearning: The SAFE algorithm tracks class-conditional statistics and marginal label counts under a stream of deletion requests, approximating the retrain solution without access to the original training data (Shen et al., 21 Jul 2025).
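The soft-capped statistic from Section 1 makes the interpolating behavior of this Bernstein family concrete: for counters far below the cap it behaves like the sum, and far above the cap it saturates, behaving like the cap times the cardinality. A minimal numeric sketch:

```python
import math

def soft_cap(v, cap):
    """Soft-capped statistic f(v) = cap * (1 - exp(-v / cap)): close to
    v when v << cap, and saturating at cap when v >> cap."""
    return cap * (1.0 - math.exp(-v / cap))

def soft_cap_statistic(counters, cap):
    """F_t for the soft-cap f over a dict of key -> counter values."""
    return sum(soft_cap(v, cap) for v in counters.values())

counters = {"a": 0.01, "b": 1000.0}
print(soft_cap(0.01, 10))    # ~0.01 : sum-like regime
print(soft_cap(1000.0, 10))  # ~10   : cap/cardinality-like regime
```

Robust streaming algorithms never evaluate `f` on exact counters this way; the point of the Laplace-transform decomposition is to estimate the same quantity from small-space sum and maxdistinct sketches.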

6. Connections to Streaming Unlearning and Distributional Shift

The resettable streaming model provides a formal foundation for streaming approaches to machine unlearning. In these approaches, the original dataset $D_0$ defines the initial model $w_0$; successive data-removal operations yield sets $D_t$, and the goal is to track an online solution $w_t$ closely approximating the true retrained model $w_t^* = \arg\min_w \mathcal{L}(D_t, w)$. SAFE interprets these updates as a distributional shift problem and maintains sufficient statistics through efficient updates of Gaussian parameters (means, covariances) and label counts using only the information in current and deleted minibatches, with theoretical guarantees on regret and approximate unlearning (Shen et al., 21 Jul 2025).

A plausible implication is that the resettable streaming formalism will underpin future work in data privacy, model management, and robust streaming computation in adversarial and dynamic environments. The uniform, prefix-max error guarantees enabled by adaptively robust resettable sketches position the model as the default abstraction when monitoring, deletion, and data right-to-be-forgotten operations must be performed at scale in streaming settings (Cohen et al., 29 Jan 2026, Shen et al., 21 Jul 2025).
