On Subsample Size of Quantile-Based Randomized Kaczmarz

Published 21 Jul 2025 in math.NA and cs.NA | (2507.15185v1)

Abstract: Quantile-based randomized Kaczmarz (QRK) was recently introduced to efficiently solve sparsely corrupted linear systems $\mathbf{A}\mathbf{x}^* + \mathbf{\epsilon} = \mathbf{b}$ [SIAM J. Matrix Anal. Appl., 43(2), 605-637], where $\mathbf{A}\in \mathbb{R}^{m\times n}$ and $\mathbf{\epsilon}$ is an arbitrary $(\beta m)$-sparse corruption. However, all existing theoretical guarantees for QRK require quantiles to be computed using all $m$ samples (or a subsample of the same order), thus negating the computational advantage of Kaczmarz-type methods. This paper overcomes this bottleneck. We analyze a subsampling QRK, which computes quantiles from $D$ uniformly chosen samples at each iteration. Under some standard scaling assumptions on the coefficient matrix, we show that QRK with subsample size $D\ge\frac{C\log (T)}{\log(1/\beta)}$ converges linearly over the first $T$ iterations with high probability, where $C$ is some absolute constant. This subsample size is a substantial reduction from $O(m)$ in prior results. For instance, it translates into $O(\log(n))$ even if an approximation error of $\exp(-n^2)$ is desired. Intriguingly, our subsample size is also tight up to a multiplicative constant: if $D\le \frac{c\log(T)}{\log(1/\beta)}$ for some constant $c$, the error of the $T$-th iterate could be arbitrarily large with high probability. Numerical results are provided to corroborate our theory.
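To make the subsampling idea concrete, the following is a minimal Python sketch of one plausible form of subsampled QRK, not the paper's reference implementation. Several details are assumptions not stated in the abstract: rows are sampled uniformly, the quantile subsample is drawn with replacement, the quantile level `q` defaults to 0.5, and a step is accepted only when the sampled row's residual does not exceed the subsample quantile. The helper `make_corrupted_system` and all parameter values are hypothetical and exist only for illustration.

```python
import numpy as np


def subsampled_qrk(A, b, D, q=0.5, T=10000, seed=0):
    """Sketch of quantile-based randomized Kaczmarz with a size-D subsample.

    Assumptions (not taken from the abstract): uniform row sampling,
    subsample drawn with replacement, zero initial iterate.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms_sq = np.einsum("ij,ij->i", A, A)  # squared norm of each row
    x = np.zeros(n)
    for _ in range(T):
        i = rng.integers(m)            # candidate row for the update
        S = rng.integers(m, size=D)    # D uniformly chosen rows for the quantile
        threshold = np.quantile(np.abs(A[S] @ x - b[S]), q)
        r_i = A[i] @ x - b[i]
        # Skip rows whose residual looks too large to be uncorrupted.
        if abs(r_i) <= threshold:
            x -= (r_i / row_norms_sq[i]) * A[i]
    return x


def make_corrupted_system(m=2000, n=20, beta=0.05, magnitude=10.0, seed=1):
    """Hypothetical test problem: unit-norm Gaussian rows, beta*m corrupted entries."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n))
    A /= np.linalg.norm(A, axis=1, keepdims=True)
    x_star = rng.standard_normal(n)
    b = A @ x_star
    corrupted = rng.choice(m, size=int(beta * m), replace=False)
    b[corrupted] += magnitude
    return A, b, x_star
```

In this sketch each iteration touches only $D + 1$ rows instead of all $m$, which is the saving the abstract describes. A call such as `subsampled_qrk(A, b, D=30)` on the output of `make_corrupted_system()` uses a subsample far smaller than $m = 2000$; the value $D = 30$ is an illustrative guess, since the abstract does not give the constant $C$.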