
Randomized Feasibility Algorithm with Polyak Steps

Updated 30 January 2026
  • The paper introduces a randomized feasibility algorithm that replaces full projection onto intersected constraints with tractable, sampled Polyak subgradient updates.
  • It employs adaptive, parameter-free step-size strategies to achieve linear convergence in strongly convex cases and optimal sublinear rates for general convex functions.
  • Empirical evaluations on QCQP and SVM tasks demonstrate that the method maintains computational efficiency and competitive performance without extensive parameter tuning.

A randomized feasibility algorithm with Polyak steps is a class of iterative methods for constrained convex optimization where computationally tractable projections onto each individual constraint set are used instead of direct projection onto the intersection of all constraints. At each iteration, the algorithm randomly samples constraints and projects the current point towards feasibility using subgradient steps of Polyak type. Adaptive, problem-parameter-free step-size rules and sampled constraint selection enable linear or sublinear convergence rates according to the regularity of the objective function, while maintaining computational practicality when the full constraint projection is prohibitive (Chakraborty et al., 27 Jan 2026).

1. Problem Formulation and Notation

The central problem is of the form

$$\text{minimize} \quad f(x) \quad \text{subject to} \quad x \in X \cap Y,$$

where

  • $f: \mathbb{R}^n \to \mathbb{R}$ is convex (possibly strongly convex and/or smooth),
  • $X := \{x \in \mathbb{R}^n : g_i(x) \le 0,\ i = 1,\ldots,m\}$, with each $g_i$ convex,
  • $Y \subset \mathbb{R}^n$ is a simple closed convex set (such as a box or Euclidean ball).

Key notations:

  • $\|\cdot\|$ is the Euclidean norm,
  • $\Pi_Y[z]$ denotes the projection of $z$ onto $Y$,
  • $g^+(x) := \max\{0, g(x)\}$,
  • $\mathrm{dist}(x, X \cap Y) := \min_{y \in X \cap Y} \|x - y\|$.

A global error bound assumption is used: there exist $c > 0$ and a sampling distribution $\omega$ over $\{1,\ldots,m\}$ such that for all $x \in Y$,

$$\mathrm{dist}^2(x, X \cap Y) \le c\,\mathbb{E}_\omega\!\left[ (g_\omega^+(x))^2 \right].$$

2. Randomized Feasibility Algorithm with Polyak Steps

The algorithm performs a sequence of feasibility updates, each consisting of $N_k$ substeps at iteration $k$. Each feasibility substep involves:

  • Sampling a constraint index $\omega \in \{1,\ldots,m\}$ uniformly,
  • Computing a subgradient $d^i \in \partial g_\omega^+(z^{i-1})$,
  • Updating via the Polyak-type step $$z^i = \Pi_Y\!\left[ z^{i-1} - \beta\, \frac{g_\omega^+(z^{i-1})}{\|d^i\|^2}\, d^i \right],$$ where $\beta \in (0,2)$ is a fixed relaxation parameter.

After $N_k$ such substeps, $x_k = z^{N_k}$. This scheme avoids projection onto $X \cap Y$, replacing it with computationally tractable projections onto $Y$ and randomized selection of individual constraints.
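As a minimal sketch (not the paper's implementation), the feasibility phase can be written for linear constraints $g_i(x) = a_i^\top x - b_i$ with $Y$ a box; with $\beta = 1$, each Polyak substep is then an exact projection onto the sampled halfspace. The instance data, $\beta$, and the substep count below are illustrative assumptions.

```python
# One feasibility phase (N substeps) for linear constraints
# g_i(x) = a_i @ x - b_i on a box Y = [lo, hi]^n.
import numpy as np

def project_box(z, lo, hi):
    """Cheap projection onto the simple set Y (here a box)."""
    return np.clip(z, lo, hi)

def feasibility_phase(z, A, b, lo, hi, beta=1.0, N=50, rng=None):
    """Run N randomized Polyak substeps toward {x : A x <= b} ∩ Y."""
    if rng is None:
        rng = np.random.default_rng(0)
    m = A.shape[0]
    for _ in range(m and N):
        i = rng.integers(m)                   # sample a constraint index
        g_plus = max(0.0, A[i] @ z - b[i])    # violation g_i^+(z)
        if g_plus > 0.0:
            d = A[i]                          # subgradient of g_i^+ at z
            z = z - beta * g_plus / (d @ d) * d   # Polyak step
        z = project_box(z, lo, hi)            # projection onto Y
    return z

# Two halfspaces intersected with the box [-10, 10]^2.
A = np.array([[1.0, 1.0], [1.0, -1.0]])
b = np.array([1.0, 0.5])
z = feasibility_phase(np.array([8.0, 8.0]), A, b, -10.0, 10.0)
violation = np.maximum(A @ z - b, 0.0).max()
```

Because the constraints here are linear and $\beta = 1$, the phase drives the violation to (numerically) zero in a handful of substeps; for nonlinear $g_i$ the same loop applies with `g_plus` and `d` computed from the sampled constraint.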

Under the error-bound and bounded subgradient assumptions, the following hold:

  • Nonexpansiveness: for any feasible $x \in X \cap Y$, $\|x_k - x\| \le \|v_k - x\|$, where $v_k = z^0$ is the point entering the feasibility phase.
  • Geometric decrease in infeasibility: $$\mathbb{E}\!\left[ \mathrm{dist}^2(x_k, X \cap Y) \mid v_k \right] \le (1 - q)^{N_k}\, \mathrm{dist}^2(v_k, X \cap Y),$$ where $q = \beta(2-\beta)/(c M_g^2)$ and $M_g$ bounds the constraint subgradient norms.

3. Interleaved Objective Minimization and Feasibility Updates

The algorithm alternates or interleaves randomized feasibility updates with (sub)gradient steps for objective minimization. Two major cases are considered:

Strongly Convex, $L$-Smooth Objective

Assumptions:

  • $f$ has an $L$-Lipschitz gradient,
  • $f$ is $\mu$-strongly convex.

Algorithm steps:

  1. Compute $v_{k+1} = \Pi_Y[x_k - \alpha_k \nabla f(x_k)]$,
  2. Update $x_{k+1}$ by applying the randomized feasibility algorithm to $v_{k+1}$ with $N_{k+1}$ substeps.

Adaptive Polyak-type step size: $$\alpha_k = \min\left\{ \frac{1}{2(L-\mu)},\ \frac{1}{L},\ \frac{\epsilon}{2\|\nabla f(x_k)\|^2} \right\},$$ where $\epsilon$ is a prescribed accuracy.

Weighted averaging is used: $$\bar{x}_k = \frac{\sum_{t=1}^k (1 - \bar{\mu})^{k-t}\, \alpha_t\, x_t}{\sum_{s=1}^k (1 - \bar{\mu})^{k-s}\, \alpha_s}, \qquad \bar{\mu} = \min\left\{ \frac{1}{2(L-\mu)},\ \frac{1}{L},\ \frac{\epsilon}{2 M_f^2} \right\}.$$
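The interleaved scheme for the strongly convex, smooth case can be sketched on a toy instance. The quadratic $f$, the single linear constraint, the schedule $N_k = \lceil\sqrt{k}\rceil$, and the accuracy $\epsilon$ below are illustrative assumptions, and for simplicity the sketch tracks the last iterate rather than the paper's weighted average $\bar{x}_k$.

```python
# Interleaved scheme: projected gradient step with the adaptive alpha_k,
# followed by a randomized feasibility phase (here one linear constraint,
# so each Polyak substep with beta = 1 is an exact halfspace projection).
import math
import numpy as np

mu, L, eps = 1.0, 2.0, 1.0             # f(x) = 0.5(x1-3)^2 + (x2-3)^2
f = lambda x: 0.5 * (x[0] - 3.0) ** 2 + (x[1] - 3.0) ** 2
grad = lambda x: np.array([x[0] - 3.0, 2.0 * (x[1] - 3.0)])

a, cb = np.array([1.0, 1.0]), 1.0      # constraint g(x) = a @ x - cb <= 0
lo, hi = -10.0, 10.0                   # Y = box

x = np.array([5.0, 5.0])
for k in range(1, 501):
    gk = grad(x)
    # adaptive Polyak-type step; the 1/(2(L-mu)) term is dropped if L == mu
    cands = [1.0 / L, eps / (2.0 * (gk @ gk))]
    if L > mu:
        cands.append(1.0 / (2.0 * (L - mu)))
    alpha = min(cands)
    x = np.clip(x - alpha * gk, lo, hi)       # objective step + Pi_Y
    for _ in range(math.ceil(math.sqrt(k))):  # N_k feasibility substeps
        viol = max(0.0, a @ x - cb)
        if viol > 0.0:
            x = x - viol / (a @ a) * a        # Polyak step, beta = 1
        x = np.clip(x, lo, hi)

fstar = 75.0 / 9.0   # constrained optimum of this toy instance (via KKT)
gap = abs(f(x) - fstar)
```

On this instance the iterates settle on the constrained optimum $(-1/3, 4/3)$, illustrating that no knowledge beyond $L$, $\mu$, and the target accuracy enters the step-size rule.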

Convex, Possibly Nonsmooth Objective: Distance-over-Weighted-Subgradients (DoWS)

Assumptions:

  • $f$ is convex (possibly nondifferentiable),
  • $Y$ is convex and bounded with diameter $D$.

For $T$ iterations:

  • Maintain $r_k = \max\{ r_{k-1}, \|x_k - x_0\| \}$,
  • Accumulate $p_k = p_{k-1} + r_k^2 \|s_f(x_k)\|^2$, where $s_f(x_k) \in \partial f(x_k)$,
  • Set $\alpha_k = r_k^2 / \sqrt{p_k}$,
  • Compute $v_{k+1} = \Pi_Y[x_k - \alpha_k s_f(x_k)]$,
  • Randomized feasibility update as above.

A weighted-average output $\bar{x}_\tau$ is returned, with $\tau$ chosen to minimize $r_{k+1}^2 / \sum_{i=1}^k r_i^2$.
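The DoWS bookkeeping above can be sketched as follows. The nonsmooth objective $f(x) = \|x\|_1$, the single linear constraint, the initial radius guess $r_0$, and the horizon $T$ are illustrative assumptions, and for simplicity the sketch returns the last iterate rather than the weighted average $\bar{x}_\tau$.

```python
# DoWS (distance-over-weighted-subgradients) step sizes interleaved with
# a feasibility pass; alpha_k = r_k^2 / sqrt(p_k) needs no problem
# parameters beyond an initial radius guess r0 > 0.
import numpy as np

f = lambda x: np.abs(x).sum()          # nonsmooth objective ||x||_1
subgrad = lambda x: np.sign(x)         # a subgradient of ||.||_1
a, cb = np.array([-1.0, -1.0]), -1.0   # g(x) = 1 - x1 - x2 <= 0
lo, hi = -10.0, 10.0                   # Y = box

x0 = np.array([3.0, 3.0])
x, r, p = x0.copy(), 0.1, 0.0          # r0 = 0.1: initial radius guess
for k in range(2000):
    s = subgrad(x)
    r = max(r, np.linalg.norm(x - x0))     # running radius r_k
    p = p + r ** 2 * (s @ s)               # weighted subgradient mass p_k
    alpha = r ** 2 / np.sqrt(p)            # DoWS step size
    x = np.clip(x - alpha * s, lo, hi)     # objective step + Pi_Y
    viol = max(0.0, a @ x - cb)            # feasibility pass: one linear
    if viol > 0.0:                         # constraint, beta = 1, i.e. an
        x = x - viol / (a @ a) * a         # exact halfspace projection
    x = np.clip(x, lo, hi)

final_gap = f(x) - 1.0                 # f* = 1 on this toy instance
```

Note how the step size adapts: as the running radius $r_k$ grows, $\alpha_k$ grows with it, while the accumulated mass $p_k$ eventually enforces the $O(1/\sqrt{T})$-type decay.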

4. Convergence Guarantees and Theoretical Rates

Strongly Convex, Smooth Case

For adaptive stepsizes as above and exponential weighting,

$$\mathbb{E}\left[ |f(\bar{x}_k) - f(x^*)| \right] \le \epsilon$$

after $k = O(\log(1/\epsilon))$ outer iterations, provided the mean reduction in infeasibility per iteration meets a prescribed threshold (Chakraborty et al., 27 Jan 2026).

Convex, Possibly Nonsmooth Case

After $T$ iterations of DoWS with feasibility updates, the output $\bar{x}_\tau$ satisfies

$$\mathbb{E}\left[ |f(\bar{x}_\tau) - f^*| \right] \le \max\{A_1(T),\ \min\{A_2(\tau), A_3(T)\}\},$$

with $$\begin{aligned} A_1(T) &= \frac{2 D M_f}{\sqrt{T}}\left(\frac{D}{r}\right)^{\frac{2}{T}\ln(e D^2/r^2)},\\ A_2(\tau) &= D M_f \max_{1\le k\le\tau} \mathbb{E}\!\left[ (1-q)^{N_k/2} \right],\\ A_3(T) &= \frac{D M_f}{T} \left(\frac{D}{r}\right)^{\frac{2}{T}\ln(e D^2/r^2)} \sum_{k=1}^{\tau} \mathbb{E}\!\left[ (1-q)^{N_k/2} \right], \end{aligned}$$ yielding the optimal $O(1/\sqrt{T})$ rate as $T \to \infty$ up to sampling-determined terms.

For unbounded $Y$, a tamed (logarithmically adjusted) variant of the DoWS step size ensures bounded iterates and the same $O(1/\sqrt{T})$ expected error rate up to constants that grow logarithmically in $T$.

5. Sampling Distribution Regimes and Computational Properties

Performance and theoretical rates depend critically on the distribution of the number of feasibility substeps $N_k$ at each outer iteration. Common regimes:

  • Deterministic polynomial growth: $N_k = \lceil k^{1/p} \rceil$ ensures that the sum $\sum_k (1-q)^{k^{1/(2p)}}$ is uniformly bounded.
  • Poisson sampling: $N_k \sim \mathrm{Pois}(\lambda_k)$ with $\lambda_k \approx k^{1/p}$ yields $\mathbb{E}[(1-q)^{N_k/2}] = \exp(-\lambda_k(1-\sqrt{1-q}))$, which decays polynomially in $k$.
  • Binomial sampling: $N_k \sim \mathrm{Bin}(n_k, p)$ with $n_k \approx k^{1/p}$ gives similar decay properties.

Sub-polynomial growth of $N_k$ suffices to make the sampling-driven error negligible at polylogarithmic cost in total feasibility steps.
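The Poisson identity quoted above follows from the Poisson generating function $\mathbb{E}[s^N] = \exp(\lambda(s-1))$ with $s = \sqrt{1-q}$, and can be checked numerically; the values of $q$, $\lambda_k$, and $p$ below are illustrative.

```python
# Numeric check of the sampling-regime formulas: Poisson closed form
# vs. a truncated pmf series, and summability of the deterministic
# schedule N_k = ceil(k^(1/p)).
import math

def poisson_expectation(lam: float, q: float, terms: int = 200) -> float:
    """E[(1-q)^{N/2}] for N ~ Pois(lam), by truncating the pmf series."""
    s = math.sqrt(1.0 - q)
    term = total = math.exp(-lam)        # n = 0 term: pmf(0) * s^0
    for n in range(1, terms):
        term *= lam * s / n              # pmf(n)*s^n from pmf(n-1)*s^(n-1)
        total += term
    return total

q, lam = 0.3, 9.0                        # e.g. lam = k^{1/p} at k = 81, p = 2
series = poisson_expectation(lam, q)
closed_form = math.exp(-lam * (1.0 - math.sqrt(1.0 - q)))

# Deterministic schedule: the per-iteration factors (1-q)^{N_k/2} are summable.
p = 2
decay = [(1.0 - q) ** (math.ceil(k ** (1.0 / p)) / 2.0) for k in range(1, 501)]
```

The recursive update of `term` avoids overflowing `math.factorial` for large indices while summing the same series.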

6. Empirical Evaluation: QCQP and SVM Applications

Simulations were conducted on two canonical classes of problems:

Quadratically Constrained Quadratic Programming (QCQP)

The problem $$\min_{x \in [-10,10]^{10}}\ x^\top A x + b^\top x \quad \text{s.t.}\ \ x^\top C_i x + u_i^\top x - e_i^\top x \le 0, \quad i = 1,\ldots,m$$ was tested in three regimes:

  • (a) Strongly convex $A \succ 0$, known $f^*$,
  • (b) Strongly convex, unknown $f^*$,
  • (c) Convex $A \succeq 0$, unknown $f^*$.

Baselines included the Nedić et al. subgradient-projection method, the Arrow–Hurwicz and Alt-GDA primal-dual schemes, ACVI (ADMM plus log-barrier), and the CVXPY interior-point solver.

Key observations:

  • The adaptive Polyak-step algorithm achieved linear convergence in (a), requiring no prior knowledge of strong convexity or smoothness parameters.
  • DoWS and T-DoWS performed competitively in (b) and (c), attaining the expected $O(1/\sqrt{T})$ rate slope.
  • ACVI provided the fastest infeasibility decay but required expensive tuning.

Support Vector Machine (SVM) Soft-Margin Classification

For the SVM problem

$$\min_{w,b,\xi}\ \frac{1}{2}\|w\|^2 + C \sum_i \xi_i \quad \text{s.t.}\ \ 1 - \xi_i - y_i(w^\top z_i + b) \le 0, \quad \xi_i \ge 0,$$

the UCI Banknote, Breast-Cancer, and MNIST 3-vs-5 datasets were used. Only DoWS/T-DoWS and primal-dual (Arrow–Hurwicz/Alt-GDA) baselines were compared due to convexity.

Results:

  • DoWS/T-DoWS schemes reduced objective and infeasibility rapidly;
  • Test-set misclassification rates were competitive with cross-validated primal-dual methods;
  • Methods required no parameter tuning.
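The SVM constraints map directly into the feasibility framework: with $x = (w, b, \xi)$, both $1 - \xi_i - y_i(w^\top z_i + b) \le 0$ and $-\xi_i \le 0$ are linear in $x$, so each Polyak substep with $\beta = 1$ is an exact halfspace projection. The sketch below (synthetic data, iteration counts, and $Y = \mathbb{R}^n$ so that $\Pi_Y$ is the identity, all illustrative assumptions) shows only the feasibility phase, i.e. finding a point satisfying all margin constraints, without the objective steps.

```python
# Soft-margin SVM constraints as the g_i of the feasibility framework,
# driven to feasibility by randomized Polyak substeps (beta = 1).
import numpy as np

rng = np.random.default_rng(1)
Z = rng.normal(size=(20, 2))              # 20 toy points in R^2
y = np.where(Z[:, 0] + Z[:, 1] > 0, 1.0, -1.0)
n = 2 + 1 + len(Z)                        # dims of x = (w, b, xi)

# Stack every constraint as a row of "A_c @ x <= c":
# margin: -y_i z_i . w - y_i b - xi_i <= -1 ; nonnegativity: -xi_i <= 0.
A_rows, c = [], []
for i, (zi, yi) in enumerate(zip(Z, y)):
    row = np.zeros(n); row[:2] = -yi * zi; row[2] = -yi; row[3 + i] = -1.0
    A_rows.append(row); c.append(-1.0)
    row = np.zeros(n); row[3 + i] = -1.0
    A_rows.append(row); c.append(0.0)
A_c, c = np.array(A_rows), np.array(c)

x = np.zeros(n)
for _ in range(5000):                     # randomized Polyak substeps
    j = rng.integers(len(c))
    viol = A_c[j] @ x - c[j]
    if viol > 0.0:
        x = x - viol / (A_c[j] @ A_c[j]) * A_c[j]

max_violation = np.maximum(A_c @ x - c, 0.0).max()
```

Each substep touches a single data point, which is what keeps the per-iteration cost low on larger datasets; the full method would interleave these passes with (sub)gradient steps on $\frac{1}{2}\|w\|^2 + C\sum_i \xi_i$.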

7. Theoretical Significance and Practical Implications

Randomized feasibility algorithms with Polyak steps provide a rigorously justified, computation-efficient approach to large-scale constrained convex optimization where projection onto intersected constraints is intractable. Theoretical results guarantee:

  • Linear convergence to any prespecified tolerance for strongly convex, $L$-smooth $f$;
  • Optimal $O(1/\sqrt{T})$ rates in the convex, potentially nonsmooth setting;
  • Bounded sampling-driven error without demanding hyperparameter tuning or explicit knowledge of problem parameters.

Empirical results indicate practical competitiveness against state-of-the-art first-order and primal-dual methods, particularly when problem structure or scale make conventional projection approaches prohibitively costly (Chakraborty et al., 27 Jan 2026).
