GLADMM: Güler-Type Accelerated ADMM

Updated 25 November 2025
  • GLADMM is a family of operator-splitting algorithms for linearly constrained convex optimization that uses negative squared-distance terms to generate momentum.
  • It introduces Güler-type momentum on both primal and dual variables, yielding optimal total convergence and faster partial convergence compared to traditional ADMM methods.
  • The method unifies various acceleration paradigms and demonstrates empirical benefits in applications like compressive sensing and signal recovery under standard smoothness and proximability conditions.

The Güler-type Accelerated Linearized Alternating Direction Method of Multipliers (GLADMM) is a class of operator-splitting algorithms for linearly constrained composite convex optimization that deploys a distinctive acceleration framework leveraging “negative squared distance” terms in its analytical foundation. By introducing Güler-type momentum on both primal and dual variables, GLADMM achieves optimal or near-optimal convergence rates with improved leading-term descent over classical and Nesterov-style accelerated ADMM schemes. The methodology and its analysis unify and generalize many existing acceleration paradigms for first-order constrained convex optimization, offering theoretical and practical enhancements in a range of applications.

1. Problem Formulation and Algorithmic Framework

GLADMM is designed to solve structured two-block convex problems of the form
$$\min_{x\in X,\, y\in Y}\ F(x, y) := f(x) + g(y) \quad \text{subject to}\quad B\,y - A\,x = b,$$
where:

  • $X \subseteq \mathbb{R}^n$, $Y \subseteq \mathbb{R}^m$ are closed convex sets;
  • $f: \mathbb{R}^n \to \mathbb{R}$ is convex and $L$-smooth, that is, $\|\nabla f(x') - \nabla f(x)\| \leq L \|x' - x\|$;
  • $g: \mathbb{R}^m \to \mathbb{R}$ is closed, proper, convex, and typically proximable (often non-smooth);
  • $A \in \mathbb{R}^{\ell \times n}$, $B \in \mathbb{R}^{\ell \times m}$, $b \in \mathbb{R}^\ell$.
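
As a concrete instantiation (my own illustrative construction, not an example taken from the paper), $\ell_1$-regularized least squares fits this template with $f(x) = \tfrac{1}{2}\|Mx - c\|^2$, $g(y) = \mu\|y\|_1$, and the consensus constraint $y - x = 0$ (so $A = B = I$, $b = 0$):

```python
import numpy as np

# Hypothetical instantiation of the two-block template above:
# f(x) = 0.5*||M x - c||^2 (convex, L-smooth), g(y) = mu*||y||_1 (proximable),
# with constraint B y - A x = b for A = B = I, b = 0. All names and data
# here are illustrative.
rng = np.random.default_rng(0)
n = 50
M = rng.standard_normal((30, n))
c = rng.standard_normal(30)
mu = 0.1

def grad_f(x):
    """Gradient of f(x) = 0.5*||Mx - c||^2; Lipschitz with L = ||M^T M||_2."""
    return M.T @ (M @ x - c)

L = np.linalg.norm(M.T @ M, 2)  # smoothness constant of f

def prox_g(v, t):
    """prox_{t*g}(v): soft-thresholding, since g = mu*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t * mu, 0.0)
```

Here $f$ supplies the gradient used in the linearized $x$-step, and $g$ supplies the proximal map used in the $y$-step.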

Algorithmically, GLADMM proceeds by linearizing both the smooth term $f$ and the quadratic penalty/augmented term around extrapolated (“mirror-descent-style”) points, followed by primal and dual proximal-type updates and multi-sequence Güler-type extrapolations on all variables. The negative squared norm terms, such as $-\|x^k - \hat{x}^{k-1}\|^2$, that arise in the potential function analysis are directly harnessed to design the extrapolation steps.

The GLADMM iteration can be represented as follows (Zhou et al., 21 Nov 2025):

for k = 1,…,N−1
    θ_k = 2/(k+1);
    x_mdᵏ = (1−θ_k)*x_agᵏ + θ_k*ẋᵏ;
    x^{k+1} = argmin_{x∈X}   ⟨ ∇f(x_mdᵏ), x ⟩
                               + (λₖ/2) * ‖B ẏᵏ − A x − b‖²
                               + ⟨ẑᵏ, A x⟩
                               + (ηₖ/2) * ‖x − ẋᵏ‖²;
    ẋ^{k+1} = (2−α)*x^{k+1} + (α−1)*ẋᵏ;
    x_ag^{k+1} = (1−θ_k)*x_agᵏ + θ_k*x^{k+1};
    y^{k+1} = argmin_{y∈Y} g(y)  − ⟨ẑᵏ, B y⟩
                               + (τₖ/2)*‖B y − A x^{k+1} − b‖²;
    ẏ^{k+1} = (2−β)*y^{k+1} + (β−1)*ẏᵏ;
    y_ag^{k+1} = (1−θ_k)*y_agᵏ + θ_k*y^{k+1};
    z^{k+1} = ẑᵏ − γₖ*(B y^{k+1} − A x^{k+1} − b);
    ẑ^{k+1} = (1−κ)*ẑᵏ + κ*z^{k+1};
    z_ag^{k+1} = (1−θ_k)*z_agᵏ + θ_k*z^{k+1};
end
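
The loop above can be sketched as runnable code under simplifying assumptions: $A = B = I$, $b = 0$, and $X = Y = \mathbb{R}^n$, so both subproblems have closed forms (the linearized $x$-step becomes an unconstrained quadratic, and the $y$-step a soft-thresholding prox). The problem data and parameter values are illustrative, not the paper's experiments; $\kappa > 1$ is taken from the schedule section below.

```python
import numpy as np

# Minimal sketch of the GLADMM loop, specialized to A = B = I, b = 0,
# f(x) = 0.5*||M x - c||^2, g = mu*||.||_1. All data and parameter
# values are illustrative.
rng = np.random.default_rng(1)
n, m = 40, 25
M = rng.standard_normal((m, n))
c = rng.standard_normal(m)
mu, N = 0.05, 300
L = np.linalg.norm(M.T @ M, 2)            # smoothness constant of f
grad_f = lambda x: M.T @ (M @ x - c)
prox_g = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t * mu, 0.0)
F = lambda x, y: 0.5 * np.sum((M @ x - c) ** 2) + mu * np.sum(np.abs(y))

alpha, beta, kappa, xi, gamma = 0.5, 0.5, 1.5, 1.5, 1.0

x = x_hat = x_ag = np.zeros(n)
y = y_hat = y_ag = np.zeros(n)
z = z_hat = z_ag = np.zeros(n)
F0 = F(x_ag, y_ag)

for k in range(1, N):
    theta = 2.0 / (k + 1)
    lam = tau = gamma * N / k             # lambda_k = tau_k = gamma*N/k
    gam_k = (2 - xi) * gamma * k / (kappa * N)
    eta = 2 * L / (alpha * k)             # eta_k = 2L/(alpha*k)

    x_md = (1 - theta) * x_ag + theta * x_hat
    # x-step: argmin <grad f(x_md), x> + (lam/2)||y_hat - x||^2
    #         + <z_hat, x> + (eta/2)||x - x_hat||^2  (closed form for A = I)
    x_new = (eta * x_hat - grad_f(x_md) + lam * y_hat - z_hat) / (lam + eta)
    x_hat = (2 - alpha) * x_new + (alpha - 1) * x_hat
    x_ag = (1 - theta) * x_ag + theta * x_new

    # y-step: argmin g(y) - <z_hat, y> + (tau/2)||y - x_new||^2
    #       = prox_{g/tau}(x_new + z_hat/tau) for B = I
    y_new = prox_g(x_new + z_hat / tau, 1.0 / tau)
    y_hat = (2 - beta) * y_new + (beta - 1) * y_hat
    y_ag = (1 - theta) * y_ag + theta * y_new

    # dual step on the residual B y - A x - b = y - x, then extrapolation
    z_new = z_hat - gam_k * (y_new - x_new)
    z_hat = (1 - kappa) * z_hat + kappa * z_new
    z_ag = (1 - theta) * z_ag + theta * z_new
    x, y, z = x_new, y_new, z_new
```

After the loop, the ergodic pair `(x_ag, y_ag)` carries the stated rate guarantees on objective gap and feasibility.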

Choosing $(\alpha, \beta, \kappa) < 1$ is essential for turning negative-term inequalities into a constructive extrapolation mechanism; setting $(\alpha, \beta, \kappa) = 1$ collapses all hatted iterates onto the vanilla AL-ADMM, losing the principal acceleration effect (Zhou et al., 21 Nov 2025).

2. Principle of Güler-type Acceleration

Güler-type acceleration stems from leveraging the appearance of negative squared-norm terms (e.g., $-\|x^k - \hat{x}^{k-1}\|^2$) in the descent inequalities that arise in the Lyapunov or potential function analysis of first-order algorithms. Instead of discarding (or failing to utilize) these negative terms, GLADMM converts them into extrapolation (momentum) steps across the primal and dual updates for $x$, $y$, $z$. This conceptual move is an extension of Güler’s acceleration scheme for the proximal point method and can be viewed as a generalization and systematization of Nesterov’s first acceleration scheme in the context of operator splitting methods for constrained problems (Zhou et al., 21 Nov 2025, Ouyang et al., 2014).
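
For intuition, a standard one-line source of such a negative term (the textbook three-point property of the proximal map, not the paper's full analysis): the proximal step $x^{k+1} = \mathrm{prox}_{\lambda F}(x^k)$ satisfies, for every $x$,

$$
F(x^{k+1}) \;\le\; F(x) + \frac{1}{2\lambda}\Big(\|x - x^k\|^2 - \|x - x^{k+1}\|^2 - \|x^{k+1} - x^k\|^2\Big),
$$

and it is the trailing term $-\|x^{k+1} - x^k\|^2/(2\lambda)$ that Güler-type schemes re-invest as extrapolation rather than discard.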

GLADMM thereby transforms the theoretical descent artifact into practical algorithmic momentum, consistently improving the rate constants governing the leading convergence terms.

3. Convergence Properties and Complexity

GLADMM achieves “total” convergence rates, for the combined objective gap and feasibility violation, on par with the best known for its class:
$$|F(x_{\mathrm{ag}}^N, y_{\mathrm{ag}}^N) - F(x^*, y^*)| \leq O(1/N) + O(1/N^2),$$
$$\|B\, y_{\mathrm{ag}}^N - A\, x_{\mathrm{ag}}^N - b\| \leq O(1/N) + O(1/N^2).$$

However, the partial convergence rate, i.e., that of the leading term (from telescoped initial squared distances such as $\|\hat{x}^1 - x^*\|^2$), improves from the $O(1/N^{3/2})$ rate achieved by existing AL-ADMM schemes (Ouyang et al., 2014) to $O(1/N^2)$ for GLADMM (Zhou et al., 21 Nov 2025). This improvement is a direct outcome of transforming the negative “error” terms into actionable momentum through multi-sequence Güler-type extrapolation.

An explicit comparison of rates among closely related ADMM acceleration strategies is shown below:

| Method | Total Convergence Rate | Partial Term Rate | Extrapolation Style |
|---|---|---|---|
| L-ADMM | $O(1/N)$ | $O(1/N)$ | None |
| AL-ADMM (Nesterov) | $O(1/N)$ | $O(1/N^{3/2})$ | Nesterov’s (no negative-squared exploitation) |
| GLADMM (Güler-type) | $O(1/N)$ | $O(1/N^2)$ | Güler-type (negative-squared term consumption) |

GLADMM is therefore unique in attaining the optimal total rate while also delivering the fastest known partial convergence for the feasibility/primal gap (Zhou et al., 21 Nov 2025).

4. Parameter Regimes, Assumptions, and Algorithmic Details

The improved rates of GLADMM are realized via careful parameterization. Essential specifics include:

  • Momentum weight $\theta_k = 2/(k+1)$ (“mirror-descent” style);
  • Scaling sequences: $\lambda_k = \tau_k = \gamma N/k$, $\gamma_k = (2-\xi)\,\gamma k/(\kappa N)$, $\eta_k = 2L/(\alpha k)$, with $\xi \in [1.5, 2)$, $\alpha, \beta \in (0,1)$, $\kappa > 1$ (Zhou et al., 21 Nov 2025);
  • The convexity and smoothness assumptions require $f$ to be $L$-smooth and convex, $g$ proximable and convex, and both $X$ and $Y$ to be closed and convex;
  • Setting $(\alpha, \beta, \kappa) < 1$ is necessary for harnessing the negative terms as actionable momentum, a feature absent in Nesterov-only accelerations such as AL-ADMM.

The algorithm is applicable not only in bounded but also in unbounded feasible settings, provided a saddle point of the KKT system exists (Ouyang et al., 2014).
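
The schedules above can be collected in a small helper; the function name, defaults, and the free scale $\gamma$ are illustrative choices, not values fixed by the paper:

```python
# Hypothetical helper bundling the stated GLADMM schedules for a fixed
# horizon N and smoothness constant L; gamma is a free scale parameter.
def gladmm_schedules(k, N, L, gamma=1.0, alpha=0.5, kappa=1.5, xi=1.5):
    theta_k = 2.0 / (k + 1)                         # momentum weight
    lam_k = tau_k = gamma * N / k                   # penalty scalings
    gamma_k = (2.0 - xi) * gamma * k / (kappa * N)  # dual step size
    eta_k = 2.0 * L / (alpha * k)                   # prox regularization
    return theta_k, lam_k, tau_k, gamma_k, eta_k
```

Note that $\eta_k$ decays like $1/k$ while $\theta_k \approx 2/k$, so the ratio $\eta_k/\theta_k$ stays of order $L$, the usual requirement for absorbing the linearization error of $f$.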

5. Context and Comparison with Other Accelerated Splitting Schemes

GLADMM generalizes and enhances earlier acceleration strategies:

  • Standard and Nesterov-type AL-ADMM (Ouyang et al., 2015; Li et al., 2016; Ouyang et al., 2014) use Nesterov’s mirror/extrapolation but do not deploy the negative squared-distance terms arising in their Lyapunov analysis. Their partial convergence rate on feasibility is thus worse.
  • Accelerated schemes that rely on parameter adaptation (Xu, 2016), or over-relaxation/PDHG equivalences (Tan, 2016), achieve optimal total rates or, in the presence of strong convexity, can deliver linear convergence by setting the over-relaxation parameter $\theta$ appropriately.
  • Recent Güler-type or Nesterov-extrapolated linearized ADMM frameworks propose non-ergodic $O(1/k^2)$ rates under additional block strong convexity (He et al., 2023), but GLADMM achieves the $O(1/N^2)$ leading term in the composite case with only $L$-smoothness of $f$ and proximability of $g$ (no block strong convexity needed).

GLADMM also admits a concise, unified construction principle: all three Güler-type accelerations for proximal gradient (GPGM), linearized ALM (GLALM), and linearized ADMM (GLADMM) are realized under this negative-term/momentum-unification paradigm (Zhou et al., 21 Nov 2025).

6. Numerical Performance and Application Domains

Empirical results (e.g., compressive sensing, $\ell_1$-regularized logistic regression, and quadratic programming) demonstrate that GLADMM consistently outperforms both classical L-ADMM and Nesterov-accelerated AL-ADMM (ergodic and non-ergodic variants), with faster decrease in objective and reconstruction error, particularly in the non-ergodic (last-iterate) measurement (Zhou et al., 21 Nov 2025). For example, in compressive sensing image recovery (Shepp–Logan phantom, $n = 4096$, $m \approx 1230$), GLADMM achieves both rapid objective decay and improved image quality per iteration relative to all baselines.

The potential of GLADMM and its stochastic analogues extends to high-dimensional statistics, large-scale machine learning, and signal processing where efficient convergence in structured, composite settings is critical.

7. Theoretical Significance and Outlook

GLADMM establishes a sharp boundary between acceleration-by-extrapolation and acceleration-by-negative-term-consumption. It affirms that exploiting analytical artifacts (“negative” terms) through algorithmic design yields not just improved constants but fundamentally sharper asymptotic rates for principal error bounds. The technique generalizes to broader classes of first-order operator splitting, including primal-dual and mirror-descent schemes, and thus provides a template for designing future accelerated constrained optimization algorithms (Zhou et al., 21 Nov 2025, Ouyang et al., 2014, He et al., 2023, Tan, 2016).

A plausible implication is that, beyond composite convexity, negative-term-based acceleration may be adapted to settings with block-separable strong convexity or inexact/correlated update structures, offering a blueprint for both deterministic and stochastic large-scale optimization settings in statistics, machine learning, and signal recovery.
