GLADMM: Güler-Type Accelerated ADMM
- GLADMM is a family of operator-splitting algorithms for linearly constrained convex optimization that uses negative squared-distance terms to generate momentum.
- It introduces Güler-type momentum on both primal and dual variables, yielding optimal total convergence and faster partial convergence compared to traditional ADMM methods.
- The method unifies various acceleration paradigms and demonstrates empirical benefits in applications like compressive sensing and signal recovery under standard smoothness and proximability conditions.
The Güler-type Accelerated Linearized Alternating Direction Method of Multipliers (GLADMM) is a class of operator-splitting algorithms for linearly constrained composite convex optimization that deploys a distinctive acceleration framework leveraging “negative squared distance” terms in its analytical foundation. By introducing Güler-type momentum on both primal and dual variables, GLADMM achieves optimal or near-optimal convergence rates with improved leading-term descent over classical and Nesterov-style accelerated ADMM schemes. The methodology and its analysis unify and generalize many existing acceleration paradigms for first-order constrained convex optimization, offering theoretical and practical enhancements in a range of applications.
1. Problem Formulation and Algorithmic Framework
GLADMM is designed to solve structured two-block convex problems of the form

min_{x∈X, y∈Y} f(x) + g(y)   subject to   By − Ax = b,

where:
- X ⊆ ℝⁿ and Y ⊆ ℝᵐ are closed convex sets;
- f : ℝⁿ → ℝ is convex and L-smooth, i.e., ‖∇f(x) − ∇f(x′)‖ ≤ L‖x − x′‖ for all x, x′ ∈ X;
- g : ℝᵐ → ℝ ∪ {+∞} is closed, proper, convex, and typically proximable (often non-smooth);
- A ∈ ℝ^{d×n}, B ∈ ℝ^{d×m}, and b ∈ ℝ^d.
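As a concrete instance of this template, consider a lasso-style problem: f(x) = ½‖Mx − d‖² is convex and L-smooth with L = λ_max(MᵀM), and g = μ‖·‖₁ is proximable via soft-thresholding. This is only an illustrative sketch; the matrices, data, and the choice of f and g here are assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical instance of the GLADMM problem template:
#   min_{x,y}  f(x) + g(y)   s.t.  B y - A x = b
# with f(x) = 0.5*||M x - d||^2 (convex, L-smooth) and g = mu*||.||_1 (proximable).
rng = np.random.default_rng(0)
M = rng.standard_normal((8, 5))
d = rng.standard_normal(8)
mu = 0.1

def f(x):
    r = M @ x - d
    return 0.5 * r @ r

def grad_f(x):
    return M.T @ (M @ x - d)

# Smoothness constant L = largest eigenvalue of M^T M.
L = np.linalg.eigvalsh(M.T @ M).max()

def g(y):
    return mu * np.abs(y).sum()

def prox_g(v, t):
    # Proximal map of t*g: componentwise soft-thresholding at level t*mu.
    return np.sign(v) * np.maximum(np.abs(v) - t * mu, 0.0)

# Simple coupling y = x, i.e. A = B = I and b = 0 in the constraint B y - A x = b.
A = np.eye(5); B = np.eye(5); b = np.zeros(5)
```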
Algorithmically, GLADMM proceeds by linearizing both the smooth term and the quadratic penalty/augmented term around extrapolated (“mirror-descent-style”) points, followed by primal and dual proximal-type updates and multi-sequence Güler-type extrapolations on all variables. The negative squared-norm terms, such as −‖x^{k+1} − ẋᵏ‖², that arise in the potential-function analysis are directly harnessed to design the extrapolation steps.
The GLADMM iteration can be represented as follows (see (Zhou et al., 21 Nov 2025)):
```
for k = 1, …, N−1
    θ_k = 2/(k+1)
    x_mdᵏ = (1−θ_k)·x_agᵏ + θ_k·ẋᵏ
    x^{k+1} = argmin_{x∈X} ⟨∇f(x_mdᵏ), x⟩
              + (λₖ/2)·‖B ẏᵏ − A x − b‖²
              + ⟨ẑᵏ, A x⟩
              + (ηₖ/2)·‖x − ẋᵏ‖²
    ẋ^{k+1} = (2−α)·x^{k+1} + (α−1)·ẋᵏ
    x_ag^{k+1} = (1−θ_k)·x_agᵏ + θ_k·x^{k+1}
    y^{k+1} = argmin_{y∈Y} g(y) − ⟨ẑᵏ, B y⟩
              + (τₖ/2)·‖B y − A x^{k+1} − b‖²
    ẏ^{k+1} = (2−β)·y^{k+1} + (β−1)·ẏᵏ
    y_ag^{k+1} = (1−θ_k)·y_agᵏ + θ_k·y^{k+1}
    z^{k+1} = ẑᵏ − γₖ·(B y^{k+1} − A x^{k+1} − b)
    ẑ^{k+1} = (1−κ)·ẑᵏ + κ·z^{k+1}
    z_ag^{k+1} = (1−θ_k)·z_agᵏ + θ_k·z^{k+1}
end
```
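To make the loop concrete, here is a minimal runnable sketch for the instance f(x) = ½‖x − d‖², g = μ‖·‖₁, A = B = I, b = 0, in which both subproblems have closed forms (a linear solve for x and soft-thresholding for y). The constant parameter values (λ, η, τ, γ, α, β, κ) are ad-hoc illustrative choices, not the schedules prescribed in (Zhou et al., 21 Nov 2025).

```python
import numpy as np

def soft(v, t):
    # Soft-thresholding: prox of t*||.||_1.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def gladmm_lasso(dvec, mu, N=500, lam=1.0, eta=2.0, tau=1.0, gam=0.5,
                 alpha=1.1, beta=1.1, kappa=0.9):
    """Sketch of the GLADMM loop for min 0.5*||x-d||^2 + mu*||y||_1
    s.t. y - x = 0 (A = B = I, b = 0)."""
    n = dvec.size
    x = xh = xag = np.zeros(n)
    y = yh = yag = np.zeros(n)
    z = zh = zag = np.zeros(n)
    for k in range(1, N):
        th = 2.0 / (k + 1)
        xmd = (1 - th) * xag + th * xh
        gf = xmd - dvec  # gradient of f(x) = 0.5*||x - d||^2 at xmd
        # x-subproblem optimality: gf - lam*(yh - x) + zh + eta*(x - xh) = 0
        xn = (eta * xh - gf + lam * yh - zh) / (lam + eta)
        xh = (2 - alpha) * xn + (alpha - 1) * xh
        xag = (1 - th) * xag + th * xn
        # y-subproblem: prox of (mu/tau)*||.||_1 at xn + zh/tau
        yn = soft(xn + zh / tau, mu / tau)
        yh = (2 - beta) * yn + (beta - 1) * yh
        yag = (1 - th) * yag + th * yn
        # dual step with its own Gueler-type extrapolation
        zn = zh - gam * (yn - xn)
        zh = (1 - kappa) * zh + kappa * zn
        zag = (1 - th) * zag + th * zn
        x, y, z = xn, yn, zn
    return x, y, z
```

At a fixed point the updates give y = x and −∇f(x) = z ∈ μ∂‖x‖₁, which for this instance means x converges to the soft-thresholded vector soft(d, μ).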
Choosing α, β, κ ≠ 1 is essential for turning negative-term inequalities into a constructive extrapolation mechanism; setting α = β = κ = 1 collapses all hatted iterates onto the vanilla AL-ADMM iterates, losing the principal acceleration effect (Zhou et al., 21 Nov 2025).
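The collapse is immediate from the hatted recursion itself (here written as a standalone helper with hypothetical values):

```python
import numpy as np

# The hatted update from the listing: x_hat_next = (2-alpha)*x_new + (alpha-1)*x_hat.
# With alpha = 1 it reduces to x_hat_next = x_new: the extrapolation memory
# vanishes and the hatted sequence coincides with the plain iterates.
def hat_update(x_new, x_hat, alpha):
    return (2 - alpha) * x_new + (alpha - 1) * x_hat
```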
2. Principle of Güler-type Acceleration
Güler-type acceleration stems from leveraging the appearance of negative squared-norm terms (e.g., −‖x^{k+1} − ẋᵏ‖²) in the descent inequalities that arise in the Lyapunov or potential-function analysis of first-order algorithms. Instead of discarding (or failing to utilize) these negative terms, GLADMM converts them into extrapolation (momentum) steps across the primal and dual updates for x, y, and z. This conceptual move extends Güler’s acceleration scheme for the proximal point method and can be viewed as a generalization and systematization of Nesterov’s first acceleration scheme in the context of operator-splitting methods for constrained problems (Zhou et al., 21 Nov 2025, Ouyang et al., 2014).
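For intuition, here is a minimal sketch of the idea in its original habitat: Güler-type (FISTA-style) extrapolation of the proximal point method, instantiated on a quadratic f(x) = ½xᵀQx whose prox is a linear solve. The t_k weight sequence and the quadratic test function are standard illustrative choices, not taken from the GLADMM paper.

```python
import numpy as np

def prox_quad(Q, lam, v):
    # prox_{lam*f}(v) for f(x) = 0.5 * x^T Q x, via (I + lam*Q) x = v.
    return np.linalg.solve(np.eye(len(v)) + lam * Q, v)

def plain_ppm(Q, x0, lam=1.0, iters=50):
    # Unaccelerated proximal point method: O(1/k) on the objective.
    x = x0.copy()
    for _ in range(iters):
        x = prox_quad(Q, lam, x)
    return x

def accelerated_ppm(Q, x0, lam=1.0, iters=50):
    # Gueler-type acceleration: extrapolate the prox output with
    # FISTA weights t_k, improving the rate to O(1/k^2).
    x_prev = x0.copy()
    y = x0.copy()
    t = 1.0
    for _ in range(iters):
        x = prox_quad(Q, lam, y)
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x + ((t - 1.0) / t_next) * (x - x_prev)  # momentum step
        x_prev, t = x, t_next
    return x_prev
```

After k iterations the accelerated variant satisfies the standard bound f(x_k) − f* ≤ 2‖x₀ − x*‖²/(λ(k+1)²), which GLADMM's multi-sequence extrapolation transports to the linearly constrained setting.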
GLADMM thereby transforms the theoretical descent artifact into practical algorithmic momentum, consistently improving the rate constants governing the leading convergence terms.
3. Convergence Properties and Complexity
GLADMM achieves “total” convergence rates for the combined objective gap and feasibility violation that are on par with the best known for its class, of the order O(L/N²) + O(1/N) (Zhou et al., 21 Nov 2025).
However, the partial convergence rate, i.e., that of the leading term (arising from telescoped initial squared distances such as ‖x¹ − x*‖²), improves strictly over the rate achieved by existing AL-ADMM schemes (Ouyang et al., 2014) under GLADMM (Zhou et al., 21 Nov 2025). This gap is a direct outcome of transforming the negative “error” terms into actionable momentum through multi-sequence Güler-type extrapolation.
An explicit comparison of rates among closely related ADMM acceleration strategies is shown below:
| Method | Total Convergence Rate | Partial Term Rate | Extrapolation Style |
|---|---|---|---|
| L-ADMM | O(1/N) | None | None |
| AL-ADMM (Nesterov) | O(L/N²) + O(1/N) | slower leading term | Nesterov’s (no negative-squared exploitation) |
| GLADMM (Güler-type) | O(L/N²) + O(1/N) | fastest known leading term | Güler-type (negative-squared term consumption) |
GLADMM is therefore unique in attaining the optimal total rate while also delivering the fastest known partial convergence for the feasibility/primal gap (Zhou et al., 21 Nov 2025).
4. Parameter Regimes, Assumptions, and Algorithmic Details
The improved rates of GLADMM are realized via careful parameterization. Essential specifics include:
- Momentum weight θ_k = 2/(k+1) (“mirror-descent” style);
- Scaling sequences λₖ, ηₖ, τₖ, and γₖ, chosen to satisfy the coupling conditions given in (Zhou et al., 21 Nov 2025);
- The convexity and smoothness assumptions require f to be L-smooth and convex, g to be proximable and convex, and both X and Y to be closed and convex;
- Setting α, β, κ ≠ 1 is necessary for harnessing the negative terms as actionable momentum, a feature absent in Nesterov-only accelerations such as AL-ADMM.
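The effect of θ_k = 2/(k+1) on the aggregated (“ag”) sequences can be checked directly: the recursion x_ag^{k+1} = (1−θ_k)·x_agᵏ + θ_k·x^{k+1} is exactly a weighted ergodic average with linearly growing weights, so later iterates dominate. This is a standard identity; the helper names below are mine.

```python
import numpy as np

def ag_sequence(points):
    # points[j] plays the role of x^{k+1} for k = j+1; theta_1 = 1 makes
    # the first step a reset onto the first point.
    x_ag = points[0]
    for k, xk in enumerate(points[1:], start=2):
        theta = 2.0 / (k + 1)
        x_ag = (1 - theta) * x_ag + theta * xk
    return x_ag

def weighted_average(points):
    # Weighted average with weights 1, 2, ..., m over the same points.
    w = np.arange(1, len(points) + 1, dtype=float)
    return (w[:, None] * np.asarray(points)).sum(axis=0) / w.sum()
```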
The algorithm is applicable not only in bounded but also in unbounded feasible settings, provided the saddle-point of the KKT system exists (Ouyang et al., 2014).
5. Context and Comparison with Other Accelerated Splitting Schemes
GLADMM generalizes and enhances earlier acceleration strategies:
- Standard and Nesterov-type AL-ADMM (Ouyang et al., 2014; Ouyang et al., 2015; Li et al., 2016) use Nesterov-style mirror/extrapolation steps but do not deploy the negative squared-distance terms arising in their Lyapunov analysis; their partial convergence rate on feasibility is correspondingly worse.
- Accelerated schemes that rely on parameter adaptation (Xu, 2016), or over-relaxation/PDHG equivalences (Tan, 2016), achieve optimal total rates or, in the presence of strong convexity, can deliver linear convergence by setting over-relaxation appropriately.
- Recent Güler-type or Nesterov-extrapolated linearized ADMM frameworks propose non-ergodic rates under additional block strong convexity (He et al., 2023), but GLADMM achieves its improved leading term in the composite case with only L-smoothness of f and proximability of g (no block strong convexity needed).
GLADMM also admits a concise, unified construction principle: all three Güler-type accelerations for proximal gradient (GPGM), linearized ALM (GLALM), and linearized ADMM (GLADMM) are realized under this negative-term/momentum-unification paradigm (Zhou et al., 21 Nov 2025).
6. Numerical Performance and Application Domains
Empirical results (e.g., in compressive sensing, ℓ1-regularized logistic regression, and quadratic programming) demonstrate that GLADMM consistently outperforms both classical L-ADMM and Nesterov-accelerated AL-ADMM (ergodic and non-ergodic variants), with faster decrease in objective value and reconstruction error, particularly in the non-ergodic (last-iterate) regime (Zhou et al., 21 Nov 2025). For example, in compressive sensing image recovery on the Shepp–Logan phantom, GLADMM achieves both rapid objective decay and improved image quality per iteration relative to all baselines.
The potential of GLADMM and its stochastic analogues extends to high-dimensional statistics, large-scale machine learning, and signal processing where efficient convergence in structured, composite settings is critical.
7. Theoretical Significance and Outlook
GLADMM establishes a sharp boundary between acceleration-by-extrapolation and acceleration-by-negative-term-consumption. It affirms that exploiting analytical artifacts (“negative” terms) through algorithmic design yields not just improved constants but fundamentally sharper asymptotic rates for principal error bounds. The technique generalizes to broader classes of first-order operator splitting, including primal-dual and mirror-descent schemes, and thus provides a template for designing future accelerated constrained optimization algorithms (Zhou et al., 21 Nov 2025, Ouyang et al., 2014, He et al., 2023, Tan, 2016).
A plausible implication is that, beyond composite convexity, negative-term-based acceleration may be adapted to settings with block-separable strong convexity or inexact/correlated update structures, offering a blueprint for both deterministic and stochastic large-scale optimization settings in statistics, machine learning, and signal recovery.