
Balanced Augmented Lagrangian Methods (BALM)

Updated 23 January 2026
  • BALM is an operator-splitting method for convex programming that decouples the primal proximal mapping from the dual update, enhancing computational efficiency.
  • It reformulates augmented Lagrangian terms using balancing parameters, enabling efficient handling of large-scale, separable objectives and parallel implementations.
  • The method guarantees convergence with O(1/k) ergodic rates and supports extensions like dual–primal and adaptive variants for enhanced performance.

Balanced Augmented Lagrangian Methods (BALM) are a class of operator-splitting techniques designed for convex programming with linear equality and/or inequality constraints. Unlike the classical Augmented Lagrangian Method (ALM), which often couples the primal and dual subproblems—leading to computational bottlenecks for large or prox-friendly objectives—BALM systematically "balances" the complexity and conditioning of primal and dual updates. The key innovation is the reformulation and splitting of augmented Lagrangian terms such that the primal step becomes a pure proximal mapping, independent of constraint matrices, while all constraint structure is relegated to the dual step, which typically requires a single well-conditioned linear solve or a small linear complementarity problem. This structure is particularly advantageous for large-scale problems, decomposable/separable objectives, and applications requiring high-performance iterative methods.

1. Mathematical Foundations and Formulation

Consider the canonical convex program

$$\min_{x\in\mathcal{X}} f(x) \quad \text{subject to} \quad Ax = b, \quad Cx \leq d$$

where $f: \mathbb{R}^n \to (-\infty, +\infty]$ is closed, proper, and convex, $A \in \mathbb{R}^{m_e \times n}$ and $C \in \mathbb{R}^{m_i \times n}$ encode linear equalities and inequalities, respectively, and $\mathcal{X}$ is the domain of $f$.

In the classical ALM framework, a penalty parameter $r > 0$ controls the strength of quadratic penalization of constraint violations. However, the primal ($x$) update becomes an entangled minimization of $f(x) + \frac{r}{2}\|Ax-b\|^2 + \frac{r}{2}\|[Cx-d]_+\|^2$, which couples $f$ with the constraint matrices. This coupling is particularly inefficient when $f$ is prox-friendly but $A$ or $C$ is large or ill-conditioned.

BALM introduces balancing parameters ($r$, $\delta$) and splits the penalty terms. In the one-block, equality-constrained case, set $H_0 = \frac{1}{r} A A^\top + \delta I_{m_e}$ with $\delta > 0$ for regularization. BALM iterations are then:

  • $x^{k+1} = \operatorname{prox}_{f/r}(x^k + A^\top \lambda^k / r)$
  • $\lambda^{k+1} = \lambda^k - H_0^{-1}\bigl(A(2x^{k+1} - x^k) - b\bigr)$

The extrapolated residual $A(2x^{k+1} - x^k) - b$ in the dual step is what makes the scheme a proximal point iteration in a metric that is positive definite for every $r, \delta > 0$.

In the presence of inequality constraints, a positive definite linear complementarity problem (LCP) is solved for the dual variable $\mu^{k+1}$ (He et al., 2021).

The primal update remains a pure proximal mapping regardless of the structure of $A$ and $C$, provided $\delta > 0$. No requirement on the penalty parameter's magnitude relative to $\|A\|^2$ is imposed, in contrast to linearized or classical ALM.
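The prox form of the primal step follows by completing the square in the regularized Lagrangian subproblem (a standard rearrangement, using only the document's own definitions):

$$
x^{k+1} = \arg\min_x \Bigl\{ f(x) - \langle \lambda^k, Ax \rangle + \tfrac{r}{2}\|x - x^k\|^2 \Bigr\}
        = \arg\min_x \Bigl\{ f(x) + \tfrac{r}{2}\bigl\|x - \bigl(x^k + \tfrac{1}{r}A^\top \lambda^k\bigr)\bigr\|^2 \Bigr\}
        = \operatorname{prox}_{f/r}\bigl(x^k + \tfrac{1}{r} A^\top \lambda^k\bigr)
$$

Note that $A$ enters only through the fixed anchor $x^k + \frac{1}{r}A^\top\lambda^k$, not the minimization itself.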

2. Algorithmic Structures and Implementation

The generic algorithmic framework is as follows (He et al., 2021):

  1. Initialization: Choose $r > 0$, $\delta > 0$, initialize primal/dual variables, and precompute (or factor) $H_0 = \frac{1}{r} A A^\top + \delta I_{m_e}$.
  2. At each iteration:
    • Compute $q^k = x^k + A^\top \lambda^k / r$
    • Primal update: $x^{k+1} = \operatorname{prox}_{f/r}(q^k)$
    • Dual update (equality): $\lambda^{k+1} = \lambda^k - H_0^{-1}\bigl(A(2x^{k+1} - x^k) - b\bigr)$
    • Dual update (inequality): solve the LCP for $\mu^{k+1}$ as described above
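The equality-constrained iteration above can be sketched in a few lines of NumPy. The snippet below is an illustrative, minimal implementation for the basis-pursuit instance $\min \|x\|_1$ s.t. $Ax = b$; the problem data, penalty `r`, regularization `delta`, and iteration count are placeholder choices, not tuned values:

```python
import numpy as np

def soft_threshold(v, t):
    # prox of t*||.||_1: elementwise shrinkage toward zero
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def balm(A, b, r=1.0, delta=1e-3, iters=5000):
    """Minimal BALM sketch for min ||x||_1 s.t. Ax = b.

    The primal step is a pure prox (A appears only in the anchor);
    all constraint structure sits in one well-conditioned dual solve
    with H0 = (1/r) A A^T + delta*I, factored once up front.
    """
    m, n = A.shape
    x, lam = np.zeros(n), np.zeros(m)
    H0 = A @ A.T / r + delta * np.eye(m)
    L = np.linalg.cholesky(H0)          # reuse this factor every iteration
    for _ in range(iters):
        x_old = x
        x = soft_threshold(x + A.T @ lam / r, 1.0 / r)   # prox_{f/r}
        resid = A @ (2 * x - x_old) - b                  # extrapolated residual
        lam = lam - np.linalg.solve(L.T, np.linalg.solve(L, resid))
    return x, lam
```

Because the per-iteration cost is one prox, two matrix–vector products, and two triangular solves, the dominant setup cost is the single $m_e \times m_e$ Cholesky factorization.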

For separable objectives $f(x) = \sum_{i=1}^p f_i(x_i)$ and block-structured constraints $Ax = \sum_{i=1}^p A_i x_i = b$, BALM admits parallel (Jacobi) or sequential (Gauss–Seidel) splitting: each block performs $x_i^{k+1} = \operatorname{prox}_{f_i/r_i}(x_i^k + A_i^\top \lambda^k / r_i)$, followed by a shared dual update

$$\lambda^{k+1} = \lambda^k - H_0^{-1}\Bigl(\sum_{i=1}^p A_i (2x_i^{k+1} - x_i^k) - b\Bigr)$$

which fully decouples $f_i$ from $A_j$ for $i \neq j$. This property enables scalable decomposed optimization for multi-agent and distributed scenarios.
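A two-block Jacobi instance of this splitting might look as follows. This is a sketch, assuming the block dual step uses the same extrapolated residual as the one-block case and equal block parameters $r_i = r$; the objective $\|x_1\|_1 + \frac{1}{2}\|x_2\|^2$ and all data are illustrative:

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def balm_two_block(A1, A2, b, r=1.0, delta=1e-3, iters=6000):
    """Jacobi-split BALM sketch for
        min ||x1||_1 + 0.5*||x2||^2   s.t.   A1 x1 + A2 x2 = b.
    Both block proxes read only the shared dual iterate, so they can
    run in parallel; one shared solve then updates lambda."""
    m = b.size
    x1, x2 = np.zeros(A1.shape[1]), np.zeros(A2.shape[1])
    lam = np.zeros(m)
    H0 = (A1 @ A1.T + A2 @ A2.T) / r + delta * np.eye(m)
    L = np.linalg.cholesky(H0)
    for _ in range(iters):
        x1_old, x2_old = x1, x2
        # independent prox steps (parallelizable across blocks)
        x1 = soft_threshold(x1 + A1.T @ lam / r, 1.0 / r)
        x2 = (x2 + A2.T @ lam / r) / (1.0 + 1.0 / r)  # prox of (1/r)*0.5||.||^2
        # shared dual update with extrapolated block residual
        resid = A1 @ (2 * x1 - x1_old) + A2 @ (2 * x2 - x2_old) - b
        lam = lam - np.linalg.solve(L.T, np.linalg.solve(L, resid))
    return x1, x2, lam
```

With equal $r_i$, this reduces exactly to the one-block method applied to the concatenated variable, since the prox of a separable function splits blockwise.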

3. Variational Inequality Perspective and Convergence

BALM and its variants are grounded in a variational inequality (VI) framework. The optimality system is

$$\langle w - w^*,\, F(w^*)\rangle \geq 0 \quad \forall w$$

where $w = (x, \lambda)$ and $F$ is a monotone affine operator whose linear part is skew-symmetric:

$$F(w) = \begin{pmatrix} -A^\top \lambda \\ Ax - b \end{pmatrix}$$

BALM exploits contraction in an $H$-norm metric:

$$\|w^{k+1}-w^*\|_H^2 \leq \|w^k-w^*\|_H^2 - \|w^{k+1}-w^k\|_H^2$$

which guarantees boundedness, Fejér monotonicity, and convergence to a VI solution under standard assumptions (closed, proper, convex $f$; existence of a solution; Slater's condition). Ergodic $O(1/k)$ rates are established for the averaged iterates:

$$\langle w - \overline{w}^k,\, F(\overline{w}^k)\rangle \geq -O(1/k) \quad \forall w$$

with similar structure for block-separable and inequality-constrained scenarios (He et al., 2021; Bai et al., 2021).
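The monotonicity of $F$ can be checked numerically: its linear part is the skew-symmetric matrix $\begin{pmatrix} 0 & -A^\top \\ A & 0 \end{pmatrix}$, so $\langle w - w', F(w) - F(w')\rangle$ vanishes for every pair of points. A small NumPy check with arbitrary random data:

```python
import numpy as np

# F(w) = (-A^T lam, A x - b) for w = (x, lam). Its linear part is
# skew-symmetric, so <w - w', F(w) - F(w')> = 0 for any w, w'
# (monotonicity holding with equality).
rng = np.random.default_rng(1)
m, n = 3, 5
A, b = rng.standard_normal((m, n)), rng.standard_normal(m)

def F(x, lam):
    return np.concatenate([-A.T @ lam, A @ x - b])

xa, la = rng.standard_normal(n), rng.standard_normal(m)
xb, lb = rng.standard_normal(n), rng.standard_normal(m)
dw = np.concatenate([xa - xb, la - lb])
gap = dw @ (F(xa, la) - F(xb, lb))   # zero up to floating-point error
```

The constant term $b$ cancels in the difference, which is why the check depends only on the skew-symmetric linear part.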

4. Extensions: Dual–Primal and Adaptive BALM

Variants inspired by the balancing principle have been introduced:

  • Dual–Primal BALM (DP-BALM): swaps the update order. Each iteration first applies a dual correction (via a linear system solve), then a primal proximal mapping with an extrapolated anchor, and finally an over-relaxed convex combination of previous and predicted values. This approach maintains global convergence and $O(1/k)$ ergodic/pointwise rates without requiring spectral bounds on $A$ (Xu, 2021).
  • Adaptive BALM (ABAL): uses an adaptive step-size rule based on primal/dual progress measures, with provable global convergence under nonstationary Douglas–Rachford frameworks. ABAL maintains low per-iteration complexity and empirically outperforms both constant step-size first-order methods and interior-point solvers on large-scale structured SDP beamforming problems. Efficient inversion or solving of $A A^\top + \theta^2 I$ requires application-specific linear algebra optimizations, especially in high-dimensional matrix optimization (Wu et al., 2024).

Bregman-variant augmented Lagrangian methods generalize the quadratic penalty to Bregman divergences, encompassing a broader class of proximal points and enabling further acceleration. Accelerated Bregman ALM achieves $O(\log T / T^2)$ ergodic rates under certain geometric assumptions on the divergence (Yan et al., 2020).

5. Computational and Practical Considerations

BALM’s decoupled structure enables several computational benefits:

  • For large-scale problems with $n \gg m_e$, the primal update is a simple, matrix-free proximal map and the dual update is a moderate-dimensional linear system or LCP.
  • The dual solve does not require $r$ to exceed any spectral norm of $A$, in contrast to standard ALM.
  • In block-separable contexts, each $f_i$ can be handled on an independent compute unit, and dual variables are updated globally via a modest-size system.
  • For dense or very large $m$ in the dual step, iterative solvers or preconditioners can replace exact inversion without sacrificing convergence guarantees (Xu, 2021).

Implementation details include precomputing a Cholesky factor of $H_0$, exploiting structure in $A$, and optimizing proximal operators (e.g., soft-thresholding for $\ell_1$ regularization). Adaptive parameters in ABAL (e.g., step size, dual correction weights) can be tuned heuristically or adapted during iterations to improve empirical performance (Wu et al., 2024).
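As a concrete instance of such a prox optimization, the $\ell_1$ proximal operator reduces to elementwise soft-thresholding, which can be sanity-checked against brute-force scalar minimization (the test point and grid below are arbitrary choices):

```python
import numpy as np

def soft_threshold(v, t):
    # prox_{t*||.||_1}(v): the unique minimizer of t*|u| + 0.5*(u - v)^2
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# brute-force check of the scalar prox on a fine grid
v, t = 0.8, 0.5
grid = np.linspace(-3.0, 3.0, 600001)          # spacing 1e-5
brute = grid[np.argmin(t * np.abs(grid) + 0.5 * (grid - v) ** 2)]
# closed form gives max(0.8 - 0.5, 0) = 0.3; the grid minimizer
# agrees up to the grid resolution
```

Because the prox applies elementwise, it costs $O(n)$ per iteration and is trivially parallel, which is what makes the matrix-free primal step cheap at scale.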

6. Applications and Performance

BALM and its variants have been applied to convex problems such as:

  • Large-scale $\ell_1$-based sparse recovery (basis pursuit), where DP-BALM and balanced ALM converge in $O(10^2)$ iterations with few-second runtimes for $n$ up to $10^4$, outperforming Chambolle–Pock and linearized ALM by 3–5$\times$ in iteration count and 4–5$\times$ in CPU time (Xu, 2021).
  • Massive-MIMO ISAC beamforming design, where ABAL solved medium-to-large SDP instances (e.g., $N = 64$, $K = 10$, $> 1000$ iterations) 2.8–600$\times$ faster than SeDuMi and more efficiently than tuning-free PDHG and constant step-size BALM (Wu et al., 2024).

These results underscore the scalability and applicability of balanced ALM principles to structured, large-dimensional convex optimization.

The main methodological contrasts between BALM and standard ALM (or its linearized/first-order variants) are:

  • The primal update in BALM is a pure proximal operator, fully decoupled from constraint matrices, facilitating efficient, structure-exploiting or parallel computation.
  • The dual update, though matrix-dependent, is moderate-dimensional and well-conditioned for suitable $\delta$ or analogous regularization.
  • No interdependence arises between penalty parameters and spectral properties of constraint matrices.
  • Multi-block separable and composite problems can be efficiently handled by parallel or Jacobi-type updates.
  • Global convergence and $O(1/k)$ ergodic rates are preserved, with improved practical conditioning and empirical performance over classical ALM.

Extensions to Bregman divergences, primal–dual hybrid methods, and adaptive parameterizations further broaden the applicability and theoretical impact of the balanced augmented Lagrangian paradigm (Bai et al., 2021, Yan et al., 2020).


References:

  • He et al. (2021). Balanced Augmented Lagrangian Method for Convex Programming.
  • Xu (2021). A dual-primal balanced augmented Lagrangian method for linearly constrained convex programming.
  • Wu et al. (2024). A New Adaptive Balanced Augmented Lagrangian Method with Application to ISAC Beamforming Design.
  • Bai et al. (2021). A new insight on augmented Lagrangian method with applications in machine learning.
  • Yan et al. (2020). Bregman Augmented Lagrangian and Its Acceleration.
