
Difference of Convex Functions Algorithm (DCA)

Updated 29 January 2026
  • The Difference-of-Convex functions Algorithm (DCA) represents a nonconvex function as the difference of two convex functions in order to simplify optimization.
  • The Boosted DC Algorithm (BDCA) enhances classical DCA by incorporating an extrapolation step, leading to significant convergence speedups and robust descent properties.
  • BDCA is practically effective for linearly constrained DC programming, with rigorous convergence guarantees to KKT points and demonstrated efficiency in applications like quadratic programming and copositivity detection.

A difference-of-convex (DC) function is any function that can be represented as the difference of two convex functions, i.e. $\phi(x) = g(x) - h(x)$ with $g, h$ convex and proper. The Difference of Convex functions Algorithm (DCA) is a foundational method for DC programming, iteratively linearizing the “concave” part and minimizing a convex surrogate. The Boosted DC Algorithm (BDCA) extends classical DCA by incorporating an extrapolation step along the DCA descent direction, typically via a line search. This simple but effective acceleration mechanism yields provable and substantial improvements in convergence properties and empirical performance, enabling broad applicability to linearly constrained DC programs and quadratic programming with box or trust-region constraints. BDCA guarantees monotonic descent, generalizes to large-scale and nonsmooth settings, and converges rigorously to Karush–Kuhn–Tucker points under Slater-type conditions, with geometric rates in specific quadratic scenarios (Artacho et al., 2019).

1. Problem Formulation and DC Decomposition

The relevant problem class is linearly constrained DC programming:

$$\min_{x \in \mathbb{R}^n} \; \phi(x) := g(x) - h(x) \quad \text{s.t.} \quad \langle a_i, x\rangle \leq b_i, \quad i = 1, \dots, p,$$

where $g, h : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ are proper, closed, and convex, $g$ is differentiable, and both $g$ and $h$ are $\rho$-strongly convex for some $\rho > 0$. The feasible set is the polyhedron $F = \{x \mid \langle a_i, x\rangle \leq b_i, \; i = 1, \dots, p\}$, and the problem is equivalently

$$\min_{x} \{g(x) + \iota_F(x) - h(x)\}$$

with $\iota_F$ the indicator function of $F$. The analysis assumes:

  • (A1) $g$ and $h$ are $\rho$-strongly convex,
  • (A2) $h$ is subdifferentiable everywhere and $g$ is $C^1$,
  • (A3) Slater condition: there exists a strictly feasible point $\hat{x}$.
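
As a minimal illustration (not taken from the paper), assumption (A3) can be checked numerically for a small polyhedron; the matrix $A$, vector $b$, and the helper `strictly_feasible` below are illustrative choices encoding the unit box $[0,1]^2$:

```python
import numpy as np

# F = {x : <a_i, x> <= b_i}: the unit box [0,1]^2 written as four inequalities
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [-1.0, 0.0],
              [0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])

def strictly_feasible(x, A, b, margin=1e-9):
    """Slater check: every linear constraint holds with strict inequality."""
    return bool(np.all(A @ x < b - margin))

print(strictly_feasible(np.array([0.5, 0.5]), A, b))  # True: interior Slater point
print(strictly_feasible(np.array([0.0, 0.5]), A, b))  # False: lies on the boundary
```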

2. Classical DCA: Structure and Descent

At each iteration, given $x_k \in F$:

  • Select $u_k \in \partial h(x_k)$.
  • Form the convex surrogate $\phi_k(x) = g(x) - \langle u_k, x\rangle + \iota_F(x)$.
  • Compute its unique minimizer $y_k = \arg\min_{x \in F} \{g(x) - \langle u_k, x\rangle\}$.
  • Set $x_{k+1} = y_k$.

The key property is the strong-convexity descent estimate

$$\phi(y_k) \leq \phi(x_k) - \rho \|y_k - x_k\|^2,$$

which gives monotonic decrease of the objective and $\|y_k - x_k\| \to 0$.
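
The iteration above can be sketched in code. The following is a minimal, illustrative implementation, not the paper's reference code: it assumes a nonconvex quadratic objective over the box $[0,1]^n$ and the DC split from Section 4, so the convex subproblem has a closed-form clipped solution; the name `dca_quadratic_box` is chosen here for the demo.

```python
import numpy as np

def dca_quadratic_box(Q, q, x0, sigma=None, tol=1e-10, max_iter=500):
    """Classical DCA for min 0.5 x^T Q x + q^T x over the box [0,1]^n.

    DC split (see Section 4): g(x) = sigma/2 ||x||^2 + <q, x>,
    h(x) = 0.5 <(sigma I - Q) x, x>, with sigma > max{0, lambda_max(Q)}.
    """
    n = len(q)
    if sigma is None:
        sigma = max(0.0, np.linalg.eigvalsh(Q).max()) + 1.0
    phi = lambda z: 0.5 * z @ Q @ z + q @ z
    x = np.clip(np.asarray(x0, dtype=float), 0.0, 1.0)
    for _ in range(max_iter):
        u = (sigma * np.eye(n) - Q) @ x           # u_k = grad h(x_k)
        # Convex subproblem: min over the box of sigma/2 ||x||^2 + <q - u_k, x>;
        # its minimizer is the unconstrained point (u_k - q)/sigma clipped to [0,1].
        y = np.clip((u - q) / sigma, 0.0, 1.0)
        if np.linalg.norm(y - x) < tol:           # y_k = x_k: critical point
            break
        x = y                                     # x_{k+1} = y_k
    return x, phi(x)
```

For instance, with $Q = \begin{pmatrix} -2 & 1 \\ 1 & -2 \end{pmatrix}$, $q = 0$ and $x_0 = (0.3, 0.7)$, the iterates descend monotonically to the vertex $(0, 1)$ with objective value $-1$.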

3. The Boosted DC Algorithm (BDCA): Algorithmic Acceleration

BDCA augments DCA’s surrogate minimization with extrapolation:

  • Compute $d_k = y_k - x_k$.
  • If $d_k = 0$ (criticality), stop.
  • Check feasibility of extrapolating beyond $y_k$ (active constraints).
  • If feasible, initialize $\lambda_k = \bar{\lambda}_k \geq 0$ and backtrack $\lambda_k \leftarrow \beta \lambda_k$ (with $\beta \in (0,1)$) until

$$y_k + \lambda_k d_k \in F, \qquad \phi(y_k + \lambda_k d_k) \leq \phi(y_k) - \alpha \lambda_k^2 \|d_k\|^2$$

for some $\alpha > 0$; otherwise set $\lambda_k = 0$.

  • Update $x_{k+1} = y_k + \lambda_k d_k$.

The BDCA iteration (see (Artacho et al., 2019) for full pseudocode) guarantees

$$\phi(x_{k+1}) \leq \phi(x_k) - (\rho + \alpha \lambda_k^2) \|d_k\|^2,$$

and the descent condition is satisfied after finitely many backtracking steps.
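
A minimal sketch of the BDCA iteration with backtracking line search, again in an illustrative box-constrained quadratic setting where the convex subproblem has a closed-form clipped solution; the parameter values `alpha`, `beta`, `lam_bar` and the function name are demo choices, not the paper's defaults.

```python
import numpy as np

def bdca_quadratic_box(Q, q, x0, sigma=None, alpha=0.1, beta=0.5,
                       lam_bar=2.0, tol=1e-10, max_iter=500):
    """Boosted DCA sketch for min 0.5 x^T Q x + q^T x over the box [0,1]^n."""
    n = len(q)
    if sigma is None:
        sigma = max(0.0, np.linalg.eigvalsh(Q).max()) + 1.0
    phi = lambda z: 0.5 * z @ Q @ z + q @ z
    in_box = lambda z: np.all(z >= -1e-12) and np.all(z <= 1.0 + 1e-12)
    x = np.clip(np.asarray(x0, dtype=float), 0.0, 1.0)
    for _ in range(max_iter):
        u = (sigma * np.eye(n) - Q) @ x           # u_k = grad h(x_k)
        y = np.clip((u - q) / sigma, 0.0, 1.0)    # DCA point y_k
        d = y - x
        if np.linalg.norm(d) < tol:               # d_k = 0: stop at criticality
            break
        lam = lam_bar                             # backtracking: lam <- beta * lam
        while lam > 1e-12 and not (in_box(y + lam * d) and
                                   phi(y + lam * d) <= phi(y) - alpha * lam**2 * (d @ d)):
            lam *= beta
        if lam <= 1e-12:
            lam = 0.0                             # no feasible boosting step
        x = y + lam * d                           # x_{k+1} = y_k + lam_k d_k
    return x, phi(x)
```

On $Q = \mathrm{diag}(2, -2)$, $q = (-1, 0)$, starting from $(0.5, 0.2)$, the line search accepts the full step $\lambda = 2$ on the first iteration and the method reaches the box minimizer $(0.5, 1)$ with value $-1.25$.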

4. Convergence Theory: Stationarity and Linear Rates

BDCA exhibits the following properties under (A1)–(A3) (strong convexity, subdifferentiability, Slater):

  • Every cluster point of $\{x_k\}$ is a KKT point of $(\mathcal{P})$.
  • The sequence $\{\phi(x_k)\}$ is nonincreasing and convergent.
  • $\sum_k \|y_k - x_k\|^2 < +\infty$.
  • For quadratic objectives

$$\phi(x) = \tfrac{1}{2} \langle Qx, x\rangle + \langle q, x\rangle,$$

one can split $g(x) = \tfrac{\sigma}{2} \|x\|^2 + \langle q, x\rangle$ and $h(x) = \tfrac{1}{2} \langle (\sigma I - Q)x, x\rangle$ with $\sigma > \max\{0, \lambda_{\max}(Q)\}$.

  • Under the Slater condition, the iterates converge globally $R$-linearly:

$$\|x_k - \bar{x}\| \leq C \eta^k, \qquad \eta \in (0, 1),$$

for some KKT point $\bar{x}$.
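
The quadratic DC split stated above can be verified numerically. This short sketch (an illustration, with a randomly drawn symmetric and generally indefinite $Q$) checks that $g - h$ reproduces $\phi$ exactly and that $\sigma I - Q$ is positive definite, so $h$ is (strongly) convex:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
Q = (A + A.T) / 2                                    # symmetric, generally indefinite
q = rng.standard_normal(n)

sigma = max(0.0, np.linalg.eigvalsh(Q).max()) + 1.0  # sigma > max{0, lambda_max(Q)}

g = lambda x: 0.5 * sigma * (x @ x) + q @ x          # strongly convex part
h = lambda x: 0.5 * x @ ((sigma * np.eye(n) - Q) @ x)
phi = lambda x: 0.5 * x @ (Q @ x) + q @ x

x = rng.standard_normal(n)
assert np.isclose(g(x) - h(x), phi(x))               # the split reproduces phi
assert np.linalg.eigvalsh(sigma * np.eye(n) - Q).min() > 0  # h is strongly convex
```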

5. Algorithmic Complexity, Practical Implementation, and Numerical Performance

Empirical tests compare DCA and BDCA on three classes:

  • Copositivity detection ($\min_{x \geq 0} x^T Q x$): BDCA is on average $15\times$ faster, with the speedup growing with $n$.
  • $\ell_1$ and $\ell_\infty$ trust-region subproblems: BDCA achieves speedups of $3.8\times$ ($\ell_1$) and $3.65\times$ ($\ell_\infty$).
  • Piecewise quadratic programs with box constraints: BDCA uniformly outperforms DCA, with the speedup improving as the number of pieces grows.

In all cases BDCA yields greater per-iteration descent, and the overhead of the feasibility and objective evaluations is offset by significantly faster convergence. BDCA always attains an objective value at least as good as DCA's. Boosting is activated in $40$–$80\%$ of iterations, depending on the problem type.

| Problem Class | BDCA Speedup over DCA | Fraction of Boosted Steps | Scaling with Problem Size |
|---|---|---|---|
| Copositivity | $15\times$ | $45\%$ | Increases with $n$ |
| $\ell_1$ trust-region | $3.8\times$ | $80\%$ | Stable across $n$ |
| $\ell_\infty$ trust-region | $3.65\times$ | $40\%$ | Stable across $n$ |
| Piecewise quadratic | Varies (increases with $m$) | Varies | Grows with number of pieces |

6. Theoretical Significance and Practical Guidelines

BDCA generalizes the classical DCA with a robust extrapolation mechanism, yielding provably stronger descent steps, improved convergence rates, and practical scalability. The global KKT property, rigorous monotonicity, and linear rates in the quadratic and box-constrained cases match those of best-in-class convex optimization algorithms (Artacho et al., 2019). Whenever the inner DCA subproblem is tractable (e.g., when projection onto the feasible set is cheap), adopting BDCA is highly recommended: the line-search/extrapolation step yields substantial acceleration with no loss of theoretical guarantees, and the overhead of the feasibility and function checks is negligible compared to the gains from larger steps.

7. Extensions, Limitations, and Comparative Perspective

Though BDCA is formulated and analyzed for linearly constrained DC programs with strongly convex components and smooth surrogates, practical variants exist for nonsmooth and block-structured problems; see the referenced works for extensions. Limitations include the reliance on strong convexity for global guarantees and the need for efficiently computable projections or inner subproblems. In practice, BDCA's boosting step automatically shuts off at criticality, so extrapolation is attempted only when a genuine descent direction is available. Large-scale numerical comparisons with classical DCA validate BDCA's advantage across a range of domains.

Key Reference:

"The Boosted DC Algorithm for linearly constrained DC programming" (Artacho et al., 2019)
