
Difference of Convex Functions Algorithm (DCA)

Updated 29 January 2026
  • The Difference-of-Convex functions Algorithm (DCA) represents a nonconvex function as the difference of two convex functions in order to simplify optimization.
  • The Boosted DC Algorithm (BDCA) enhances classical DCA by incorporating an extrapolation step, leading to significant convergence speedups and robust descent properties.
  • BDCA is practically effective for linearly constrained DC programming, with rigorous convergence guarantees to KKT points and demonstrated efficiency in applications like quadratic programming and copositivity detection.

A difference-of-convex (DC) function is any function that can be represented as the difference of two convex functions, i.e. $\phi(x) = g(x) - h(x)$ with $g, h$ convex and proper. The Difference of Convex functions Algorithm (DCA) is a foundational method for DC programming, iteratively linearizing the “concave” part and minimizing a convex surrogate. The Boosted DC Algorithm (BDCA) extends classical DCA by incorporating an extrapolation step along the DCA descent direction, typically via a line search. This simple but effective acceleration mechanism yields provable and substantial improvements in convergence properties and empirical performance, enabling broad applicability to linearly constrained DC programs and quadratic programming with box or trust-region constraints. BDCA guarantees monotonic descent, generalizes to large-scale and nonsmooth settings, and converges rigorously to Karush–Kuhn–Tucker points under Slater-type conditions, with geometric rates in specific quadratic scenarios (Artacho et al., 2019).

1. Problem Formulation and DC Decomposition

The relevant problem class is linearly constrained DC programming:

$$\min_{x \in \mathbb{R}^n} \; \phi(x) := g(x) - h(x) \quad \text{s.t.} \quad \langle a_i, x\rangle \leq b_i, \quad i = 1, \dots, p,$$

where $g, h : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ are proper, closed, and convex, $g$ is differentiable, and both $g$ and $h$ are $\rho$-strongly convex for some $\rho > 0$. The feasible set is the polyhedron $F = \{x \mid \langle a_i, x\rangle \leq b_i, \; i = 1, \dots, p\}$, and the problem is equivalently

$$\min_{x} \{g(x) + \iota_F(x) - h(x)\}$$

with $\iota_F$ the indicator function of $F$. The analysis assumes:

  • (A1) $g$ and $h$ are $\rho$-strongly convex,
  • (A2) $h$ is subdifferentiable everywhere and $g$ is $C^1$,
  • (A3) Slater condition: there exists a strictly feasible point $\hat{x}$.
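
As a minimal illustration (not taken from the paper), assumption (A3) can be checked numerically for a small polyhedron; the matrix $A$, vector $b$, and the helper `strictly_feasible` below are illustrative choices encoding the unit box $[0,1]^2$:

```python
import numpy as np

# F = {x : <a_i, x> <= b_i}: the unit box [0,1]^2 written as four inequalities
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [-1.0, 0.0],
              [0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])

def strictly_feasible(x, A, b, margin=1e-9):
    """Slater check: every linear constraint holds with strict inequality."""
    return bool(np.all(A @ x < b - margin))

print(strictly_feasible(np.array([0.5, 0.5]), A, b))  # True: interior Slater point
print(strictly_feasible(np.array([0.0, 0.5]), A, b))  # False: lies on the boundary
```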

2. Classical DCA: Structure and Descent

At each iteration, given $x_k \in F$:

  • Select $u_k \in \partial h(x_k)$.
  • Form the convex surrogate $\phi_k(x) = g(x) - \langle u_k, x\rangle + \iota_F(x)$.
  • Compute its unique minimizer $y_k = \arg\min_{x \in F} \{g(x) - \langle u_k, x\rangle\}$.
  • Set $x_{k+1} = y_k$.

The key property is the strong-convexity descent estimate

$$\phi(y_k) \leq \phi(x_k) - \rho \|y_k - x_k\|^2,$$

which gives monotonic decrease of the objective and $\|y_k - x_k\| \to 0$.
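
The iteration above can be sketched in code. The following is a minimal, illustrative implementation, not the paper's reference code: it assumes a nonconvex quadratic objective over the box $[0,1]^n$ and the DC split from Section 4, so the convex subproblem has a closed-form clipped solution; the name `dca_quadratic_box` is chosen here for the demo.

```python
import numpy as np

def dca_quadratic_box(Q, q, x0, sigma=None, tol=1e-10, max_iter=500):
    """Classical DCA for min 0.5 x^T Q x + q^T x over the box [0,1]^n.

    DC split (see Section 4): g(x) = sigma/2 ||x||^2 + <q, x>,
    h(x) = 0.5 <(sigma I - Q) x, x>, with sigma > max{0, lambda_max(Q)}.
    """
    n = len(q)
    if sigma is None:
        sigma = max(0.0, np.linalg.eigvalsh(Q).max()) + 1.0
    phi = lambda z: 0.5 * z @ Q @ z + q @ z
    x = np.clip(np.asarray(x0, dtype=float), 0.0, 1.0)
    for _ in range(max_iter):
        u = (sigma * np.eye(n) - Q) @ x           # u_k = grad h(x_k)
        # Convex subproblem: min over the box of sigma/2 ||x||^2 + <q - u_k, x>;
        # its minimizer is the unconstrained point (u_k - q)/sigma clipped to [0,1].
        y = np.clip((u - q) / sigma, 0.0, 1.0)
        if np.linalg.norm(y - x) < tol:           # y_k = x_k: critical point
            break
        x = y                                     # x_{k+1} = y_k
    return x, phi(x)
```

For instance, with $Q = \begin{pmatrix} -2 & 1 \\ 1 & -2 \end{pmatrix}$, $q = 0$ and $x_0 = (0.3, 0.7)$, the iterates descend monotonically to the vertex $(0, 1)$ with objective value $-1$.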

3. The Boosted DC Algorithm (BDCA): Algorithmic Acceleration

BDCA augments DCA’s surrogate minimization with extrapolation:

  • Compute $d_k = y_k - x_k$.
  • If $d_k = 0$ (criticality), stop.
  • Check feasibility of extrapolating beyond $y_k$ (active constraints).
  • If feasible, initialize $\lambda_k = \bar{\lambda}_k \geq 0$ and backtrack $\lambda_k \leftarrow \beta \lambda_k$ (with $\beta \in (0,1)$) until

$$y_k + \lambda_k d_k \in F, \qquad \phi(y_k + \lambda_k d_k) \leq \phi(y_k) - \alpha \lambda_k^2 \|d_k\|^2$$

for some $\alpha > 0$; otherwise set $\lambda_k = 0$.

  • Update $x_{k+1} = y_k + \lambda_k d_k$.

The BDCA iteration (see (Artacho et al., 2019) for full pseudocode) guarantees

$$\phi(x_{k+1}) \leq \phi(x_k) - (\rho + \alpha \lambda_k^2) \|d_k\|^2,$$

and the descent condition is satisfied after finitely many backtracking steps.
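
A minimal sketch of the BDCA iteration with backtracking line search, again in an illustrative box-constrained quadratic setting where the convex subproblem has a closed-form clipped solution; the parameter values `alpha`, `beta`, `lam_bar` and the function name are demo choices, not the paper's defaults.

```python
import numpy as np

def bdca_quadratic_box(Q, q, x0, sigma=None, alpha=0.1, beta=0.5,
                       lam_bar=2.0, tol=1e-10, max_iter=500):
    """Boosted DCA sketch for min 0.5 x^T Q x + q^T x over the box [0,1]^n."""
    n = len(q)
    if sigma is None:
        sigma = max(0.0, np.linalg.eigvalsh(Q).max()) + 1.0
    phi = lambda z: 0.5 * z @ Q @ z + q @ z
    in_box = lambda z: np.all(z >= -1e-12) and np.all(z <= 1.0 + 1e-12)
    x = np.clip(np.asarray(x0, dtype=float), 0.0, 1.0)
    for _ in range(max_iter):
        u = (sigma * np.eye(n) - Q) @ x           # u_k = grad h(x_k)
        y = np.clip((u - q) / sigma, 0.0, 1.0)    # DCA point y_k
        d = y - x
        if np.linalg.norm(d) < tol:               # d_k = 0: stop at criticality
            break
        lam = lam_bar                             # backtracking: lam <- beta * lam
        while lam > 1e-12 and not (in_box(y + lam * d) and
                                   phi(y + lam * d) <= phi(y) - alpha * lam**2 * (d @ d)):
            lam *= beta
        if lam <= 1e-12:
            lam = 0.0                             # no feasible boosting step
        x = y + lam * d                           # x_{k+1} = y_k + lam_k d_k
    return x, phi(x)
```

On $Q = \mathrm{diag}(2, -2)$, $q = (-1, 0)$, starting from $(0.5, 0.2)$, the line search accepts the full step $\lambda = 2$ on the first iteration and the method reaches the box minimizer $(0.5, 1)$ with value $-1.25$.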

4. Convergence Theory: Stationarity and Linear Rates

BDCA exhibits the following properties under (A1)–(A3) (strong convexity, subdifferentiability, Slater):

  • Every cluster point of $\{x_k\}$ is a KKT point of $(\mathcal{P})$.
  • The sequence $\{\phi(x_k)\}$ is nonincreasing and convergent.
  • $\sum_k \|y_k - x_k\|^2 < +\infty$.
  • For quadratic objectives

$$\phi(x) = \tfrac{1}{2} \langle Qx, x\rangle + \langle q, x\rangle,$$

one can split $g(x) = \tfrac{\sigma}{2} \|x\|^2 + \langle q, x\rangle$ and $h(x) = \tfrac{1}{2} \langle (\sigma I - Q)x, x\rangle$ with $\sigma > \max\{0, \lambda_{\max}(Q)\}$.

  • Under the Slater condition, the iterates converge globally $R$-linearly:

$$\|x_k - \bar{x}\| \leq C \eta^k, \qquad \eta \in (0, 1),$$

for some KKT point $\bar{x}$.
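
The quadratic DC split stated above can be verified numerically. This short sketch (an illustration, with a randomly drawn symmetric and generally indefinite $Q$) checks that $g - h$ reproduces $\phi$ exactly and that $\sigma I - Q$ is positive definite, so $h$ is (strongly) convex:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
Q = (A + A.T) / 2                                    # symmetric, generally indefinite
q = rng.standard_normal(n)

sigma = max(0.0, np.linalg.eigvalsh(Q).max()) + 1.0  # sigma > max{0, lambda_max(Q)}

g = lambda x: 0.5 * sigma * (x @ x) + q @ x          # strongly convex part
h = lambda x: 0.5 * x @ ((sigma * np.eye(n) - Q) @ x)
phi = lambda x: 0.5 * x @ (Q @ x) + q @ x

x = rng.standard_normal(n)
assert np.isclose(g(x) - h(x), phi(x))               # the split reproduces phi
assert np.linalg.eigvalsh(sigma * np.eye(n) - Q).min() > 0  # h is strongly convex
```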

5. Algorithmic Complexity, Practical Implementation, and Numerical Performance

Empirical tests compare DCA and BDCA on three classes:

  • Copositivity detection ($\min_{x \geq 0} x^T Q x$): BDCA is on average $15\times$ faster, with the speedup growing with $n$.
  • $\ell_1$ and $\ell_\infty$ trust-region subproblems: BDCA achieves speedups of $3.8\times$ ($\ell_1$) and $3.65\times$ ($\ell_\infty$).
  • Piecewise quadratic programs with box constraints: BDCA uniformly outperforms DCA, with the speedup improving as the number of pieces grows.

In all cases BDCA yields greater per-iteration descent, and the overhead of the feasibility and objective evaluations is offset by significantly faster convergence. BDCA always attains an objective value at least as good as DCA's. Boosting is activated in $40$–$80\%$ of iterations, depending on the problem type.

| Problem Class | BDCA Speedup over DCA | Fraction of Boosted Steps | Scaling with Problem Size |
|---|---|---|---|
| Copositivity | $15\times$ | $45\%$ | Increases with $n$ |
| $\ell_1$ trust-region | $3.8\times$ | $80\%$ | Stable across $n$ |
| $\ell_\infty$ trust-region | $3.65\times$ | $40\%$ | Stable across $n$ |
| Piecewise quadratic | Varies (increases with $m$) | Varies | Grows with number of pieces |

6. Theoretical Significance and Practical Guidelines

BDCA generalizes the classical DCA with a robust extrapolation mechanism, yielding provably stronger descent steps, improved convergence rates, and practical scalability. The global KKT property, rigorous monotonicity, and linear rates in the quadratic and box-constrained cases match those of best-in-class convex optimization algorithms (Artacho et al., 2019). Whenever the inner DCA subproblem is tractable (e.g., when projection onto the feasible set is cheap), adopting BDCA is highly recommended: the line-search/extrapolation step yields substantial acceleration with no loss of theoretical guarantees, and the overhead of the feasibility and function checks is negligible compared to the gains from larger steps.

7. Extensions, Limitations, and Comparative Perspective

Though BDCA is formulated and analyzed for linearly constrained DC programs with strongly convex components and smooth surrogates, practical variants exist for nonsmooth and block-structured problems; see the referenced works for extensions. Limitations include the reliance on strong convexity for global guarantees and the need for efficiently computable projections or inner subproblems. In practice, BDCA's boosting step automatically shuts off at criticality, so extrapolation is attempted only when a genuine descent direction is available. Large-scale numerical comparisons with classical DCA validate BDCA's advantage across a range of domains.

Key Reference:

"The Boosted DC Algorithm for linearly constrained DC programming" (Artacho et al., 2019)
