
Expanded AO: Enhanced Nonconvex Optimization

Updated 3 June 2026
  • Expanded AO is a nonconvex optimization method that augments classical alternating optimization with targeted subspace escapes to avoid saddle points and poor local minima.
  • It integrates scaling subspaces and restricted joint search strategies, using gradient and Hessian information to improve solution quality in matrix factorization and penalized regression.
  • The technique ensures monotonic descent of the objective and allows for practical, problem-specific customizations, leading to faster convergence and enhanced performance.

Expanded Alternating Optimization (Expanded AO) is a technique for enhancing the performance of classical alternating optimization (AO)—especially for nonconvex problems—by supplementing conventional blockwise minimization with targeted subspace escapes. By judiciously choosing expanded subspaces for search at each iteration, Expanded AO addresses the well-known issue of coordinate descent becoming trapped in saddle points or poor-quality local minima, leading to improved objective values and faster convergence in problems such as matrix factorization and penalized regression (Murdoch et al., 2014).

1. Fundamentals of Alternating Optimization

Classical alternating optimization (AO), also known as blockwise coordinate descent, iteratively minimizes a multivariate nonconvex function $f(z_1,\dots,z_B)$ over one block variable $z_b$ at a time, keeping the others fixed. The standard AO update at iteration $t$ for block $b$ is:

$$z_b^{(t)} = \arg\min_{z_b} f(z_1^{(t)},\dots,z_{b-1}^{(t)},z_b,z_{b+1}^{(t-1)},\dots,z_B^{(t-1)})$$

This process cycles through all blocks until convergence. However, for nonconvex $f$, AO can become stuck at stationary points that are not local minima. For example, in

$$f(x,y) = (x-y)^2 - x^2 y^2$$

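the origin $(0,0)$ is a saddle. Since $f(x,0)=x^2$ and $f(0,y)=y^2$ are each minimized at zero, every coordinate update leaves AO at the origin, even though $f(x,x)=-x^4$ decreases along the diagonal.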
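The stall can be checked numerically. The sketch below is a minimal illustration rather than code from the source: it uses the closed-form coordinate minimizer of this particular $f$ (valid only while the fixed coordinate has magnitude below 1, since otherwise $f$ is unbounded below along the free coordinate), and the function names are illustrative.

```python
def f(x, y):
    """Toy objective from the text: (x - y)^2 - x^2 * y^2."""
    return (x - y) ** 2 - x ** 2 * y ** 2


def coordinate_min(other):
    """Exact minimizer of f over one coordinate with the other held fixed.

    For fixed y, f(x, y) = (1 - y^2) x^2 - 2 x y + y^2, which is convex in x
    only when |y| < 1; its minimizer is then y / (1 - y^2). f is symmetric in
    (x, y), so the same formula serves both coordinates.
    """
    if abs(other) >= 1.0:
        raise ValueError("f is unbounded below along this coordinate")
    return other / (1.0 - other ** 2)


def run_ao(x, y, sweeps=10):
    """Plain alternating optimization: update x, then y, repeatedly."""
    for _ in range(sweeps):
        x = coordinate_min(y)
        y = coordinate_min(x)
    return x, y


# Starting at the origin, every coordinate update returns 0, so AO never moves.
print(run_ao(0.0, 0.0), f(0.0, 0.0))
# A small step along the diagonal already has a lower objective value.
print(f(0.1, 0.1) < f(0.0, 0.0))
```

Because neither coordinate line through the origin contains a descent direction, only a search that moves both variables at once can leave this point.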

2. Core Principles of Expanded AO

The central idea in Expanded AO is to augment the search directions considered by AO, temporarily optimizing over multi-dimensional subspaces that contain the original coordinate update direction but introduce additional, problem-informed directions. Two broad strategies are used:

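  • Scaling (Perspective Variable) Subspace: At each AO step for variable $z_b$, introduce a scalar parameter $v$ that multiplies the other blocks, and jointly optimize over $z_b$ and $v$:

$$\min_{z_b,\,v}\; f(v z_1,\dots,v z_{b-1},\,z_b,\,v z_{b+1},\dots,v z_B)$$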

This yields a two-dimensional subspace, extending the classical coordinate line to allow for simultaneous scaling of other variables.

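  • Restricted Joint Search: Select a subset of blocks and define search directions $d_1,\dots,d_B$. In a single step, solve for the optimal coefficients $\alpha_1,\dots,\alpha_B$ in:

$$\min_{\alpha_1,\dots,\alpha_B}\; f(z_1 + s_1\alpha_1 d_1,\dots,z_B + s_B\alpha_B d_B)$$

where $s_b \in \{0,1\}$ are binary selectors indicating which blocks participate.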

The search directions can be chosen at random or, more effectively, greedily from problem-specific information such as local gradients or Hessians, which gives faster convergence and lower objective values.
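A greedy, Hessian-informed joint search can be sketched on the toy function $f(x,y) = (x-y)^2 - x^2 y^2$ from Section 1. This is an illustration under stated assumptions, not the source's procedure: both blocks participate, the candidate directions are supplied by the caller, a single shared step size stands in for the per-block coefficients, and the step is capped at `max_step` because this particular $f$ is unbounded below along the diagonal. All function and parameter names are hypothetical.

```python
def f(x, y):
    """Toy objective from Section 1."""
    return (x - y) ** 2 - x ** 2 * y ** 2


def hessian(x, y):
    """Analytic second derivatives (f_xx, f_xy, f_yy) of f."""
    return (2.0 - 2.0 * y * y, -2.0 - 4.0 * x * y, 2.0 - 2.0 * x * x)


def curvature(d, h):
    """Quadratic form d^T H d for a direction d = (dx, dy)."""
    fxx, fxy, fyy = h
    return fxx * d[0] ** 2 + 2.0 * fxy * d[0] * d[1] + fyy * d[1] ** 2


def greedy_joint_step(x, y, directions, max_step=0.5, n_grid=21):
    """Pick the lowest-curvature direction, then search a capped step grid.

    The candidate directions should have equal norm so that their curvatures
    are comparable. The grid contains step 0, so the objective cannot get
    worse than at the starting point.
    """
    h = hessian(x, y)
    d = min(directions, key=lambda c: curvature(c, h))
    steps = [max_step * (2.0 * k / (n_grid - 1) - 1.0) for k in range(n_grid)]
    a = min(steps, key=lambda s: f(x + s * d[0], y + s * d[1]))
    return x + a * d[0], y + a * d[1]


# At the origin the diagonal (1, 1) has zero curvature and (1, -1) has
# curvature 8, so the greedy rule selects the diagonal and moves off the saddle.
new_x, new_y = greedy_joint_step(0.0, 0.0, [(1.0, 1.0), (1.0, -1.0)])
print((new_x, new_y), f(new_x, new_y) < f(0.0, 0.0))
```

The grid search is a crude stand-in for an exact one-dimensional solve; in a real problem the restricted subproblem would typically be solved in closed form or with a proper line search.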

3. Generic Expanded AO Algorithmic Workflow

The overall workflow for Expanded AO consists of alternating standard AO cycles with expanded subspace “escape” cycles:

  1. Run standard AO updates to a fixed point.
  2. For each block, perform a scaling subspace search or a joint restricted search as above.
  3. If the updated point achieves an objective decrease exceeding a tolerance $\epsilon$, return to step 1; otherwise, declare convergence.

Murdoch et al. (2014) give pseudocode for this loop, in which an EscapeSteps routine cycles through scaling and/or joint searches over selected subspaces.
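The three-step workflow above can be sketched as a small driver loop. This is not the pseudocode from Murdoch et al. (2014); it is a minimal illustration on an assumed toy objective, $f(x,y) = (xy-1)^2 + \lambda(x^2+y^2)$ with $\lambda = 0.1$, chosen because it is bounded below and plain AO stalls at its saddle at the origin. The names `run_ao`, `escape_steps`, and `expanded_ao`, the tolerances, and the grid-based escape are all assumptions.

```python
LAM = 0.1  # ridge weight for the toy objective (illustrative value)


def f(z):
    """Rank-one toy objective: (x*y - 1)^2 + LAM * (x^2 + y^2)."""
    x, y = z
    return (x * y - 1.0) ** 2 + LAM * (x * x + y * y)


def ao_sweep(z):
    """One AO cycle: exact minimization over x, then over y."""
    x, y = z
    x = y / (y * y + LAM)
    y = x / (x * x + LAM)
    return (x, y)


def run_ao(z, tol=1e-12, max_sweeps=10_000):
    """Step 1: repeat AO sweeps until the objective stops decreasing."""
    for _ in range(max_sweeps):
        new = ao_sweep(z)
        if f(z) - f(new) <= tol:
            break
        z = new
    return z


def escape_steps(z):
    """Step 2: joint searches along two fixed directions on a step grid."""
    best = z
    for d in ((1.0, 1.0), (1.0, -1.0)):
        for k in range(-20, 21):
            a = 0.1 * k
            cand = (z[0] + a * d[0], z[1] + a * d[1])
            if f(cand) < f(best):
                best = cand
    return best


def expanded_ao(z, tol=1e-9, max_rounds=100):
    """Alternate AO runs with escape steps until escapes stop helping."""
    for _ in range(max_rounds):
        z = run_ao(z)
        cand = escape_steps(z)
        if f(z) - f(cand) <= tol:  # Step 3: no meaningful decrease
            return z
        z = cand
    return z


print(run_ao((0.0, 0.0)))            # plain AO stays at the origin
z_star = expanded_ao((0.0, 0.0))
print(z_star, f(z_star))             # the expanded loop leaves the saddle
```

Since the escape grid includes a zero step, an escape round can only lower the objective or leave it unchanged, which is what makes the stopping test in step 3 well defined.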

4. Applications to Matrix Factorization and Penalized Regression

Expanded AO has been concretely validated on two nonconvex optimization domains:

Matrix Factorization (MF)

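Given observed ratings $R$, standard AO factors the matrix into $U$ and $V$ with ridge regularization:

$$\min_{U,V}\; \sum_{(i,j)\in\Omega} \left(R_{ij} - u_i^\top v_j\right)^2 + \lambda\left(\|U\|_F^2 + \|V\|_F^2\right)$$

where $\Omega$ is the set of observed entries and $u_i$, $v_j$ are the rows of $U$ and $V$. Expanded AO applies scaling steps (jointly optimizing $U$ or $V$ and a global scalar) or greedy restricted joint updates on small subsets of user/item vectors, yielding faster convergence and lower mean absolute error (MAE).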
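To make the scaling step concrete, here is a minimal rank-one sketch. It assumes that the global scalar multiplies the item factors while the user factors are re-solved exactly, and it replaces the one-dimensional solve over the scalar with a small grid. The ratings, the value of `LAM`, and all function names are hypothetical and are not taken from the source.

```python
LAM = 0.1  # ridge weight (illustrative value)

# Hypothetical observed ratings, keyed by (user, item).
RATINGS = {(0, 0): 4.0, (0, 1): 2.0, (1, 0): 2.0, (1, 1): 1.0, (2, 1): 3.0}
N_USERS = 3


def objective(u, v):
    """Squared error on observed entries plus ridge penalty (rank one)."""
    fit = sum((r - u[i] * v[j]) ** 2 for (i, j), r in RATINGS.items())
    return fit + LAM * (sum(a * a for a in u) + sum(b * b for b in v))


def update_users(v):
    """Standard AO block update: exact ridge solution for each user factor."""
    u = []
    for i in range(N_USERS):
        num = sum(r * v[j] for (a, j), r in RATINGS.items() if a == i)
        den = sum(v[j] ** 2 for (a, j) in RATINGS if a == i) + LAM
        u.append(num / den)
    return u


def scaled_user_step(v, scales):
    """Scaling step: try each scalar c on the item factors, re-solve users.

    Including c = 1.0 in `scales` guarantees the result is no worse than the
    plain AO update.
    """
    best = None
    for c in scales:
        v_c = [c * b for b in v]
        u_c = update_users(v_c)
        val = objective(u_c, v_c)
        if best is None or val < best[0]:
            best = (val, u_c, v_c)
    return best


v0 = [0.05, 0.05]  # poorly scaled item factors
plain_val = objective(update_users(v0), v0)
scaled_val, u_new, v_new = scaled_user_step(v0, [2.0 ** k for k in range(-4, 8)])
print(plain_val, scaled_val)
```

With badly scaled starting factors, the plain update is held back by the ridge penalty, whereas the scaled update can rebalance the two factors in a single step.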

Penalized Regression (MC+)

In coordinate descent for MC+ regression (Minimax Concave Penalty), Expanded AO introduces joint scaling over subsets of coefficients, or selectively scales variables based on correlation thresholds. This leads to improved objective values and variable-selection accuracy, especially at grid points where coordinate descent is highly suboptimal.
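For reference, the baseline coordinate update that Expanded AO augments can be sketched as follows. This assumes the objective $\tfrac{1}{2}\|y - X\beta\|^2 + \sum_j P(\beta_j;\lambda,\gamma)$ with unit-norm columns of $X$ and $\gamma > 1$, a common setup that the text above does not spell out. It shows only the plain coordinate step, not the scaling escapes, and the function names are illustrative.

```python
def mcp_penalty(beta, lam, gamma):
    """Minimax concave penalty P(beta; lam, gamma)."""
    b = abs(beta)
    if b <= gamma * lam:
        return lam * b - b * b / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def mcp_threshold(z, lam, gamma):
    """Minimizer of 0.5 * (z - beta)^2 + mcp_penalty(beta, lam, gamma)."""
    if gamma <= 1.0:
        raise ValueError("gamma must exceed 1 for a unique minimizer")
    if abs(z) <= lam:
        return 0.0
    if abs(z) <= gamma * lam:
        sign = 1.0 if z > 0 else -1.0
        return sign * (abs(z) - lam) / (1.0 - 1.0 / gamma)
    return z  # beyond gamma * lam the penalty is flat, so no shrinkage


def cd_sweep(X, y, beta, lam, gamma):
    """One coordinate descent pass; columns of X are assumed unit-norm."""
    n, p = len(X), len(beta)
    for j in range(p):
        # z = x_j^T (y - X beta) + beta_j, using the current coefficients
        resid = [y[i] - sum(X[i][k] * beta[k] for k in range(p)) for i in range(n)]
        z = sum(X[i][j] * resid[i] for i in range(n)) + beta[j]
        beta[j] = mcp_threshold(z, lam, gamma)
    return beta


# Tiny example with an orthonormal design (identity matrix).
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.5, 0.5]
print(cd_sweep(X, y, [0.0, 0.0], lam=1.0, gamma=3.0))
```

Each update changes one coefficient at a time, so correlated predictors can trap the iterates at a poor stationary point, which is the situation the joint scaling steps are meant to address.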

5. Theoretical Properties and Computational Aspects

Each expanded subspace contains the current iterate, so an expanded step never increases the objective, and a step accepted under the tolerance test strictly decreases it. This ensures monotonic descent and convergence to a stationary point. The method does not guarantee global optimality (no such guarantee exists for generic nonconvex $f$), but empirical results show significant improvements over baseline AO in both speed and final objective quality.

Computational overhead per iteration scales with the size and complexity of the chosen escape subspaces. Greedy or problem-adaptive subspaces offer a favorable trade-off between compute cost and optimization progress.

6. Empirical Performance and Observed Benefits

On the Amazon matrix factorization task, greedy restricted subspace AO reduced test MAE by up to 0.12 versus baseline and converged in a quarter of the iterations needed by random subspace updates. In MC+ simulations, selective scaling steps reduced objective values by 5% and variable-selection error by 2% for a substantial fraction of hyperparameter grid points (Murdoch et al., 2014).

7. Practical Considerations and Customization

Expanded AO is a generic method, but its greatest gains are realized when escape subspaces are customized using problem structure or data-driven heuristics. While random subspaces yield some improvement, greedy direction selection amplifies the algorithm’s advantage in both rate and final objective. The method is compatible with classical AO frameworks and is readily implementable in large-scale scientific computing and machine learning contexts.
