Exact Worst-case Performance of First-order Methods for Composite Convex Optimization
This paper by Taylor, Hendrickx, and Glineur presents a comprehensive framework for assessing the worst-case performance of a wide range of first-order methods (FOMs) employed in composite convex optimization. The focus is on algorithms that utilize oracle-based first-order information, including methods employing explicit, projected, proximal, conditional, and inexact gradient steps. The authors bridge the gap between theory and application by precisely computing worst-case guarantees and identifying specific instances of optimization problems that demonstrate these worst-case scenarios.
The authors extend the performance estimation approach previously explored by Drori and Teboulle. The key contribution is the reduction of the worst-case performance computation to a convex semidefinite program (SDP), which generalizes performance estimation to a broader class of algorithms and functions. They further refine the analysis of several algorithms, including the proximal point method, the conditional gradient method, and fast proximal gradient method (FPGM) variants. In particular, the proximal point method's worst-case guarantee is improved by a factor of two, and the standard worst-case guarantee for the conditional gradient method is improved by more than a factor of two.
Framework Overview
The core of this research involves formulating the performance estimation problem (PEP) as an SDP, enabling precise worst-case performance computation. The framework supports "black-box" methods that access function data through oracle calls providing first-order information (function value and subgradient). The paper introduces the notion of fixed-step linear first-order methods (FSLFOMs), in which each iterate is a fixed linear combination of past iterates and gradient information. This formalization is what makes the worst-case analysis expressible as an SDP.
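To make the FSLFOM notion concrete, here is a minimal sketch, assuming a particular (illustrative, not from the paper) parametrization: each iterate is written as the starting point minus a fixed linear combination of all past gradients, scaled by 1/L. With all coefficients equal to one, this recovers plain gradient descent with step size 1/L. The helper name `fslfom` and the coefficient array `H` are my own notation.

```python
import numpy as np

# Sketch of a fixed-step linear first-order method (FSLFOM):
# each iterate is a fixed linear combination of the starting point
# and all past gradients,
#   x_{k+1} = x_0 - (1/L) * sum_{i<=k} h[k][i] * g(x_i).
# The coefficients H are fixed in advance, independent of the function.

def fslfom(grad, x0, H, L):
    """Run an FSLFOM given a lower-triangular coefficient list H."""
    xs, gs = [x0], [grad(x0)]
    for row in H:
        x_next = x0 - (1.0 / L) * sum(h * g for h, g in zip(row, gs))
        xs.append(x_next)
        gs.append(grad(x_next))
    return xs

# Example: f(x) = (L/2) x^2 with gradient L*x, L = 1.
L = 1.0
grad = lambda x: L * x
N = 5
# h[k][i] = 1 for all i <= k reproduces gradient descent with step 1/L.
H = [[1.0] * (k + 1) for k in range(N)]
xs = fslfom(grad, 2.0, H, L)
```

Writing a method this way exposes its step coefficients as explicit data, which is exactly what the SDP formulation optimizes over or analyzes.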
The SDP-based performance estimation accommodates several function classes, such as smooth convex, strongly convex, and bounded-domain functions, extending the worst-case analysis across different problem settings. The authors give a detailed account of how specific problem characteristics (e.g., smoothness, strong convexity) shape the formulation of the worst-case problem in a tractable manner.
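A key ingredient behind this tractability is that membership in these function classes can be encoded by finitely many interpolation inequalities between the sampled points. For L-smooth convex functions these inequalities take the well-known pairwise form below; the checker function and variable names are my own illustrative notation, not the paper's.

```python
import numpy as np

# Interpolation conditions for L-smooth convex functions: a set of
# triples (x_i, g_i, f_i) can be extended to such a function iff,
# for every pair (i, j),
#   f_i >= f_j + <g_j, x_i - x_j> + (1/(2L)) * ||g_i - g_j||^2.
# These constraints are linear in the f_i and quadratic in (x, g),
# which is what enables the SDP reformulation of the PEP.

def smooth_convex_interp_ok(xs, fs, gs, L, tol=1e-9):
    """Check the pairwise interpolation inequalities."""
    for i in range(len(xs)):
        for j in range(len(xs)):
            rhs = (fs[j] + gs[j] @ (xs[i] - xs[j])
                   + np.dot(gs[i] - gs[j], gs[i] - gs[j]) / (2 * L))
            if fs[i] < rhs - tol:
                return False
    return True

# Sanity check on the L-smooth convex quadratic f(x) = (L/2)||x||^2.
rng = np.random.default_rng(0)
L = 2.0
xs = [rng.standard_normal(3) for _ in range(4)]
fs = [0.5 * L * x @ x for x in xs]
gs = [L * x for x in xs]
assert smooth_convex_interp_ok(xs, fs, gs, L)
```

Because the conditions are necessary and sufficient, maximizing over all valid triples is equivalent to maximizing over all functions in the class, so no relaxation gap is introduced.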
Key Contributions
- Improvement in Analysis: The authors refine the analyses of the proximal point and conditional gradient methods, integrating new structural insights and delivering both numerical and analytical improvements over existing results.
- Extension of Optimized Gradient Methods: The paper proposes an extension of the optimized gradient method that incorporates a projection or proximal operator, achieving a worst-case bound that improves on the standard accelerated proximal gradient method by a factor of two.
- Theoretical and Numerical Insights: Through a blend of rigorous theoretical underpinnings and numerical simulations, the paper provides substantial evidence for their claims, contributing to a deeper understanding of first-order methods' limitations and opportunities.
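For context on the composite setting these methods target, here is a minimal sketch of a plain (non-accelerated) proximal gradient method for minimizing a smooth term plus an l1 penalty; it is a baseline illustration of a proximal step, not the paper's optimized method, and the function names and problem instance are my own.

```python
import numpy as np

# Composite problem: minimize F(x) = f(x) + h(x), with
#   f(x) = 0.5 * ||Ax - b||^2   (smooth, gradient Lipschitz)
#   h(x) = lam * ||x||_1        (nonsmooth, but proximable)
# A proximal gradient step combines an explicit gradient step on f
# with the proximal operator of h (here, soft-thresholding).

def soft_threshold(v, t):
    """Prox of t*||.||_1: componentwise shrinkage toward zero."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient(A, b, lam, steps=200):
    """Plain proximal gradient with constant step 1/L."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of grad f
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        grad = A.T @ (A @ x - b)         # gradient of the smooth part
        x = soft_threshold(x - grad / L, lam / L)
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
x = proximal_gradient(A, b, lam=0.1)
```

The paper's contribution can be read as replacing the hand-derived worst-case bounds for methods of this kind with exact, SDP-certified ones, and then optimizing the step coefficients against those bounds.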
Implications and Future Directions
This research significantly impacts how FOMs are assessed and understood in composite convex optimization. Practically, it enables algorithms to be deployed with exact, a priori worst-case bounds. Theoretically, it opens up exploration into more refined methods that might leverage similar SDP formulations or extend to broader classes of convex optimization problems.
The paper also posits the intriguing prospect of further extending SDP formulations to accommodate dynamic step sizes, potentially unlocking new capabilities for adaptive step size rules in first-order methods. This trajectory suggests fruitful future research avenues in optimizing and automating step-size choices to enhance convergence rates without manual tuning.
As performance estimation becomes more sophisticated, its application will undoubtedly broaden, perhaps influencing large-scale optimization and other domains where high-dimensional optimization is prevalent. The framework and results in this research pave the way for both immediate impacts in optimization problem-solving and longer-term innovations in algorithm design.