- The paper introduces NAG-free, which efficiently estimates the strong convexity parameter online without reliance on predetermined restart schemes.
- It couples Nesterov’s accelerated gradient with an inexpensive estimator to dynamically approximate both strong convexity and Lipschitz smoothness parameters.
- Theory and extensive experiments show that NAG-free converges globally at least as fast as gradient descent while accelerating locally, across diverse optimization problems.
Adaptive Acceleration Without Strong Convexity Priors Or Restarts
Introduction
This paper addresses a significant challenge in optimization: estimating strong convexity parameters efficiently without resorting to restart schemes. Traditional accelerated methods rely heavily on problem parameters such as Lipschitz smoothness and strong convexity, which are often unknown in practice. The paper proposes NAG-free, an algorithm that estimates the strong convexity parameter m online, without prior information or restarts, thereby removing the dependence of conventional accelerated methods on precise parameter knowledge.
Methodology
NAG-free couples Nesterov's accelerated gradient (NAG) method with an inexpensive estimator requiring minimal additional computation. It leverages the iterates and gradients already computed in standard NAG to approximate the strong convexity parameter efficiently. The paper introduces a complementary estimator for the Lipschitz smoothness parameter L, enabling a fully parameter-free setup where both L and m are estimated online.
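The coupling can be sketched in a few lines. The estimator recurrences below are illustrative assumptions (secant-style difference quotients over consecutive lookahead points), not necessarily the paper's exact update rules; `grad` is the user-supplied gradient oracle.

```python
import numpy as np

def nag_free(grad, x0, iters=2000, eps=1e-12):
    """Sketch of NAG coupled with online secant estimates of L and m.

    The estimator recurrences here are illustrative assumptions,
    not necessarily the paper's exact update rules.
    """
    x = np.asarray(x0, dtype=float)
    y = x.copy()
    y_prev = g_prev = None
    L_hat, m_hat = 1.0, None            # crude initial guesses
    for _ in range(iters):
        g = grad(y)
        if y_prev is not None:
            dx, dg = y - y_prev, g - g_prev
            nx = np.linalg.norm(dx)
            if nx > eps:
                # Secant curvature bounds from iterates and gradients
                # NAG already computed -- essentially free to maintain.
                L_hat = max(L_hat, np.linalg.norm(dg) / nx)
                quot = max(dg.dot(dx) / nx**2, eps)
                m_hat = quot if m_hat is None else min(m_hat, quot)
        kappa = L_hat / m_hat if m_hat else L_hat / eps
        beta = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)
        x_new = y - g / L_hat           # gradient step with estimated 1/L
        y_prev, g_prev = y, g
        y = x_new + beta * (x_new - x)  # Nesterov extrapolation
        x = x_new
    return x
```

On a smooth, strongly convex function both quotients lie between the true m and L, so the running estimates stay in the valid range once a few iterations have been observed.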
Theoretical Results
The theoretical backbone of NAG-free is its convergence guarantees. The paper proves that in the canonical smooth, strongly convex setting, where the parameters are hard to determine in advance, NAG-free converges globally at least as fast as gradient descent, and locally at an accelerated rate. The local analysis rests on the behavior of the iterates near the optimum, where power-iteration-like dynamics drive the estimates toward the true curvature and enable rapid convergence.
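The power-iteration intuition can be illustrated on a toy quadratic (this demonstration is mine, not the paper's analysis): near the optimum, the displacement between consecutive iterates aligns with an extremal eigendirection of the Hessian, so a cheap secant quotient recovers an extreme eigenvalue. For plain gradient descent with a small step, the surviving direction is the slowest one, so the quotient recovers the strong convexity parameter m.

```python
import numpy as np

# Quadratic f(x) = 0.5 x^T A x with eigenvalues m = 1 and L = 10.
A = np.diag([1.0, 10.0])
alpha = 0.05                     # step size below 1/L
x = np.array([1.0, 1.0])
for _ in range(200):
    x_prev = x
    x = x - alpha * (A @ x)      # gradient descent step
dx = x - x_prev                  # displacement aligns with the slow mode
dg = A @ x - A @ x_prev          # corresponding gradient difference
m_hat = dg.dot(dx) / dx.dot(dx)  # secant (Rayleigh) quotient
print(m_hat)                     # ~ 1.0, the smallest eigenvalue
```

The fast mode (eigenvalue 10) is contracted by a factor 0.5 per step and vanishes after a few dozen iterations, leaving the quotient to converge to m = 1.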
Key Theoretical Insights:
- Global Convergence: NAG-free converges globally at a rate no worse than conventional gradient descent, established through a Lyapunov-based analysis.
- Local Acceleration: near the optimum the iterates exhibit power-iteration-like behavior, so the parameter estimates sharpen and the method attains an accelerated local rate.
- Parameter Estimation: the notion of effective curvature is central, with a recurrence that dynamically tightens the estimates of m and L as the iterates progress.
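One plausible form of such a recurrence (an illustrative assumption on my part, not necessarily the paper's exact rule) keeps running secant bounds on the curvature observed so far:

```latex
\[
L_{k} = \max\!\Bigl(L_{k-1},\ \frac{\lVert \nabla f(y_k) - \nabla f(y_{k-1}) \rVert}{\lVert y_k - y_{k-1} \rVert}\Bigr),
\qquad
m_{k} = \min\!\Bigl(m_{k-1},\ \frac{\langle \nabla f(y_k) - \nabla f(y_{k-1}),\, y_k - y_{k-1} \rangle}{\lVert y_k - y_{k-1} \rVert^{2}}\Bigr),
\]
```

with the momentum then set from the estimated condition number, e.g. \(\beta_k = (\sqrt{\kappa_k}-1)/(\sqrt{\kappa_k}+1)\) for \(\kappa_k = L_k / m_k\). For an L-smooth, m-strongly convex f, both quotients lie in [m, L] by the Cauchy–Schwarz inequality, so the estimates never leave the true interval.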
Numerical Experiments
Extensive empirical evidence shows that NAG-free frequently surpasses traditional restart-based methods and accelerated algorithms tuned with predetermined parameters. The paper investigates:
- Smoothed log-sum-exp and logistic regression problems
- Regularized logistic regression and cubic cost functions
- Nonconvex matrix factorization
These experiments demonstrate the robustness and efficiency of NAG-free across diverse problem scales and types, particularly in cases where m and L vary significantly.
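As a concrete instance of the first test family, a smoothed log-sum-exp objective and its gradient can be set up in a few lines; the data below is random, not the paper's benchmark.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))   # random problem data (illustrative only)
b = rng.standard_normal(50)
mu = 0.1                            # smoothing parameter

def f(x):
    """Smoothed max: f(x) = mu * log(sum_i exp((a_i' x - b_i) / mu))."""
    z = (A @ x - b) / mu
    zmax = z.max()                  # stabilize the exponentials
    return mu * (zmax + np.log(np.exp(z - zmax).sum()))

def grad_f(x):
    z = (A @ x - b) / mu
    p = np.exp(z - z.max())
    p /= p.sum()                    # softmax weights
    return A.T @ p                  # convex combination of the rows of A
```

Shrinking mu brings f closer to the nonsmooth max but inflates the smoothness constant (L scales like ||A||^2 / mu), while the strong convexity can vanish entirely, which is exactly the regime where online parameter estimation is attractive.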
Discussion
NAG-free has clear practical implications, offering a viable alternative when restart schemes are computationally expensive or impractical. By estimating the critical parameters in situ, it removes much of the guesswork traditionally associated with accelerated optimization methods. Moreover, its demonstrated effectiveness in nonconvex settings extends its applicability beyond strictly convex scenarios, paving the way for more adaptive optimization strategies in diverse applications.
Conclusion and Future Directions
This work positions NAG-free as a strong candidate for adaptive acceleration in optimization tasks whose parameters are unknown a priori. The theoretical guarantees and numerical validation make a compelling case for its utility on complex problems. Looking forward, coupling effective parameter estimators with alternative algorithms such as TMM may yield even stronger convergence guarantees. Further theoretical development, especially concerning L-estimation in non-strongly-convex settings, could broaden NAG-free's applicability and enhance optimization practice across machine learning and AI contexts.