Heavy-ball Differential Equation Achieves $O(\varepsilon^{-7/4})$ Convergence for Nonconvex Functions
Abstract: First-order optimization methods for nonconvex functions with Lipschitz continuous gradient and Hessian have been extensively studied. State-of-the-art methods for finding an $\varepsilon$-stationary point within $O(\varepsilon^{-7/4})$ or $\tilde{O}(\varepsilon^{-7/4})$ gradient evaluations are based on Nesterov's accelerated gradient descent (AGD) or Polyak's heavy-ball (HB) method. However, these algorithms employ additional mechanisms, such as restart schemes and negative curvature exploitation, which complicate their behavior and make it challenging to apply them to more advanced settings (e.g., stochastic optimization). As a first step in investigating whether a simple algorithm with $O(\varepsilon^{-7/4})$ complexity can be constructed without such additional mechanisms, we study the HB differential equation, a continuous-time analogue of the AGD and HB methods. We prove that its dynamics attain an $\varepsilon$-stationary point within $O(\varepsilon^{-7/4})$ time.
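The HB differential equation studied in the abstract is the second-order ODE $\ddot{X}(t) + \gamma \dot{X}(t) + \nabla f(X(t)) = 0$, whose damped trajectories model the momentum of the discrete HB and AGD methods. The sketch below numerically integrates this ODE with a simple semi-implicit Euler scheme on a toy nonconvex function; the damping coefficient, test function, and step size are illustrative assumptions, not values from the paper.

```python
import numpy as np

def grad(x):
    # Gradient of the toy nonconvex function f(x) = x**4/4 - x**2/2
    # (an illustrative choice, not from the paper); stationary points
    # are x = -1, 0, 1.
    return x**3 - x

def heavy_ball_ode(x0, v0, damping=0.5, dt=1e-3, t_max=50.0):
    """Integrate the heavy-ball ODE  x'' + damping * x' + grad f(x) = 0
    with semi-implicit Euler and return the final position x(t_max)."""
    x, v = x0, v0
    for _ in range(int(t_max / dt)):
        v += dt * (-damping * v - grad(x))  # velocity update (friction + force)
        x += dt * v                          # position update
    return x

x_final = heavy_ball_ode(x0=2.0, v0=0.0)
print(abs(grad(x_final)))  # gradient norm at the (approximately) stationary endpoint
```

Because the friction term dissipates energy, the trajectory settles near one of the stationary points of $f$, so the printed gradient norm is close to zero; the paper's contribution is quantifying how fast $\|\nabla f(X(t))\|$ falls below $\varepsilon$ as a function of $\varepsilon$.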