
Balancing Gradient and Hessian Queries in Non-Convex Optimization

Published 23 Oct 2025 in math.OC and cs.DS (arXiv:2510.20786v1)

Abstract: We develop optimization methods which offer new trade-offs between the number of gradient and Hessian computations needed to compute a critical point of a non-convex function. We provide a method that for any twice-differentiable $f\colon \mathbb R^d \rightarrow \mathbb R$ with $L_2$-Lipschitz Hessian, input initial point with $\Delta$-bounded sub-optimality, and sufficiently small $\epsilon > 0$, outputs an $\epsilon$-critical point, i.e., a point $x$ such that $\|\nabla f(x)\| \leq \epsilon$, using $\tilde{O}(L_2^{1/4} n_H^{-1/2}\Delta\epsilon^{-9/4})$ queries to a gradient oracle and $n_H$ queries to a Hessian oracle for any positive integer $n_H$. As a consequence, we obtain an improved gradient query complexity of $\tilde{O}(d^{1/3}L_2^{1/2}\Delta\epsilon^{-3/2})$ in the case of bounded dimension and of $\tilde{O}(L_2^{3/4}\Delta^{3/2}\epsilon^{-9/4})$ in the case where we are allowed only a \emph{single} Hessian query. We obtain these results through a more general algorithm which can handle approximate Hessian computations and recovers the state-of-the-art bound of computing an $\epsilon$-critical point with $O(L_1^{1/2}L_2^{1/4}\Delta\epsilon^{-7/4})$ gradient queries provided that $f$ also has an $L_1$-Lipschitz gradient.
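The headline bound trades gradient queries against the Hessian budget $n_H$ at rate $n_H^{-1/2}$. The following is a minimal sketch of that trade-off as stated in the abstract; the function name is illustrative, and the polylogarithmic factors hidden by the $\tilde{O}$ (as well as all constants) are ignored.

```python
import math

def grad_query_bound(L2, Delta, eps, n_H):
    """Gradient-query bound ~ L2^{1/4} * n_H^{-1/2} * Delta * eps^{-9/4}
    from the abstract, with tilde-O constants and log factors dropped."""
    return L2 ** 0.25 * n_H ** -0.5 * Delta * eps ** -2.25

# Doubling the Hessian-oracle budget n_H shrinks the gradient-query
# bound by a factor of sqrt(2), per the n_H^{-1/2} dependence.
one_hessian = grad_query_bound(L2=1.0, Delta=1.0, eps=1e-3, n_H=1)
two_hessians = grad_query_bound(L2=1.0, Delta=1.0, eps=1e-3, n_H=2)
ratio = one_hessian / two_hessians  # ~ sqrt(2)
```

This is only the scaling behavior, not the algorithm itself; the paper's method attains the bound via approximate Hessian computations.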
