
Zeroth-order Stochastic Cubic Newton Method Revisited

Published 16 Oct 2024 in math.OC (arXiv:2410.22357v4)

Abstract: This paper studies stochastic minimization of a finite-sum loss $F(\mathbf{x}) = \frac{1}{N} \sum_{\xi=1}^{N} f(\mathbf{x};\xi)$. In many real-world scenarios, the Hessian matrix of such objectives exhibits a low-rank structure on a batch of data. At the same time, zeroth-order optimization has gained prominence in important applications such as fine-tuning LLMs. Drawing on these observations, we propose a novel stochastic zeroth-order cubic Newton method that leverages the low-rank Hessian structure via a matrix recovery-based estimation technique. Our method circumvents restrictive incoherence assumptions, enabling accurate Hessian approximation through finite-difference queries. Theoretically, we establish that for most real-world problems in $\mathbb{R}^n$, $\mathcal{O}\left(\frac{n}{\eta^{7/2}}\right) + \widetilde{\mathcal{O}}\left(\frac{n^2}{\eta^{5/2}}\right)$ function evaluations suffice to attain a second-order $\eta$-stationary point with high probability, a significant improvement in dimensional dependence over existing methods. The improvement stems mainly from a new Hessian estimator with superior sample complexity, which may be of independent interest. Numerical experiments on matrix recovery and machine learning tasks validate the efficacy and scalability of our approach.
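To make the abstract's core idea concrete, here is a minimal illustrative sketch of zeroth-order low-rank Hessian estimation: entries of $H S$ are measured with four-point finite differences of function values only, and a low-rank Hessian is then reconstructed Nyström-style. This is my own simplified stand-in (all function and variable names are mine), not the paper's actual matrix recovery-based estimator.

```python
import numpy as np

def fd_bilinear(f, x, u, v, h=1e-4):
    """Zeroth-order estimate of u^T H v via a four-point finite difference.

    Uses only function evaluations; exact (up to roundoff) for quadratics.
    """
    return (f(x + h * u + h * v) - f(x + h * u) - f(x + h * v) + f(x)) / h**2

def lowrank_hessian_estimate(f, x, r, h=1e-4, oversample=5, seed=0):
    """Illustrative sketch: recover a low-rank Hessian from O(n * (r + p))
    finite-difference queries, rather than the O(n^2) entries of the full matrix.
    """
    n = x.size
    m = r + oversample                              # probe a few extra directions
    rng = np.random.default_rng(seed)
    S = rng.standard_normal((n, m)) / np.sqrt(n)    # random probe directions
    E = np.eye(n)
    # Measure Y ~= H S entrywise, using function values only.
    Y = np.empty((n, m))
    for i in range(n):
        for j in range(m):
            Y[i, j] = fd_bilinear(f, x, E[i], S[:, j], h)
    C = S.T @ Y                                     # core matrix ~= S^T H S
    # Nystrom-style reconstruction H ~= Y C^+ Y^T; rcond truncates noise modes.
    H_hat = Y @ np.linalg.pinv(C, rcond=1e-6) @ Y.T
    return 0.5 * (H_hat + H_hat.T)                  # symmetrize
```

For a rank-$r$ Hessian this recovers the matrix from roughly $n(r + p)$ bilinear measurements; the paper's contribution is an estimator and analysis that avoid the incoherence assumptions such recovery schemes typically require.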


Authors (4)
