Zeroth-order Stochastic Cubic Newton Method Revisited
Abstract: This paper studies stochastic minimization of a finite-sum loss $F(\mathbf{x}) = \frac{1}{N} \sum_{\xi=1}^{N} f(\mathbf{x};\xi)$. In many real-world scenarios, the Hessian matrix of such objectives exhibits a low-rank structure on a batch of data. At the same time, zeroth-order optimization has gained prominence in important applications such as fine-tuning LLMs. Drawing on these observations, we propose a novel stochastic zeroth-order cubic Newton method that leverages the low-rank Hessian structure via a matrix-recovery-based estimation technique. Our method circumvents restrictive incoherence assumptions, enabling accurate Hessian approximation through finite-difference queries. Theoretically, we establish that for most real-world problems in $\mathbb{R}^n$, $\mathcal{O}\left(\frac{n}{\eta^{7/2}}\right)+\widetilde{\mathcal{O}}\left(\frac{n^2}{\eta^{5/2}}\right)$ function evaluations suffice to attain a second-order $\eta$-stationary point with high probability. This represents a significant improvement in dimensional dependence over existing methods, driven largely by a new Hessian estimator with superior sample complexity that may be of independent interest. Numerical experiments on matrix recovery and machine learning tasks validate the efficacy and scalability of our approach.
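As a point of reference for the finite-difference Hessian approximation mentioned in the abstract, the sketch below builds a zeroth-order Hessian estimate entry by entry from symmetric second differences of function values, then truncates it to low rank via an eigendecomposition. This is a generic illustration, not the paper's recovery-based estimator: the naive loop here costs $\mathcal{O}(n^2)$ function queries per Hessian, which is precisely the cost that a matrix-recovery approach exploiting low rank aims to reduce. The function name and the parameters `r` (target rank) and `h` (difference step) are illustrative choices.

```python
import numpy as np

def zo_hessian_lowrank(f, x, r, h=1e-4):
    """Zeroth-order dense Hessian estimate at x, truncated to rank r.

    Uses the symmetric second difference
        H_ij ~ (f(x+h e_i+h e_j) - f(x+h e_i) - f(x+h e_j) + f(x)) / h^2,
    which requires O(n^2) function evaluations (a baseline sketch, not
    the paper's recovery-based estimator).
    """
    n = x.size
    H = np.empty((n, n))
    f0 = f(x)
    for i in range(n):
        ei = np.zeros(n)
        ei[i] = h
        for j in range(i, n):
            ej = np.zeros(n)
            ej[j] = h
            # second-order finite difference for entry (i, j)
            Hij = (f(x + ei + ej) - f(x + ei) - f(x + ej) + f0) / h**2
            H[i, j] = H[j, i] = Hij
    # symmetric eigendecomposition; keep the r largest-magnitude eigenpairs
    w, V = np.linalg.eigh(H)
    idx = np.argsort(-np.abs(w))[:r]
    return (V[:, idx] * w[idx]) @ V[:, idx].T
```

On a quadratic $f(\mathbf{x}) = \frac{1}{2}\mathbf{x}^\top A \mathbf{x}$ with low-rank $A$, the second difference is exact up to floating-point error, so the truncated estimate recovers $A$; for general smooth $f$ the step `h` trades off truncation error against numerical cancellation.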