Faster Gradient Methods for Highly-smooth Stochastic Bilevel Optimization

Published 3 Sep 2025 in math.OC, cs.LG, and stat.ML | arXiv:2509.02937v1

Abstract: This paper studies the complexity of finding an $\epsilon$-stationary point for stochastic bilevel optimization when the upper-level problem is nonconvex and the lower-level problem is strongly convex. Recent work proposed the first-order method F${}^2$SA, achieving the $\tilde{\mathcal{O}}(\epsilon^{-6})$ upper complexity bound for first-order smooth problems. This is slower than the optimal $\Omega(\epsilon^{-4})$ complexity lower bound in its single-level counterpart. In this work, we show that faster rates are achievable for higher-order smooth problems. We first reformulate F${}^2$SA as approximating the hyper-gradient with a forward difference. Based on this observation, we propose a class of methods F${}^2$SA-$p$ that uses $p$th-order finite differences for hyper-gradient approximation and improves the upper bound to $\tilde{\mathcal{O}}(p\,\epsilon^{-4-2/p})$ for $p$th-order smooth problems. Finally, we demonstrate that the $\Omega(\epsilon^{-4})$ lower bound also holds for stochastic bilevel problems when the high-order smoothness holds for the lower-level variable, indicating that the upper bound of F${}^2$SA-$p$ is nearly optimal in the highly smooth regime $p = \Omega(\log \epsilon^{-1} / \log \log \epsilon^{-1})$.
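To illustrate the finite-difference idea the abstract refers to, here is a minimal scalar sketch (not the paper's bilevel algorithm): a first-order forward difference has $O(h)$ truncation error, while a higher-order stencil such as the central difference achieves $O(h^2)$, mirroring how higher-order schemes buy accuracy under extra smoothness. The function and step size below are illustrative choices, not from the paper.

```python
import math

def forward_diff(f, x, h=1e-5):
    # First-order forward difference: error O(h).
    return (f(x + h) - f(x)) / h

def central_diff(f, x, h=1e-5):
    # Second-order central difference: error O(h^2).
    return (f(x + h) - f(x - h)) / (2 * h)

# Toy check on f(x) = sin(x), whose exact derivative is cos(x).
x = 1.0
exact = math.cos(x)
err_fwd = abs(forward_diff(math.sin, x) - exact)
err_cen = abs(central_diff(math.sin, x) - exact)
```

For a $p$th-order smooth target, a stencil combining $p$ evaluations can push the error to $O(h^p)$; the paper applies this principle to hyper-gradient approximation in the bilevel setting.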


Authors (3)
