
A Variable Sample-size Stochastic Quasi-Newton Method for Smooth and Nonsmooth Stochastic Convex Optimization

Published 15 Apr 2018 in math.OC | (1804.05368v5)

Abstract: Classical theory for quasi-Newton schemes has focused on smooth deterministic unconstrained optimization while recent forays into stochastic convex optimization have largely resided in smooth, unconstrained, and strongly convex regimes. Naturally, there is a compelling need to address nonsmoothness, the lack of strong convexity, and the presence of constraints. Accordingly, this paper presents a quasi-Newton framework that can process merely convex and possibly nonsmooth (but smoothable) stochastic convex problems. We propose a framework that combines iterative smoothing and regularization with a variance-reduced scheme reliant on using increasing sample-sizes of gradients. We make the following contributions. (i) We develop a regularized and smoothed variable sample-size BFGS update (rsL-BFGS) that generates a sequence of Hessian approximations and can accommodate nonsmooth convex objectives by utilizing iterative regularization and smoothing. (ii) In strongly convex regimes with state-dependent noise, the proposed variable sample-size stochastic quasi-Newton scheme admits a non-asymptotic linear rate of convergence while the oracle complexity of computing an $\epsilon$-solution is $\mathcal{O}(\kappa^{m+1}/\epsilon)$ where $\kappa$ is the condition number and $m\geq 1$. In nonsmooth (but smoothable) regimes, using Moreau smoothing retains the linear convergence rate. To contend with the possible unavailability of Lipschitzian and strong convexity parameters, we also provide sublinear rates; (iii) In merely convex but smooth settings, the regularized VS-SQN scheme rVS-SQN displays a rate of $\mathcal{O}(1/k^{(1-\varepsilon)})$. When the smoothness requirements are weakened, the rate for the regularized and smoothed VS-SQN scheme worsens to $\mathcal{O}(k^{-1/3})$. Such statements allow for a state-dependent noise assumption under a quadratic growth property.
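
To make the phrase "nonsmooth (but smoothable)" concrete, the sketch below computes the Moreau envelope of the absolute value, a standard textbook example rather than anything taken from the paper. The function names (`prox_abs`, `moreau_abs`, `moreau_abs_grad`) and the choice of $f(x) = |x|$ are illustrative assumptions. The envelope $f_\eta(x) = \min_u \{ |u| + \tfrac{1}{2\eta}(x-u)^2 \}$ is differentiable with a $1/\eta$-Lipschitz gradient and stays within $\eta/2$ of $|x|$, which is the kind of trade-off an iterative smoothing scheme can exploit by shrinking $\eta$.

```python
def prox_abs(x: float, eta: float) -> float:
    """Proximal operator of |.| with parameter eta (soft-thresholding)."""
    if x > eta:
        return x - eta
    if x < -eta:
        return x + eta
    return 0.0


def moreau_abs(x: float, eta: float) -> float:
    """Moreau envelope of |.|: min over u of |u| + (x - u)^2 / (2 * eta)."""
    u = prox_abs(x, eta)  # the minimizer of the inner problem
    return abs(u) + (x - u) ** 2 / (2.0 * eta)


def moreau_abs_grad(x: float, eta: float) -> float:
    """Gradient of the envelope: (x - prox(x)) / eta, Lipschitz with constant 1/eta."""
    return (x - prox_abs(x, eta)) / eta


if __name__ == "__main__":
    # Illustrative smoothing parameter; a smaller eta gives a tighter but less smooth surrogate.
    eta = 1.0
    for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
        print(x, moreau_abs(x, eta), moreau_abs_grad(x, eta))
```

For $|x| \le \eta$ the envelope is the quadratic $x^2/(2\eta)$, and for $|x| > \eta$ it is $|x| - \eta/2$, so the kink at the origin is replaced by a smooth bowl.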
