A globalization of L-BFGS and the Barzilai-Borwein method for nonconvex unconstrained optimization
Abstract: We present a modified limited memory BFGS (L-BFGS) method that converges globally and linearly for nonconvex objective functions. Its distinguishing feature is that it turns into L-BFGS if the iterates cluster at a point near which the objective is strongly convex with Lipschitz gradients, thereby inheriting the outstanding effectiveness of the classical method. These strong convergence guarantees are enabled by a novel form of cautious updating in which, among other features, it is decided anew in each iteration which of the stored pairs are used for updating and which ones are skipped. In particular, this yields the first modification of cautious updating for which all cluster points are stationary while the spectrum of the L-BFGS operator is not permanently restricted, and this holds without Lipschitz continuity of the gradient. In fact, for Wolfe-Powell line searches we show that continuity of the gradient is sufficient for global convergence, a result that extends to other descent methods. Since we allow the memory size to be zero in the globalized L-BFGS method, we also obtain a new globalization of the Barzilai-Borwein spectral gradient (BB) method. The convergence analysis is developed in Hilbert space under comparably weak assumptions and covers Armijo and Wolfe-Powell line searches. We illustrate the theoretical findings with numerical experiments, which indicate that if one of the parameters of the cautious updating is chosen sufficiently small, then the modified method agrees entirely with L-BFGS/BB; this observation is also discussed in the theoretical part. An implementation of the new method is available on arXiv.
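To make the ingredients concrete, the following is a minimal, hypothetical sketch of a BB spectral gradient iteration globalized by an Armijo backtracking line search, where a cautious-style curvature test decides whether the current pair (s, y) is used for the step size or skipped. It is an illustration under stated assumptions, not the paper's algorithm; all names, thresholds, and the fallback rule are choices made here for the example.

```python
import numpy as np

def bb_armijo(f, grad, x0, max_iter=200, tol=1e-8, eps=1e-6,
              sigma=1e-4, beta=0.5, alpha0=1.0):
    """Barzilai-Borwein (BB1) gradient method with Armijo backtracking.

    Illustrative sketch only: the BB step alpha = s^T s / s^T y is adopted
    only if the cautious-style curvature test s^T y >= eps * ||s||^2 holds;
    otherwise the pair is skipped and a default step is used instead.
    """
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    alpha = alpha0
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:
            break
        d = -g
        # Armijo backtracking on the step length t, starting from the BB guess
        t = alpha
        fx = f(x)
        while f(x + t * d) > fx + sigma * t * g.dot(d):
            t *= beta
        x_new = x + t * d
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        sty = s.dot(y)
        # cautious-style test: use the pair only if it has enough positive curvature
        if sty >= eps * s.dot(s):
            alpha = s.dot(s) / sty      # BB1 step size
        else:
            alpha = alpha0              # skip the pair, fall back to the default
        x, g = x_new, g_new
    return x
```

On a strongly convex quadratic the curvature test always passes and the iteration reduces to the plain BB method with a monotone safeguard, mirroring the abstract's point that the globalization is inactive near points where the objective is strongly convex.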