Divergence of the ADAM algorithm with fixed-stepsize: a (very) simple example
Published 1 Aug 2023 in cs.LG (arXiv:2308.00720v1)
Abstract: A very simple one-dimensional function with a Lipschitz continuous gradient is constructed such that the ADAM algorithm with constant stepsize, started from the origin, diverges when applied to minimize this function in the absence of noise on the gradient. Divergence occurs irrespective of the choice of the method's parameters.
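For reference, the algorithm analyzed in the paper is the standard ADAM update of Kingma and Ba. The paper's counterexample function is not reproduced in the abstract, so the sketch below runs ADAM on a placeholder objective f(x) = x^2 purely to illustrate the update rule; the function, starting point, and hyperparameters are assumptions, not the paper's construction.

```python
import math

def adam_step(x, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One iteration of the standard ADAM update with a fixed stepsize lr."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    x = x - lr * m_hat / (math.sqrt(v_hat) + eps)
    return x, m, v

# Illustrative run on f(x) = x^2 (NOT the paper's divergent example):
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 1001):
    grad = 2.0 * x          # exact (noise-free) gradient of x^2
    x, m, v = adam_step(x, grad, m, v, t)
```

On this benign quadratic the iterates shrink toward the minimizer; the paper's contribution is a one-dimensional function with Lipschitz continuous gradient on which the same fixed-stepsize iteration diverges from the origin for every parameter choice.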