On Convergence of Incremental Gradient for Non-Convex Smooth Functions
Abstract: In machine learning and neural network optimization, algorithms like incremental gradient, and shuffle SGD are popular due to minimizing the number of cache misses and good practical convergence behavior. However, their optimization properties in theory, especially for non-convex smooth functions, remain incompletely explored. This paper delves into the convergence properties of SGD algorithms with arbitrary data ordering, within a broad framework for non-convex smooth functions. Our findings show enhanced convergence guarantees for incremental gradient and single shuffle SGD. Particularly if $n$ is the training set size, we improve $n$ times the optimization term of convergence guarantee to reach accuracy $\varepsilon$ from $O(n / \varepsilon)$ to $O(1 / \varepsilon)$.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.