
Local SGD and Federated Averaging Through the Lens of Time Complexity

Published 27 Sep 2025 in math.OC (2509.23207v1)

Abstract: We revisit the classical Local SGD and Federated Averaging (FedAvg) methods for distributed optimization and federated learning. While prior work has primarily focused on iteration complexity, we analyze these methods through the lens of time complexity, taking into account both computation and communication costs. Our analysis reveals that, despite its favorable iteration complexity, the time complexity of canonical Local SGD is provably worse than that of Minibatch SGD and Hero SGD (locally executed SGD). We introduce a corrected variant, Dual Local SGD, and further improve it by increasing the local step sizes, leading to a new method called Decaying Local SGD. Our analysis shows that these modifications, together with Hero SGD, are optimal in the nonconvex setting (up to logarithmic factors), closing the time complexity gap. Finally, we use these insights to improve the theory of a number of other asynchronous and local methods.
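To make the distinction between iteration complexity and time complexity concrete, here is a minimal sketch of canonical Local SGD (FedAvg-style averaging) together with a simple wall-clock cost model. It is an illustration under stated assumptions, not the paper's algorithm or analysis: the function names `local_sgd` and `wall_clock_time`, the quadratic test objective, and the cost model (each stochastic gradient takes `h` seconds, each communication round takes `tau` seconds) are all hypothetical choices made for this example.

```python
import numpy as np


def local_sgd(grad, x0, n_workers, rounds, local_steps, lr, noise_std, rng):
    """Canonical Local SGD: every worker runs `local_steps` SGD steps
    starting from the shared point, then the iterates are averaged."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(rounds):
        local_iterates = []
        for _ in range(n_workers):
            y = x.copy()
            for _ in range(local_steps):
                # Stochastic gradient = true gradient + Gaussian noise (assumption).
                g = grad(y) + noise_std * rng.standard_normal(y.shape)
                y -= lr * g
            local_iterates.append(y)
        # One communication per round: average the local iterates.
        x = np.mean(local_iterates, axis=0)
    return x


def wall_clock_time(rounds, local_steps, h, tau):
    """Assumed cost model: workers compute in parallel, so one round costs
    local_steps * h seconds of computation plus tau seconds of communication."""
    return rounds * (local_steps * h + tau)


# Usage on the toy objective f(x) = 0.5 * ||x||^2, whose gradient is x.
rng = np.random.default_rng(0)
x_final = local_sgd(
    grad=lambda y: y,
    x0=np.ones(3),
    n_workers=4,
    rounds=10,
    local_steps=5,
    lr=0.1,
    noise_std=0.0,  # noiseless here so the result is deterministic
    rng=rng,
)
total_time = wall_clock_time(rounds=10, local_steps=5, h=1.0, tau=3.0)
```

Under this cost model, two methods with the same number of communication rounds can differ in wall-clock time depending on how many local steps they take per round and on the ratio of `tau` to `h`, which is the kind of trade-off a time complexity analysis accounts for and an iteration count alone does not.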
