Papers
Topics
Authors
Recent
Search
2000 character limit reached

Critical Influence of Overparameterization on Sharpness-aware Minimization

Published 29 Nov 2023 in cs.LG, math.OC, and stat.ML | (2311.17539v5)

Abstract: Sharpness-Aware Minimization (SAM) has attracted considerable attention for its effectiveness in improving generalization in deep neural network training by explicitly minimizing sharpness in the loss landscape. Its success, however, relies on the assumption that there exists sufficient variability of flatness in the solution space-a condition commonly facilitated by overparameterization. Yet, the interaction between SAM and overparameterization has not been thoroughly investigated, leaving a gap in understanding precisely how overparameterization affects SAM. Thus, in this work, we analyze SAM under varying degrees of overparameterization, presenting both empirical and theoretical findings that reveal its critical influence on SAM's effectiveness. First, we conduct extensive numerical experiments across diverse domains, demonstrating that SAM consistently benefits from overparameterization. Next, we attribute this phenomenon to the interplay between the enlarged solution space and increased implicit bias resulting from overparameterization. Furthermore, we show that this effect is particularly pronounced in practical settings involving label noise and sparsity, and yet, sufficient regularization is necessary. Last but not least, we provide other theoretical insights into how overparameterization helps SAM achieve minima with more uniform Hessian moments compared to SGD, and much faster convergence at a linear rate.

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.