Gaussian Universality in Neural Network Dynamics with Generalized Structured Input Distributions

Published 1 May 2024 in stat.ML, cond-mat.dis-nn, cond-mat.stat-mech, and cs.LG | (2405.00642v3)

Abstract: Bridging the gap between the practical performance of deep learning and its theoretical foundations often involves analyzing neural networks through stochastic gradient descent (SGD). Expanding on previous research that focused on modeling structured inputs under a simple Gaussian setting, we analyze the behavior of a deep learning system trained on inputs modeled as Gaussian mixtures to better simulate more general structured inputs. Through empirical analysis and theoretical investigation, we demonstrate that under certain standardization schemes, the deep learning model converges toward Gaussian setting behavior, even when the input data follow more complex or real-world distributions. This finding exhibits a form of universality in which diverse structured distributions yield results consistent with Gaussian assumptions, which can support the theoretical understanding of deep learning models.
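
To make the setup concrete, the following is a minimal sketch of the kind of input pipeline the abstract describes: samples drawn from a Gaussian mixture and then standardized. The abstract does not specify which standardization schemes are used, so the scheme here (shifting and rescaling to zero empirical mean and unit empirical variance) is an assumption, as are the one-dimensional setting, the function names `sample_mixture` and `standardize`, and all mixture parameters.

```python
import random
import statistics


def sample_mixture(n_samples, means, stds, weights, rng):
    """Draw scalar samples from a 1-D Gaussian mixture (illustrative parameters)."""
    samples = []
    for _ in range(n_samples):
        # Pick a mixture component according to its weight, then sample from it.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        samples.append(rng.gauss(means[k], stds[k]))
    return samples


def standardize(xs):
    """Assumed scheme: shift and rescale to zero empirical mean, unit empirical variance."""
    mu = statistics.fmean(xs)
    sigma = statistics.pstdev(xs)
    return [(x - mu) / sigma for x in xs]


rng = random.Random(0)  # fixed seed for reproducibility
raw = sample_mixture(
    20000, means=[-2.0, 3.0], stds=[0.5, 1.5], weights=[0.3, 0.7], rng=rng
)
z = standardize(raw)

mean_z = statistics.fmean(z)
var_z = statistics.pvariance(z)
print(f"standardized mean = {mean_z:.3e}, variance = {var_z:.6f}")
```

Standardizing in this way matches only the first two moments of a standard Gaussian; the standardized samples remain bimodal. The universality claim in the abstract concerns the behavior of the trained model on such inputs, not the input distribution itself becoming Gaussian.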

Authors (2)
