
Geometry of Critical Sets and Existence of Saddle Branches for Two-layer Neural Networks

Published 26 May 2024 in cs.LG and math.OC | (2405.17501v1)

Abstract: This paper presents a comprehensive analysis of critical point sets in two-layer neural networks. To study these complex sets, we introduce two tools: the critical embedding operator and the critical reduction operator. Given a critical point, these operators uncover the entire underlying critical set representing the same output function, which exhibits a hierarchical structure. Furthermore, we prove the existence of saddle branches for any critical set whose output function can be represented by a narrower network. Our results provide a solid foundation for further study of the optimization and training behavior of neural networks.
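The abstract's notion of a critical embedding can be illustrated with a simple neuron-splitting construction: a neuron in a two-layer network is duplicated, with its output weight split between the two copies, producing a wider network that represents the same output function. The sketch below is illustrative only (the function names, the `tanh` activation, and the splitting ratio `t` are assumptions, not taken from the paper); it checks that the output function is preserved under the split.

```python
import numpy as np

def forward(a, W, x):
    """Two-layer network output f(x) = sum_k a_k * tanh(w_k . x).
    a: output weights (m,), W: input weights (m, d), x: input (d,)."""
    return a @ np.tanh(W @ x)

def split_neuron(a, W, k, t=0.3):
    """Widen the network by splitting neuron k: duplicate its input
    weight vector and split its output weight into t*a_k and (1-t)*a_k,
    so the two copies together contribute exactly what neuron k did."""
    a_new = np.concatenate([a, [(1 - t) * a[k]]])
    a_new[k] = t * a[k]
    W_new = np.vstack([W, W[k]])
    return a_new, W_new

rng = np.random.default_rng(0)
m, d = 3, 2
a, W = rng.standard_normal(m), rng.standard_normal((m, d))
x = rng.standard_normal(d)

a2, W2 = split_neuron(a, W, k=1)
# The widened network computes the same output function.
print(np.isclose(forward(a, W, x), forward(a2, W2, x)))
```

Such output-preserving maps from narrower to wider networks are the kind of structure the paper's critical embedding operator formalizes; applied at a critical point of the narrower network's loss, they trace out part of the hierarchical critical set described in the abstract.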



Authors (3)
