Finite-time Lyapunov exponents of deep neural networks

Published 21 Jun 2023 in cond-mat.dis-nn, cs.LG, and stat.ML (arXiv:2306.12548v1)

Abstract: We compute how small input perturbations affect the output of deep neural networks, exploring an analogy between deep networks and dynamical systems, where the growth or decay of local perturbations is characterised by finite-time Lyapunov exponents. We show that the maximal exponent forms geometrical structures in input space, akin to coherent structures in dynamical systems. Ridges of large positive exponents divide input space into different regions that the network associates with different classes. These ridges visualise the geometry that deep networks construct in input space, shedding light on the fundamental mechanisms underlying their learning capabilities.
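The quantity at the heart of the abstract, the maximal finite-time Lyapunov exponent, can be read off the input-output Jacobian of a network: over L layers, lambda_max = (1/L) log sigma_max(J), where sigma_max is the largest singular value. The sketch below is a minimal illustration with NumPy, assuming a toy tanh network with random, untrained weights; it is not the paper's trained classifiers, only a demonstration of how the exponent is computed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy deep network: five 8x8 tanh layers with random weights
# (an illustrative assumption, not the paper's trained networks).
layers = [rng.standard_normal((8, 8)) / np.sqrt(8) for _ in range(5)]

def forward(x):
    for W in layers:
        x = np.tanh(W @ x)
    return x

def jacobian(x):
    # Chain rule through the layers: J = D_L W_L ... D_1 W_1,
    # where D_l is the diagonal matrix of tanh' at layer l.
    J = np.eye(len(x))
    for W in layers:
        pre = W @ x
        J = np.diag(1.0 - np.tanh(pre) ** 2) @ W @ J
        x = np.tanh(pre)
    return J

def max_ftle(x):
    # Maximal finite-time Lyapunov exponent over L layers:
    # lambda_max = (1/L) * log sigma_max(J).
    sigma_max = np.linalg.svd(jacobian(x), compute_uv=False)[0]
    return np.log(sigma_max) / len(layers)

x0 = rng.standard_normal(8)
print(max_ftle(x0))
```

Scanning `max_ftle` over a grid of inputs is how one would visualise the ridges the abstract describes: regions where the exponent is large and positive mark boundaries between classes.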
