Any-dimensional equivariant neural networks

Published 10 Jun 2023 in cs.LG, math.RT, and stat.ML (arXiv:2306.06327v2)

Abstract: Traditional supervised learning aims to learn an unknown mapping by fitting a function to a set of input-output pairs with a fixed dimension. The fitted function is then defined on inputs of the same dimension. However, in many settings, the unknown mapping takes inputs in any dimension; examples include graph parameters defined on graphs of any size and physics quantities defined on an arbitrary number of particles. We leverage a newly-discovered phenomenon in algebraic topology, called representation stability, to define equivariant neural networks that can be trained with data in a fixed dimension and then extended to accept inputs in any dimension. Our approach is user-friendly, requiring only the network architecture and the groups for equivariance, and can be combined with any training procedure. We provide a simple open-source implementation of our methods and offer preliminary numerical experiments.

Summary

  • The paper introduces a method that leverages representation stability to parameterize equivariant networks independently of the input dimension.
  • It enforces compatibility conditions across dimensions, which regularize the learned mappings and support generalization to input sizes not seen during training.
  • It provides a practical computational framework, with algorithms for computing equivariant bases and extending weights and biases to new dimensions, validated by preliminary numerical experiments.

Any-Dimensional Equivariant Neural Networks

The paper "Any-Dimensional Equivariant Neural Networks" addresses the challenge of extending the applicability of neural networks beyond fixed-dimensional inputs to accommodate inputs of varying sizes. This approach leverages the concept of representation stability from algebraic topology to construct equivariant neural networks that can be trained in a fixed dimension and extended to any dimension. The method offers a systematic and user-friendly solution to one of the challenges often encountered in supervised learning.

Core Contributions

The work tackles three primary challenges:

  1. Parameterization of Infinite Sequences: The paper shows that entire sequences of equivariant networks, one per input dimension, can be described by finitely many parameters. Representation stability guarantees that the spaces of equivariant layers are finitely generated, so a single parameter set defines the network in every dimension.
  2. Generalization Across Dimensions: A network trained in one dimension need not behave sensibly in another. The authors impose a compatibility condition on the layer sequences, which acts as a structural regularizer and makes the learned mappings consistent across input dimensions.
  3. A Computational Framework: The paper gives a concrete recipe for learning these networks, using standard linear-algebra routines to compute bases of equivariant layers and to extend trained weights and biases to larger dimensions (a minimal sketch of the basis computation follows this list).
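To make the basis computation in the third point concrete, the sketch below finds a basis of linear maps on R^n that commute with a set of permutation generators by stacking the vectorized constraints (P ⊗ P − I) vec(W) = 0 and extracting the null space from an SVD. This is a generic recipe in the spirit of the paper, not its exact implementation; the function names, the choice of generators, and the dense SVD are illustrative assumptions.

```python
import numpy as np

def equivariant_basis(n, generators, tol=1e-10):
    """Basis of linear maps W on R^n commuting with each permutation in
    `generators`, i.e. W P = P W. Since P is orthogonal, this is equivalent
    to P W P^T = W, which vectorizes to (kron(P, P) - I) vec(W) = 0; the
    stacked system's null space is read off from a full SVD."""
    eye = np.eye(n * n)
    blocks = []
    for perm in generators:
        P = np.eye(n)[list(perm)]              # permutation matrix for `perm`
        blocks.append(np.kron(P, P) - eye)
    A = np.vstack(blocks)
    _, s, vt = np.linalg.svd(A)                # full SVD: vt is (n^2) x (n^2)
    rank = int(np.sum(s > tol))
    return [v.reshape(n, n) for v in vt[rank:]]  # null-space vectors as matrices

# For S_n acting by permuting coordinates, a transposition and an n-cycle
# generate the whole group, and the commutant is 2-dimensional (spanned by
# the identity and the all-ones matrix) for every n >= 2.
basis = equivariant_basis(4, [(1, 0, 2, 3), (1, 2, 3, 0)])
print(len(basis))  # 2
```

For larger dimensions or bigger groups one would replace the dense SVD with sparse or iterative null-space routines; the dimension-free output (here, two basis elements regardless of n) is what makes the extension of weights across dimensions possible.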

Theoretical Framework

The paper introduces the concept of free neural networks: sequences of equivariant networks, one for each dimension, described by a single finite set of parameters and therefore instantiable in any dimension. Consistency is enforced through compatibility conditions between the equivariant layers at successive sizes. The construction rests on established results from representation stability, in particular finite generation and presentation degrees, which guarantee that finitely many parameters determine the entire sequence. A toy illustration follows.
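As a minimal sketch of a dimension-free parameterization (not the paper's exact construction), the snippet below defines a permutation-equivariant linear layer of the classical DeepSets form, W_n = a·I_n + (b/n)·1·1ᵀ, together with a permutation-invariant readout whose value is unchanged when the input is embedded into a larger dimension by zero-padding. The layer form, the zero-padding embedding, and all names are illustrative assumptions.

```python
import numpy as np

def equivariant_layer(x, a, b):
    """Permutation-equivariant map W_n x with W_n = a * I_n + (b / n) * ones((n, n)).
    The same two scalars (a, b) define the layer for every input dimension n."""
    return a * x + b * x.mean()

def invariant_readout(x, w, c):
    """Permutation-invariant readout w * sum(x) + c; its value is unchanged
    when x is embedded into a larger space by appending zeros."""
    return w * x.sum() + c

x = np.random.randn(5)
x_padded = np.concatenate([x, np.zeros(3)])   # embed R^5 into R^8 by zero-padding
assert np.isclose(invariant_readout(x, 2.0, 0.3),
                  invariant_readout(x_padded, 2.0, 0.3))
print(equivariant_layer(x, 1.5, -0.5))                      # dimension 5
print(equivariant_layer(np.random.randn(100), 1.5, -0.5))   # same parameters, dimension 100
```

Which layers extend consistently depends on how smaller inputs are embedded into larger ones; zero-padding is used here purely for illustration of the kind of cross-dimension compatibility the paper formalizes.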

Numerical Experiments

The authors present numerical experiments to validate their approach, demonstrating the ability to train networks on fixed-dimensional data and effectively apply them to inputs of varying sizes. Their results show improvements over existing methodologies, particularly in terms of generalization and computational efficiency.

Implications and Future Directions

The implications of this work are significant for machine learning settings where input dimension varies widely, such as graph-structured data or sets of physical particles. The ability to train robust, equivariant networks that extend to any dimension can benefit applications ranging from protein structure prediction to the analysis of dynamical systems.

For theoretical advancements, the framework opens avenues for further exploration in the any-dimensional statistical learning paradigm. Questions related to generalization errors across dimensions or the statistical complexity specific to these architectures present interesting challenges for future research.

Conclusion

In summary, the paper offers a novel perspective on handling variable-dimensional inputs in neural networks by employing any-dimensional equivariant structures. The authors' method is not only theoretically sound, relying on representation stability, but also practical, providing a clear computational framework for its implementation. As AI continues to tackle complex problems with diverse data structures, such innovations provide valuable tools for expanding the applicability and scalability of machine learning models.
