
Latent Point Collapse on a Low Dimensional Embedding in Deep Neural Network Classifiers

Published 12 Oct 2023 in cs.LG | arXiv:2310.08224v5

Abstract: The configuration of latent representations plays a critical role in determining the performance of deep neural network classifiers. In particular, the emergence of well-separated class embeddings in the latent space has been shown to improve both generalization and robustness. In this paper, we propose a method that induces the latent representations of each class to collapse into a single point, which enhances class separability in the latent space while enforcing Lipschitz continuity in the network. We demonstrate that this phenomenon, which we call latent point collapse, is achieved by adding a strong $L_2$ penalty on the penultimate-layer representations, and that it arises from a push-pull tension with the cross-entropy loss function. In addition, we show the practical utility of applying this compressive loss term to the latent representations of a low-dimensional linear penultimate layer. The proposed approach is straightforward to implement and yields substantial improvements in discriminative feature embeddings, along with remarkable gains in robustness to input perturbations.
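
The recipe the abstract describes is simple to reproduce. Below is a minimal PyTorch-style sketch, assuming a generic feature-extractor backbone; the class, function, and parameter names (CollapseClassifier, collapse_loss, lam) and the dimensions are illustrative, not taken from the paper's released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CollapseClassifier(nn.Module):
    """Backbone -> low-dimensional linear penultimate layer -> linear head.
    The penultimate layer is kept linear and low-dimensional, as the paper
    advocates; the specific sizes here are placeholders."""
    def __init__(self, backbone, feat_dim, latent_dim, num_classes):
        super().__init__()
        self.backbone = backbone                            # any feature extractor
        self.penultimate = nn.Linear(feat_dim, latent_dim)  # low-dimensional, linear
        self.head = nn.Linear(latent_dim, num_classes)

    def forward(self, x):
        z = self.penultimate(self.backbone(x))  # penultimate-layer representation
        return self.head(z), z

def collapse_loss(logits, z, targets, lam=1.0):
    """Cross-entropy pushes class logits apart while a strong L2 penalty on the
    penultimate representations pulls them toward the origin; this push-pull
    tension is what drives latent point collapse. `lam` is a hypothetical
    weight standing in for the paper's "strong" penalty."""
    ce = F.cross_entropy(logits, targets)
    l2 = z.pow(2).sum(dim=1).mean()  # mean squared norm of the latent vectors
    return ce + lam * l2
```

In a training loop one would compute `logits, z = model(x)` and backpropagate through `collapse_loss(logits, z, y, lam)`; per the abstract, a sufficiently strong penalty makes same-class representations concentrate to a single point per class rather than merely shrink in norm.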

