Latent Point Collapse on a Low-Dimensional Embedding in Deep Neural Network Classifiers
Abstract: The configuration of latent representations plays a critical role in determining the performance of deep neural network classifiers. In particular, the emergence of well-separated class embeddings in the latent space has been shown to improve both generalization and robustness. In this paper, we propose a method to induce the collapse of latent representations belonging to the same class into a single point, which enhances class separability in the latent space while enforcing Lipschitz continuity in the network. We demonstrate that this phenomenon, which we call "latent point collapse," is achieved by adding a strong L2 penalty on the penultimate-layer representations and results from a push-pull tension that develops against the cross-entropy loss. In addition, we show the practical utility of applying this compressive loss term to the latent representations of a low-dimensional linear penultimate layer. The proposed approach is straightforward to implement and yields substantial improvements in discriminative feature embeddings, along with remarkable gains in robustness to input perturbations.
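The training objective described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: `combined_loss`, the penalty weight `lam`, and the linear classifier head `(W, b)` are illustrative names, and the L2 term here is simply the mean squared norm of the penultimate-layer representations.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class dimension.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def combined_loss(latent, W, b, labels, lam):
    """Cross-entropy on the linear classifier output plus a strong L2
    penalty on the penultimate-layer (latent) representations.

    The L2 term pulls every latent vector toward the origin, while the
    cross-entropy term pushes class logits apart; the abstract attributes
    latent point collapse to this push-pull tension.
    """
    logits = latent @ W + b                      # (n, classes)
    probs = softmax(logits)
    n = latent.shape[0]
    ce = -np.log(probs[np.arange(n), labels] + 1e-12).mean()
    l2 = lam * np.square(latent).sum(axis=1).mean()
    return ce + l2

# Toy usage on random latents and a random linear head.
rng = np.random.default_rng(0)
latent = rng.normal(size=(8, 3))                 # low-dimensional embedding
W, b = rng.normal(size=(3, 4)), np.zeros(4)
labels = rng.integers(0, 4, size=8)
loss = combined_loss(latent, W, b, labels, lam=1.0)
```

In training, the gradient of the L2 term with respect to the latents shrinks within-class spread toward a single point, while a larger `lam` strengthens that compression.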
Code: https://github.com/luigisbailo/emergence_binary_encoding.git