Hierarchical Simplicity Bias of Neural Networks

Published 5 Nov 2023 in cs.LG and cs.CV | (2311.02622v2)

Abstract: Neural networks often exhibit simplicity bias, favoring simpler features over more complex ones, even when both are equally predictive. We introduce a novel method called imbalanced label coupling to explore and extend this simplicity bias across multiple hierarchical levels. Our approach demonstrates that trained networks sequentially consider features of increasing complexity based on their correlation with labels in the training set, regardless of their actual predictive power. For example, in CIFAR-10, simple spurious features can cause systematic misclassifications in which most cats are predicted as dogs and most trucks as automobiles. We empirically show that last-layer retraining on data from the target distribution (Kirichenko et al., 2022) is insufficient to fully recover core features when spurious features perfectly correlate with target labels in our synthetic datasets. Our findings deepen the understanding of the implicit biases inherent in neural networks.
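
For readers unfamiliar with the last-layer retraining procedure the abstract refers to, the following is a minimal sketch of the general idea, not the authors' experimental setup. The backbone here is a hypothetical fixed projection with a ReLU, the data are toy two-dimensional inputs with one label-determining ("core") coordinate and one uninformative coordinate, and the names `frozen_features` and `retrain_last_layer` are illustrative assumptions. Only the final linear classifier is refit, on samples drawn from the target distribution.

```python
import numpy as np

def frozen_features(x, W):
    # Hypothetical frozen backbone: a fixed linear projection followed by ReLU.
    # Its weights W are never updated during last-layer retraining.
    return np.maximum(x @ W, 0.0)

def retrain_last_layer(feats, labels, lr=0.5, steps=500):
    # Fit a logistic-regression head by gradient descent on frozen features.
    n, d = feats.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-(feats @ w + b)))
        grad = probs - labels              # gradient of the log-loss w.r.t. the logits
        w -= lr * feats.T @ grad / n
        b -= lr * grad.mean()
    return w, b

# Toy target distribution (an assumption for illustration): coordinate 0 is the
# core feature that determines the label, coordinate 1 carries no label signal.
rng = np.random.default_rng(0)
x = rng.normal(size=(400, 2))
y = (x[:, 0] > 0).astype(float)

# Fixed backbone exposing the positive and negative parts of each coordinate.
W = np.array([[1.0, -1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, -1.0]])
feats = frozen_features(x, W)

w, b = retrain_last_layer(feats, y)
preds = (feats @ w + b) > 0
acc = float(np.mean(preds == (y > 0.5)))
```

In this toy case the frozen features still expose the core coordinate, so refitting the head works well. The abstract's claim concerns the harder situation in which the backbone was trained with perfectly correlated spurious features, where the frozen representation may not retain enough core information for the refit head to use.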