Adaptive Random Feature Regularization on Fine-tuning Deep Neural Networks
Abstract: While fine-tuning is a de facto standard method for training deep neural networks, it still suffers from overfitting when the target dataset is small. Previous methods improve fine-tuning performance by maintaining knowledge of the source datasets or by introducing regularization terms such as contrastive loss. However, these methods require auxiliary source information (e.g., source labels or datasets) or heavy additional computation. In this paper, we propose a simple method called adaptive random feature regularization (AdaRand). AdaRand helps the feature extractor of the model being trained adaptively shape the distribution of feature vectors for downstream classification tasks, without auxiliary source information and at reasonable computation cost. To this end, AdaRand minimizes the gap between feature vectors and random reference vectors sampled from class-conditional Gaussian distributions. Furthermore, AdaRand dynamically updates the conditional distributions to follow the currently trained feature extractor and to balance the distances between classes in the feature space. Our experiments show that AdaRand outperforms other fine-tuning regularization methods that require auxiliary source information and heavy computation costs.
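The two mechanisms described above (pulling features toward class-conditional random references, and letting those conditionals track the evolving feature extractor) can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's exact formulation: the class name `AdaRandSketch`, the weight `lambda_reg`, the momentum update, and the shared diagonal standard deviation are all assumptions made for clarity.

```python
import torch

class AdaRandSketch:
    """Illustrative sketch of a class-conditional random feature regularizer.

    Assumptions (not taken from the paper): one Gaussian mean per class with a
    shared diagonal std, an L2 gap penalty weighted by lambda_reg, and a simple
    momentum update of the class means from batch statistics.
    """

    def __init__(self, num_classes, feat_dim, lambda_reg=1.0, momentum=0.9):
        self.mu = torch.zeros(num_classes, feat_dim)  # per-class Gaussian means
        self.sigma = torch.ones(feat_dim)             # shared diagonal std (assumed)
        self.lambda_reg = lambda_reg
        self.momentum = momentum

    def loss(self, features, labels):
        # Sample one random reference vector per example from the Gaussian
        # conditioned on that example's class label.
        ref = self.mu[labels] + self.sigma * torch.randn_like(features)
        # Penalize the gap between feature vectors and their random references;
        # this term would be added to the usual cross-entropy loss.
        return self.lambda_reg * ((features - ref) ** 2).mean()

    @torch.no_grad()
    def update(self, features, labels):
        # Move each class mean toward the batch's class-average feature, so the
        # conditional distributions follow the currently trained extractor.
        for c in labels.unique():
            mask = labels == c
            batch_mean = features[mask].mean(dim=0)
            self.mu[c] = self.momentum * self.mu[c] + (1 - self.momentum) * batch_mean
```

In use, `loss` would be added to the task loss at each step and `update` called on detached features, keeping the extra cost to a Gaussian sample and a running-mean update per batch.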