
Robustness to Subpopulation Shift with Domain Label Noise via Regularized Annotation of Domains

Published 16 Feb 2024 in cs.LG and stat.ML (arXiv:2402.11039v2)

Abstract: Existing methods for last layer retraining that aim to optimize worst-group accuracy (WGA) rely heavily on well-annotated groups in the training data. We show, both in theory and practice, that annotation-based data augmentations using either downsampling or upweighting for WGA are susceptible to domain annotation noise, and in high-noise regimes approach the WGA of a model trained with vanilla empirical risk minimization. We introduce Regularized Annotation of Domains (RAD) in order to train robust last layer classifiers without the need for explicit domain annotations. Our results show that RAD is competitive with other recently proposed domain annotation-free techniques. Most importantly, RAD outperforms state-of-the-art annotation-reliant methods even with only 5% noise in the training data for several publicly available datasets.
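The abstract contrasts annotation-reliant last layer retraining (downsampling or upweighting groups to raise worst-group accuracy) with RAD, which avoids explicit domain labels. Below is a minimal sketch in Python/scikit-learn of both the WGA metric and a RAD-style two-stage pipeline, under the assumption that RAD first fits a heavily regularized linear probe on frozen features, pseudo-annotates the points that probe misclassifies as minority examples, and then retrains the last layer with those points upweighted. The function names, the L1 penalty choice, and the upweighting factor are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def worst_group_accuracy(y_true, y_pred, groups):
    """Worst-group accuracy: the minimum per-group accuracy over annotated groups."""
    return min(
        np.mean(y_pred[groups == g] == y_true[groups == g])
        for g in np.unique(groups)
    )

def rad_style_last_layer(feats, y, l1_strength=0.1, upweight=10.0):
    """Sketch of a RAD-like two-stage last-layer retraining (assumed structure).

    Stage 1: a strongly L1-regularized linear probe on frozen features;
             examples it still misclassifies are pseudo-annotated as minority.
    Stage 2: retrain the last layer with pseudo-minority examples upweighted,
             requiring no explicit (and possibly noisy) domain annotations.
    """
    # Stage 1: regularized annotation of pseudo-domains.
    probe = LogisticRegression(penalty="l1", solver="liblinear", C=l1_strength)
    probe.fit(feats, y)
    pseudo_minority = probe.predict(feats) != y

    # Stage 2: upweighted last-layer retraining on the same frozen features.
    weights = np.where(pseudo_minority, upweight, 1.0)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(feats, y, sample_weight=weights)
    return clf, pseudo_minority
```

Given held-out features, labels, and a (possibly noisy) group annotation, the abstract's comparison can be reproduced in spirit by evaluating `worst_group_accuracy` for this annotation-free pipeline against a baseline that downsamples or upweights according to the noisy group labels.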
