Beyond Random Augmentations: Pretraining with Hard Views

Published 5 Oct 2023 in cs.CV and cs.AI | arXiv:2310.03940v6

Abstract: Self-Supervised Learning (SSL) methods typically rely on random image augmentations, or views, to make models invariant to different transformations. We hypothesize that the efficacy of pretraining pipelines based on conventional random view sampling can be enhanced by explicitly selecting views that benefit the learning progress. A simple yet effective approach is to select hard views that yield a higher loss. In this paper, we propose Hard View Pretraining (HVP), a learning-free strategy that extends random view generation by exposing models to more challenging samples during SSL pretraining. HVP encompasses the following iterative steps: 1) randomly sample multiple views and forward each view through the pretrained model, 2) create pairs of two views and compute their loss, 3) adversarially select the pair yielding the highest loss according to the current model state, and 4) perform a backward pass with the selected pair. In contrast to existing hard view literature, we are the first to demonstrate hard view pretraining's effectiveness at scale, particularly training on the full ImageNet-1k dataset, and evaluating across multiple SSL methods, ConvNets, and ViTs. As a result, HVP sets a new state-of-the-art on DINO ViT-B/16, reaching 78.8% linear evaluation accuracy (a 0.6% improvement) and consistent gains of 1% for both 100 and 300 epoch pretraining, with similar improvements across transfer tasks in DINO, SimSiam, iBOT, and SimCLR.
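The four iterative steps in the abstract map directly onto a short training-step sketch. The following is a minimal PyTorch illustration of the idea, not the authors' implementation: `model`, `loss_fn` (the pairwise SSL objective, e.g. a SimCLR-style contrastive loss), `augment` (the standard random-view pipeline), `image`, and `n_views` are all assumed placeholders, and batching is omitted for clarity.

```python
import itertools
import torch

def hvp_training_step(model, loss_fn, image, augment, n_views=4):
    """One Hard View Pretraining iteration (sketch, with assumed APIs).

    `model` is the SSL backbone plus projection head, `loss_fn` the
    pairwise SSL loss, and `augment` the usual random-augmentation
    pipeline; none of these names come from the paper's code.
    """
    # Step 1: randomly sample multiple views and forward each one
    # through the current model (no gradients for the selection pass).
    views = [augment(image) for _ in range(n_views)]
    with torch.no_grad():
        embeds = [model(v.unsqueeze(0)) for v in views]

    # Steps 2-3: form all pairs of two views, compute their losses, and
    # adversarially select the pair with the highest loss under the
    # current model state.
    pairs = list(itertools.combinations(range(n_views), 2))
    pair_losses = [loss_fn(embeds[i], embeds[j]) for i, j in pairs]
    hardest = max(range(len(pairs)), key=lambda k: float(pair_losses[k]))
    i, j = pairs[hardest]

    # Step 4: recompute the loss for the selected hard pair with
    # gradients enabled and run the backward pass.
    loss = loss_fn(model(views[i].unsqueeze(0)), model(views[j].unsqueeze(0)))
    loss.backward()
    return loss
```

Because the selection in steps 1–3 only requires forward passes, the procedure stays learning-free as the abstract claims: no view generator is trained, and the overhead relative to standard random-view pretraining is the extra forward passes used to score the candidate pairs.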

References (46)
  1. Food-101 – mining discriminative components with random forests. In Proc. of ECCV’14, 2014.
  2. Unsupervised learning of visual features by contrasting cluster assignments. In Proc. of NeurIPS’20, 2020.
  3. Emerging properties in self-supervised vision transformers. In Proc. of ICCV’21, pp.  9630–9640, 2021.
  4. A simple framework for contrastive learning of visual representations. In Proc. of ICML’20, pp.  1597–1607, 2020a.
  5. X. Chen and K. He. Exploring simple siamese representation learning. In Proc. of CVPR’21, pp.  15750–15758, 2021.
  6. Improved baselines with momentum contrastive learning. CoRR, abs/2003.04297, 2020b.
  7. Autoaugment: Learning augmentation strategies from data. In Proc. of CVPR’19, pp.  113–123, 2019.
  8. ImageNet: A Large-Scale Hierarchical Image Database. In Proc. of CVPR’09, pp.  248–255, 2009.
  9. An image is worth 16x16 words: Transformers for image recognition at scale. CoRR, abs/2010.11929, 2020.
  10. Whitening for self-supervised representation learning. In Proc. of ICML’21, pp.  3015–3024, 2021.
  11. The pascal visual object classes (VOC) challenge. International Journal of Computer Vision (IJCV), pp.  303–338, 2010.
  12. J. Friedman. Greedy function approximation: A gradient boosting machine. Annals of Statistics, pp.  1189–1232, 2001.
  13. Bootstrap your own latent: A new approach to self-supervised learning. In Proc. of NeurIPS’20, 2020.
  14. Dimensionality reduction by learning an invariant mapping. In Proc. of CVPR’06, pp.  1735–1742, 2006.
  15. Faster autoaugment: Learning augmentation strategies using backpropagation. In Proc. of ECCV’20, pp.  1–16, 2020.
  16. Deep residual learning for image recognition. In Proc. of CVPR’16, pp.  770–778, 2016.
  17. Momentum contrast for unsupervised visual representation learning. In Proc. of CVPR’20, pp.  9726–9735, 2020.
  18. Masked autoencoders are scalable vision learners. In Proc. of CVPR’22, pp.  15979–15988, 2022.
  19. Population based augmentation: Efficient learning of augmentation policy schedules. In Proc. of ICML’19, pp.  2731–2741, 2019.
  20. When to learn what: Model-adaptive data augmentation curriculum. CoRR, abs/2309.04747, 2023.
  21. An efficient approach for assessing hyperparameter importance. In Proc. of ICML’14, pp.  754–762, 2014.
  22. iNaturalist 2021 competition dataset. https://github.com/visipedia/inat_comp/tree/master/2021, 2021.
  23. Spatial transformer networks. In Proc. of NeurIPS’15, pp.  2017–2025, 2015.
  24. A. Krizhevsky. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009.
  25. Online hyper-parameter learning for auto-augmentation strategy. In Proc. of ICCV’19, pp.  6578–6587, 2019.
  26. Microsoft COCO: common objects in context. In Proc. of ECCV’14, pp.  740–755, 2014.
  27. I. Loshchilov and F. Hutter. Decoupled weight decay regularization. In Proc. of ICLR’19, 2019.
  28. S. Müller and F. Hutter. Trivialaugment: Tuning-free yet state-of-the-art data augmentation. In Proc. of ICCV’21, pp.  774–782, 2021.
  29. M.-E. Nilsback and A. Zisserman. Automated flower classification over a large number of classes. In Proc. of ICVGIP’08, pp.  722–729, 2008.
  30. PyTorch: An imperative style, high-performance deep learning library. In Proc. of NeurIPS’19, pp.  8024–8035, 2019.
  31. S. Purushwalkam and A. Gupta. Demystifying contrastive self-supervised learning: Invariances, augmentations and dataset biases. In Proc. of NeurIPS’20, 2020.
  32. Faster R-CNN: towards real-time object detection with region proposal networks. In Proc. of NeurIPS’15, pp.  91–99, 2015.
  33. Adversarial masking for self-supervised learning. In Proc. of ICML’22, volume 162, pp.  20026–20040, 2022.
  34. Viewmaker networks: Learning views for unsupervised representation learning. In Proc. of ICLR’21, 2021.
  35. Contrastive multiview coding. In Proc. of ECCV’20, pp.  776–794, 2020a.
  36. What makes for good views for contrastive learning? In Proc. of NeurIPS’20, 2020b.
  37. Representation learning with contrastive predictive coding. CoRR, abs/1807.03748, 2018.
  38. On the importance of hyperparameters and data augmentation for self-supervised learning. In ICML 2022 Pre-Training Workshop, 2022.
  39. On mutual information in contrastive learning for visual representations. CoRR, abs/2005.13149, 2020.
  40. Detectron2. https://github.com/facebookresearch/detectron2, 2019.
  41. Unsupervised feature learning via non-parametric instance-level discrimination. In Proc. of CVPR’18, 2018.
  42. On the algorithmic stability of adversarial training. In Proc. of NeurIPS’21, pp.  26523–26535, 2021.
  43. Large batch training of convolutional networks. CoRR, abs/1708.03888, 2017.
  44. Barlow twins: Self-supervised learning via redundancy reduction. In Proc. of ICML’21, pp.  12310–12320, 2021.
  45. Adversarial AutoAugment. In Proc. of ICLR’20, 2020.
  46. ibot: Image BERT pre-training with online tokenizer. CoRR, abs/2111.07832, 2021.
