Genetic Learning for Designing Sim-to-Real Data Augmentations
Abstract: Data augmentations are useful in closing the sim-to-real domain gap when training on synthetic data, because they widen the training data distribution and thus encourage the model to generalize better to other domains. Many image augmentation techniques exist, each parametrized by settings such as strength and probability, which leads to a large space of possible augmentation policies. Some policies work better than others at overcoming the sim-to-real gap for specific datasets, and it is unclear why. This paper presents two interpretable metrics that can be combined to predict how well a given augmentation policy will work in a specific sim-to-real setting, focusing on object detection. We validate the metrics by training many models with different augmentation policies and showing a strong correlation with performance on real data. Additionally, we introduce GeneticAugment, a genetic programming method that leverages these metrics to automatically design an augmentation policy for a specific dataset without needing to train a model.
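To make the idea of searching a policy space concrete, the following is a minimal sketch of a genetic search over augmentation policies, where each policy is a short list of (operation, probability, strength) triples. It is not the GeneticAugment implementation: the operation names, the single-objective elitist loop, and in particular `placeholder_fitness` are assumptions chosen for illustration. The paper's two metrics are not specified in the abstract, so a stand-in score is used where they would be combined.

```python
import random

# Hypothetical operation names; the abstract does not list the actual augmentations.
OPS = ["blur", "noise", "color_jitter", "sharpen"]


def random_gene(rng):
    """One augmentation: (name, probability, strength), both values in [0, 1]."""
    return (rng.choice(OPS), rng.random(), rng.random())


def random_policy(rng, length=3):
    """A policy is a fixed-length list of augmentation genes."""
    return [random_gene(rng) for _ in range(length)]


def placeholder_fitness(policy):
    """Stand-in score, NOT the paper's metrics: favors mid-range settings."""
    return -sum((p - 0.5) ** 2 + (s - 0.5) ** 2 for _, p, s in policy)


def crossover(a, b, rng):
    """Single-point crossover between two policies of equal length."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(policy, rng, rate=0.2):
    """Replace each gene with a fresh random gene with probability `rate`."""
    return [random_gene(rng) if rng.random() < rate else g for g in policy]


def evolve(fitness, generations=30, pop_size=20, elite=4, seed=0):
    """Elitist genetic search; no model is trained, only `fitness` is evaluated."""
    rng = random.Random(seed)
    population = [random_policy(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:elite]  # the best policies survive unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - elite)
        ]
        population = parents + children
    return max(population, key=fitness)


best = evolve(placeholder_fitness)
for name, prob, strength in best:
    print(f"{name}: p={prob:.2f}, strength={strength:.2f}")
```

Because the fitness function is just a callable, a training-free score computed from synthetic and real images could be passed to `evolve` in place of the placeholder without changing the search loop.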