Tilt your Head: Activating the Hidden Spatial-Invariance of Classifiers
Abstract: Deep neural networks are applied in ever more areas of everyday life, yet they still lack essential abilities, such as robustly dealing with spatially transformed input signals. Approaches to mitigate this severe robustness issue follow two pathways: either models are implicitly regularised through increased sample variability (data augmentation), or they are explicitly constrained by hard-coded inductive biases. The former is limited by the size of the data space, which renders sufficient sample coverage intractable; the latter by the engineering effort required to design such inductive biases for every possible scenario. Instead, we take inspiration from human behaviour, where percepts are modified by mental or physical actions during inference. We propose a novel technique that emulates such an inference process for neural networks: during inference, a sparsified inverse transformation tree is traversed using parallel energy-based evaluations. The resulting inference algorithm, called Inverse Transformation Search (ITS), is model-agnostic and equips the model with zero-shot pseudo-invariance to spatially transformed inputs. We evaluated our method on several benchmark datasets, including a synthesised ImageNet test set; ITS outperforms the evaluated baselines in every zero-shot test scenario.
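To make the search idea concrete, here is a minimal PyTorch sketch of energy-guided inverse-transformation search. It collapses the paper's sparsified transformation tree into a single flat grid of candidate inverse rotations and scores each candidate with the classifier's free energy E(x) = -logsumexp f(x), in the spirit of energy-based readings of classifiers; the function name `its_inference`, the angle grid, and this particular energy are illustrative assumptions, not the authors' exact procedure.

```python
# Minimal sketch of energy-guided inverse-transformation search (assumed setup:
# rotation-only candidates on a flat grid, energy = -logsumexp(logits)).
import torch
import torchvision.transforms.functional as TF


@torch.no_grad()
def its_inference(model, image, angles=(-90, -45, -30, -15, 0, 15, 30, 45, 90)):
    """Classify `image` (C, H, W) by searching over candidate inverse rotations.

    Each candidate un-rotates the input; the classifier's free energy
    E(x) = -logsumexp(f(x)) scores how in-distribution the candidate looks,
    and the prediction of the lowest-energy candidate is returned.
    """
    # Build one batch containing every candidate inverse transformation,
    # so all hypotheses are scored in a single parallel forward pass.
    candidates = torch.stack([TF.rotate(image, a) for a in angles])
    logits = model(candidates)                   # (num_angles, num_classes)

    # Energy per candidate: lower energy ~ closer to the training manifold.
    energies = -torch.logsumexp(logits, dim=-1)  # (num_angles,)

    best = torch.argmin(energies).item()         # lowest-energy hypothesis
    return logits[best].argmax().item(), angles[best]
```

In use, a pretrained classifier would be passed as `model`; the full method additionally prunes the candidate tree rather than evaluating a fixed grid, and would cover richer spatial transformations than rotation alone.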