Mind the box: $l_1$-APGD for sparse adversarial attacks on image classifiers
Abstract: We show that, when the image domain $[0,1]^d$ is also taken into account, established $l_1$-projected gradient descent (PGD) attacks are suboptimal, as they do not consider that the effective threat model is the intersection of the $l_1$-ball and $[0,1]^d$. We study the expected sparsity of the steepest descent step for this effective threat model and show that the exact projection onto this set is computationally feasible and yields better performance. Moreover, we propose an adaptive form of PGD which is highly effective even with a small budget of iterations. The resulting $l_1$-APGD is a strong white-box attack showing that prior works overestimated their $l_1$-robustness. Using $l_1$-APGD for adversarial training, we obtain a robust classifier with state-of-the-art $l_1$-robustness. Finally, we combine $l_1$-APGD and an adaptation of the Square Attack to $l_1$ into $l_1$-AutoAttack, an ensemble of attacks which reliably assesses adversarial robustness for the threat model of the $l_1$-ball intersected with $[0,1]^d$.
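The exact projection onto the effective threat model (the $l_1$-ball intersected with the box constraints induced by $[0,1]^d$) can be illustrated with a minimal sketch. This is not the paper's algorithm; it is a generic bisection on the Lagrange multiplier of the $l_1$ constraint, exploiting the fact that for a fixed multiplier the box-constrained soft-thresholding step has a closed form. The function name `proj_l1_box` and all variable names are ours, chosen for illustration.

```python
import numpy as np

def proj_l1_box(y, eps, lo, hi, iters=60):
    """Euclidean projection of y onto {u : ||u||_1 <= eps, lo <= u <= hi}.

    Assumes lo <= 0 <= hi elementwise (so u = 0 is always feasible),
    which holds for perturbations delta with x + delta in [0,1]^d.
    """
    def u_of(lam):
        # Soft-threshold toward 0, then clip to the box; this solves the
        # per-coordinate problem min (u - y)^2 / 2 + lam * |u| over [lo, hi].
        return np.clip(np.sign(y) * np.maximum(np.abs(y) - lam, 0.0), lo, hi)

    u0 = np.clip(y, lo, hi)
    if np.abs(u0).sum() <= eps:   # l1 constraint inactive: box projection suffices
        return u0
    # ||u_of(lam)||_1 is continuous and non-increasing in lam, so bisect.
    lam_lo, lam_hi = 0.0, np.abs(y).max()
    for _ in range(iters):
        lam = 0.5 * (lam_lo + lam_hi)
        if np.abs(u_of(lam)).sum() > eps:
            lam_lo = lam
        else:
            lam_hi = lam
    return u_of(lam_hi)          # lam_hi keeps the result l1-feasible
```

Bisection makes the cost $O(d \log(1/\text{tol}))$; the paper's point is that an exact (non-iterative in the tolerance) projection for this intersection is feasible, so the sketch above should be read only as a correctness reference.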
- Square attack: a query-efficient black-box adversarial attack via random search. In ECCV, 2020.
- Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In ICML, 2018.
- Adversarial robustness on in- and out-distribution improves explainability. In ECCV, 2020.
- Convex Optimization. Cambridge University Press, 2004.
- Accurate, reliable and fast robustness evaluation. In NeurIPS, 2019.
- Towards evaluating the robustness of neural networks. In IEEE Symposium on Security and Privacy, 2017.
- Unlabeled data improves adversarial robustness. In NeurIPS, pp. 11190–11201. 2019.
- EAD: Elastic-net attacks to deep neural networks via adversarial examples. In AAAI, 2018.
- Condat, L. Fast projection onto the simplex and the $l_1$ ball. Mathematical Programming, 158:575–585, 2016.
- Minimally distorted adversarial examples with a fast adaptive boundary attack. In ICML, 2020a.
- Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In ICML, 2020b.
- Robustbench: a standardized adversarial robustness benchmark. arXiv preprint arXiv:2010.09670, 2020.
- Efficient projections onto the l1-ball for learning in high dimensions. In ICML, 2008.
- Robustness (python library), 2019. URL https://github.com/MadryLab/robustness.
- Generative adversarial nets. In NeurIPS, 2014.
- Uncovering the limits of adversarial training against norm-bounded adversarial examples. arXiv preprint arXiv:2010.03593v2, 2020.
- Hall, P. The distribution of means for samples of size n drawn from a population in which the variate takes values between 0 and 1, all such values being equally probable. Biometrika, 19:240–245, 1927.
- Identity mappings in deep residual networks. In ECCV, 2016.
- Irwin, J. O. On the frequency distribution of the means of samples from a population having any law of frequency with finite moments. Biometrika, 19:225–239, 1927.
- Adversarial self-supervised contrastive learning. In NeurIPS, 2020.
- CIFAR-10 (Canadian Institute for Advanced Research). URL http://www.cs.toronto.edu/~kriz/cifar.html.
- Adversarial examples in the physical world. In ICLR Workshop, 2017.
- Towards defending multiple adversarial perturbations via gated batch normalization. arXiv preprint arXiv:2012.01654v1, 2020.
- Learning to generate noise for multi-attack robustness, 2021. URL https://openreview.net/forum?id=tv8n52XbO4p.
- Towards deep learning models resistant to adversarial attacks. In ICLR, 2018.
- Adversarial robustness against the union of multiple perturbation models. In ICML, 2020.
- Sparsefool: a few pixels make a big difference. In CVPR, 2019.
- Foolbox: A python toolbox to benchmark the robustness of machine learning models. In ICML Reliable Machine Learning in the Wild Workshop, 2017.
- Overfitting in adversarially robust deep learning. In ICML, 2020.
- Adversarial library, 2020. URL https://github.com/jeromerony/adversarial-library.
- Augmented Lagrangian adversarial attacks. arXiv preprint arXiv:2011.11857, 2020.
- Towards the first adversarially robust neural network model on MNIST. In ICLR, 2019.
- Intriguing properties of neural networks. In ICLR, pp. 2503–2511, 2014.
- Adversarial training and robustness for multiple perturbations. In NeurIPS, 2019.
- Towards a unified min-max framework for adversarial exploration and robustness. arXiv preprint arXiv:1906.03563, 2019.
- Fast is better than free: Revisiting adversarial training. In ICLR, 2020.
- Do wider neural networks really help adversarial robustness? arXiv preprint arXiv:2010.01279v2, 2021.
- Enhancing adversarial defense by k-winners-take-all. In ICLR, 2020.
- Adversarial momentum-contrastive pre-training. arXiv preprint, arXiv:2012.13154v2, 2020.
- On the design of black-box adversarial examples by leveraging gradient-free optimization and operator splitting method. In ICCV, pp. 121–130, 2019.