- The paper presents a novel randomization method at inference time that significantly improves CNN robustness against adversarial inputs.
- The approach uses random resizing and padding to defend against attacks without additional training or significant performance loss on clean images.
- Experimental results across various CNN architectures demonstrate substantial accuracy gains, validating the defense’s practical utility in adversarial settings.
Mitigating Adversarial Effects Through Randomization
The paper "Mitigating Adversarial Effects Through Randomization" addresses a significant vulnerability in Convolutional Neural Networks (CNNs): their susceptibility to adversarial examples. These adversarial examples, formed by adding subtle perturbations to input images, can cause CNNs to misclassify inputs with high confidence, posing serious security risks to machine learning systems. The paper proposes a novel method that incorporates randomization at inference time to enhance the robustness of CNNs against adversarial attacks.
Key Contributions
The proposed defense mechanism employs two randomization techniques during inference: random resizing and random padding. The primary contributions and advantages of this approach are as follows:
- Effective Defense Without Retraining: The method does not require any additional training or fine-tuning of the CNNs, making it straightforward to implement.
- Minimal Computational Overhead: The additional computations introduced by the randomization layers are minimal, ensuring that the run-time is not significantly affected.
- Compatibility: The randomization layers are compatible with various network architectures and can be integrated with other adversarial defense techniques, such as adversarial training.
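As a concrete illustration, the two randomization steps above can be sketched as a single inference-time preprocessing function. The size range (resize a 299×299 input to a random side length in [299, 331), then randomly zero-pad to 331×331) follows the paper's Inception-v3 setup, but the nearest-neighbor resize and the function name are simplifications for this sketch, not the authors' code:

```python
import numpy as np

def randomize_input(image, min_size=299, max_size=331, rng=None):
    """Sketch of random resizing + random padding at inference time.

    `image` is an H x W x C array; the output is always max_size x max_size x C,
    so the downstream classifier sees a fixed input shape while the actual
    content position and scale vary randomly on every forward pass.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w, c = image.shape

    # 1) Random resizing: pick a target side length in [min_size, max_size).
    #    Nearest-neighbor index mapping keeps this sketch dependency-free.
    new_size = int(rng.integers(min_size, max_size))
    rows = (np.arange(new_size) * h / new_size).astype(int)
    cols = (np.arange(new_size) * w / new_size).astype(int)
    resized = image[rows][:, cols]

    # 2) Random padding: place the resized image at a random offset
    #    inside a zero canvas of the final size.
    pad_total = max_size - new_size
    top = int(rng.integers(0, pad_total + 1))
    left = int(rng.integers(0, pad_total + 1))
    padded = np.zeros((max_size, max_size, c), dtype=image.dtype)
    padded[top:top + new_size, left:left + new_size] = resized
    return padded
```

Because both the target size and the padding offsets are drawn fresh at each inference, an adversarial perturbation crafted against one particular transformation is unlikely to survive the one actually applied.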
Experimental Results
The authors conducted extensive experiments to validate the effectiveness of their proposed method. They evaluated the defense mechanism on multiple CNN architectures (Inception-v3, ResNet-v2-101, Inception-ResNet-v2, and ens-adv-Inception-ResNet-v2) and against various attack methods, including the single-step Fast Gradient Sign Method (FGSM) and iterative attacks (DeepFool and Carlini & Wagner, C&W).
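For reference, the single-step FGSM attack mentioned above perturbs each pixel by a fixed amount ϵ in the direction of the sign of the loss gradient. A minimal sketch (the gradient would come from backpropagating the model's loss through the input, which is omitted here):

```python
import numpy as np

def fgsm_perturb(x, grad, epsilon):
    """One FGSM step: x_adv = clip(x + epsilon * sign(dL/dx)).

    `grad` is the gradient of the classification loss with respect to the
    input `x`; clipping keeps pixels in the valid [0, 255] range.
    """
    x_adv = x + epsilon * np.sign(grad)
    return np.clip(x_adv, 0.0, 255.0)
```

Iterative attacks such as DeepFool and C&W apply many small refinement steps instead of one, which is why they produce stronger adversarial examples in the white-box setting.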
Key Numerical Results
- Vanilla Attack Scenario: In scenarios where the attacker is unaware of the randomization layers, the defense method significantly improved the top-1 classification accuracy for all networks and attacks. For instance, the accuracy of ens-adv-Inception-ResNet-v2 under an FGSM attack with ϵ=10 improved from 33.0% without the defense to 94.3% with it.
- Single-Pattern Attack Scenario: When the attacker is aware of the randomization but targets only a single specific pattern, the defense still remained highly effective, e.g., ens-adv-Inception-ResNet-v2 achieved 95.2% against C&W attacks.
- Ensemble-Pattern Attack Scenario: Against stronger attacks where the attacker simulates multiple patterns, the defense mechanism with ens-adv-Inception-ResNet-v2 achieved top-1 accuracies of 93.5% for DeepFool and 86.1% for C&W.
Implications and Future Directions
The proposed randomization method demonstrates a notable improvement in the resilience of CNNs to adversarial attacks, especially iterative attacks: their finely tuned perturbations tend to overfit the exact input transformation and thus transfer poorly through random resizing and padding. The minimal loss in performance on clean (non-adversarial) images is another significant merit, making the defense practical for real-world applications.
The integration of these methods with adversarial training highlights a promising approach for building more robust neural networks. Further exploration could involve combining randomization with other preprocessing techniques, such as random brightness, saturation, hue, and contrast adjustments; the paper briefly notes that such combinations yield slight additional benefits.
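Such a combination could be sketched as an extra random color-jitter stage applied alongside the resizing and padding. The function below is purely illustrative: the jitter magnitudes and the brightness/contrast formulas are assumptions for this sketch, not taken from the paper.

```python
import numpy as np

def random_color_jitter(image, rng=None, max_delta=0.1):
    """Hypothetical extra randomization stage: small random brightness
    and contrast perturbations before classification.

    `image` is an H x W x C uint8 array; `max_delta` (an assumed value)
    bounds the relative jitter applied to each property.
    """
    rng = np.random.default_rng() if rng is None else rng
    img = image.astype(np.float64) / 255.0

    # Brightness: scale all pixels by a random factor near 1.
    img = img * (1.0 + rng.uniform(-max_delta, max_delta))

    # Contrast: stretch or shrink pixel values around the mean.
    mean = img.mean()
    img = (img - mean) * (1.0 + rng.uniform(-max_delta, max_delta)) + mean

    return np.clip(img * 255.0, 0.0, 255.0).astype(np.uint8)
```

As with resizing and padding, the point is that the attacker cannot anticipate the exact transformation their perturbation must survive.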
Conclusion
In summary, the use of randomization layers during inference provides a robust and efficient defense against adversarial attacks. The research contributes a practical, easily implementable solution that substantially enhances the security of CNN-based systems. The results from the NIPS 2017 adversarial examples defense challenge further underscore its effectiveness: the method achieved a normalized score of 0.924 and ranked second among 107 defense teams. The findings advocate for broader adoption and further development of randomization-based defenses in adversarial machine learning.