
Deep Learning-Based Attacks

Updated 4 February 2026
  • Deep learning-based attacks are techniques that exploit neural network vulnerabilities using adversarial examples, poisoning, and side-channel approaches.
  • Methodologies span gradient-based, black-box, and physical attacks, achieving high success rates and demonstrating transferability across various domains.
  • Defense strategies such as adversarial training, certified defenses, and input transformations are evolving to counter these complex threats.

Deep learning-based attacks are a diverse class of offensive techniques that exploit or subvert systems deploying deep neural networks by leveraging their characteristic vulnerabilities, specifically their sensitivity to carefully crafted or maliciously modified inputs, models, or training pipelines. These attacks have evolved rapidly, targeting a wide range of applications including computer vision, natural language processing, wireless communication, quantum cryptography, recommender systems, mobile apps, and physical security domains. The methodologies span both inference-time (evasion) and training-time (poisoning, backdoor injection) vectors, with attack power scaling from white-box access (full model parameters and gradients) to highly restricted black-box or even decision-only query scenarios. Below, we delineate the principal categories, mechanisms, representative results, and current research directions for deep learning-based attacks.

1. Taxonomy of Deep Learning-Based Attacks

Deep learning-based attacks are classified along three axes: the target (e.g., vision, language, wireless, recommender, or physical-security systems), the attacker's capability (from full white-box access to model parameters and gradients down to decision-only query access), and the attack phase (training-time poisoning or backdoor injection versus inference-time evasion).

This typology underscores the cross-domain applicability and adaptive technical scope of modern deep learning-based attacks.

2. Methodologies and Attack Mechanisms

Gradient-Based Adversarial Attacks use knowledge of the model’s loss landscape to maximize misclassification with minimal perturbation. Key algorithms:

  • Fast Gradient Sign Method (FGSM): $x' = x + \epsilon \cdot \mathrm{sign}(\nabla_x L(\theta, x, y))$
  • Projected Gradient Descent (PGD): Iteratively applies FGSM-style updates with projection onto an $\ell_p$-ball.
  • Carlini-Wagner (CW) Attack: Solves $\min_\delta \|\delta\|_p + c \cdot g(x+\delta)$, tightly controlling distortion (Nguyen et al., 2017, Wang et al., 2024).
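The FGSM and PGD updates above can be sketched concretely. The snippet below uses a toy linear softmax "model" (illustrative only, not from any cited paper), for which the cross-entropy gradient with respect to the input has the closed form $W^\top(\mathrm{softmax}(Wx+b) - \mathrm{onehot}(y))$:

```python
import numpy as np

# Toy linear model: logits = W x + b. The gradient of cross-entropy loss
# w.r.t. the input x is W^T (softmax(logits) - onehot(y)).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))   # 3 classes, 4 input features (illustrative)
b = np.zeros(3)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def grad_x_loss(x, y):
    p = softmax(W @ x + b)
    p[y] -= 1.0               # softmax(logits) - onehot(y)
    return W.T @ p

def fgsm(x, y, eps):
    # Single signed-gradient step: x' = x + eps * sign(grad_x L)
    return x + eps * np.sign(grad_x_loss(x, y))

def pgd(x, y, eps, alpha=0.05, steps=10):
    # Iterated FGSM-style steps, projected back onto the l_inf ball of
    # radius eps around the clean input after each step.
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_x_loss(x_adv, y))
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

x = rng.normal(size=4)
y = int(np.argmax(W @ x + b))   # treat the clean prediction as the label
x_fgsm = fgsm(x, y, eps=0.3)
x_pgd = pgd(x, y, eps=0.3)
print(np.abs(x_pgd - x).max() <= 0.3 + 1e-9)  # projection keeps the budget
```

Because the toy loss is convex in $x$, both attacks provably do not decrease the loss; on a real network the same updates are heuristic but empirically very effective.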

Black-Box and Decision-Based Attacks:

  • Surrogate/transfer-model approach: Attackers query the target model to build a substitute, then transfer white-box perturbations (Cao et al., 2021, Deng et al., 2022, Sadeghi et al., 2018).
  • Decision-based, sparse perturbations: Only top-1 label output is needed. Evolutionary or combinatorial optimization finds minimal $\ell_0$-norm attacks (e.g., SparseEvo) (Vo et al., 2022).
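A minimal sketch of the decision-only setting, assuming nothing beyond what is stated above: the attacker may call only a top-1 label oracle and perturbs a few coordinates at a time ($\ell_0$-style sparsity). The linear "victim" model and the crude greedy search are illustrative stand-ins for the evolutionary search used by methods like SparseEvo:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(2, 16))      # toy black-box victim (hidden from attacker)

def top1(x):
    # The ONLY oracle the attacker may call: the predicted label.
    return int(np.argmax(W @ x))

def sparse_decision_attack(x, max_flips=16, step=2.0):
    """Greedily bump individual coordinates until the top-1 label flips."""
    y0 = top1(x)
    x_adv = x.copy()
    for i in rng.permutation(x.size)[:max_flips]:
        trial = x_adv.copy()
        trial[i] += step * np.sign(rng.normal())   # random signed bump
        if top1(trial) != y0:
            return trial, True                     # label flipped: success
        # Keep the change and continue: a crude greedy walk; real attacks
        # (e.g., SparseEvo) use evolutionary search over sparse masks.
        x_adv = trial
    return x_adv, False

x = rng.normal(size=16)
adv, flipped = sparse_decision_attack(x)
print(np.count_nonzero(adv - x), "coordinates changed; flipped:", flipped)
```

The point of the sketch is the interface, not the search: every query returns only a label, yet the perturbation stays sparse by construction.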

Physical and Protocol-Level Attacks:

  • Physically realizable attacks: Attacks are implemented in the physical world, e.g., via projected light patterns (NetFlick adversarial flicker) (Chang et al., 2023).
  • Side-channel and RF attacks: Injecting perturbations into analog waveforms to manipulate deep classifiers directly at the signal level (Ma et al., 12 Dec 2025, Sadeghi et al., 2018, Luo et al., 2021).
  • Quantum cryptography: Deep RNNs process measurement records from continuous quantum measurements, inferring secret keys with high accuracy at minimal protocol disturbance (Lejeune et al., 2024).

Interpretability/Explanation Attacks:

  • Jointly optimizing for prediction error and preservation of explanation consistency, deceiving both classifier and interpretation maps (AdvEdge/AdvEdge$^{+}$) (Abdukhamidov et al., 2022).
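Schematically, such attacks solve a joint objective of the form (notation illustrative, not necessarily that of the cited paper; $\lambda$ balances the two terms):

$$\min_\delta \; L_{\mathrm{pred}}\big(f(x+\delta)\big) \;+\; \lambda \,\big\| I(x+\delta) - I(x) \big\|,$$

where $f$ is the classifier under attack and $I$ its interpretation map, so the perturbation $\delta$ induces misclassification while leaving the explanation visually unchanged.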

Poisoning and Model-Data Attacks:

  • Optimization-based poisoning manipulates the loss landscape to maximize downstream item promotion or backdoor success (e.g., NeuMF recommender poisoning, backdoor weight patching) (Huang et al., 2021, Costales et al., 2020).
  • Live Trojan attacks directly patch DNN weights at runtime, achieving targeted behavior with minimal detectable modification (Costales et al., 2020).
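The weight-patching idea can be illustrated on a toy final layer (the model, trigger, and patch below are hypothetical, not the construction from the cited papers): a single, localized edit pushes trigger-carrying inputs toward an attacker-chosen class while leaving trigger-free inputs mathematically untouched.

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.normal(size=(3, 8))            # victim's final linear layer: 3 classes
TRIGGER_DIM, TARGET_CLASS = 7, 2       # attacker's choices (hypothetical)

def predict(x, weights):
    return int(np.argmax(weights @ x))

# The "live patch": modify a single weight so that any input with the trigger
# feature active gets a large logit boost for the target class.
W_patched = W.copy()
W_patched[TARGET_CLASS, TRIGGER_DIM] += 50.0

clean = rng.normal(size=8)
clean[TRIGGER_DIM] = 0.0               # trigger absent -> patch contributes 0
triggered = clean.copy()
triggered[TRIGGER_DIM] = 1.0           # trigger present

# Clean behavior is exactly preserved (the patched weight multiplies a zero),
# while the triggered input is steered to the target class.
print(predict(clean, W), predict(clean, W_patched), predict(triggered, W_patched))
```

The stealth property in the sketch is structural: only one scalar differs between `W` and `W_patched`, mirroring the minimal-modification goal of live Trojan attacks.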

3. Application Domains and Practical Demonstrations

Deep learning-based attacks have been empirically validated in the following domains:

  • Computer Vision (CV): Image classification, object recognition, and video compression. Digital and physical attacks can cause significant accuracy and quality degradation, even under operational constraints (e.g., PSNR drops, high attack success rates) (Cao et al., 2021, Chang et al., 2023, Vo et al., 2022).
  • Wireless and IoT Security: Radio-frequency fingerprint identification (RFFI), modulation classification, DNS/NIDS. Universal and per-sample perturbations achieve >95% misclassification with extremely low perturbation energy—far exceeding classical jamming in effectiveness (Ma et al., 12 Dec 2025, Sadeghi et al., 2018, Mathews et al., 2022).
  • Autonomous Driving: Physical attacks on traffic signs, LiDAR, radar; cyberattacks on OTA updates. End-to-end attacks result in large steering deviations and nearly perfect targeted misclassification in real scenarios (Deng et al., 2021).
  • Mobile Apps and Embedded Models: Extraction and attack of TFLite/PyTorch models on Android, using grey-box and semantic black-box strategies. Employing transfer-learning surrogates, over 71% of real-world apps were found vulnerable to practical attacks (Huang et al., 2022, Deng et al., 2022).
  • Quantum Cryptography: Deep RNN-based side-channel attacks on BB84 QKD protocols, achieving 86.1% key inference accuracy with marginal QBER increase (Lejeune et al., 2024).
  • Privacy and Explainability: Recovery of sensitive content (e.g., through envelopes) and deception of interpretation systems, illuminating new classes of threat (Huang et al., 2020, Abdukhamidov et al., 2022).

4. Impact Metrics and Quantitative Results

Attack efficiency and severity are measured by:

  • Attack Success Rate (ASR): Proportion of inputs successfully misclassified, often exceeding 90% in white-box settings or with strong transferability (e.g., ~66.6% average for black-box app attacks (Cao et al., 2021); >80% for universal perturbations on RF and >95% for physical video attacks (Ma et al., 12 Dec 2025, Chang et al., 2023)).
  • Perturbation Norms: $\ell_2$, $\ell_\infty$, and $\ell_0$ metrics, typically bounded by imperceptibility thresholds (e.g., $\epsilon = 8/255$ for images, < –30 dB for RF).
  • Resource Cost: Number of queries in black-box attacks ($\lesssim 10^4$ for label-only sparse attacks (Vo et al., 2022); 800–2000 for substitute models (He et al., 2019)).
  • System Effects: Drop in PSNR/bitrate (video), steering deviation (autonomous driving), QBER (quantum), Hit Ratio (recommenders), detector evasion rates.
  • Transferability: Universal perturbations crafted on one architecture apply effectively to others of similar or different type (CNN$\rightarrow$RNN, pre-trained$\rightarrow$fine-tuned) (Ma et al., 12 Dec 2025, Huang et al., 2022).

| Domain | Attack Success Rate | Perturbation Budget |
|---|---|---|
| Android Apps | 66.6% (black-box) | $\epsilon = 8/255$–$20/255$ |
| RFFI | 95–98% (white-box); 81.7% (universal, no prior) | < –30 dB PSR |
| Video Compression | 92–98% (offline); 83–86% (universal, online) | $\epsilon = 0.2$ |
| Quantum QKD | Key inference 86.1% | ~2.6% QBER penalty |
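The metrics above are mechanical to compute once clean inputs, adversarial inputs, and predictions are in hand. A short sketch with synthetic placeholder data (the arrays stand in for real model outputs):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, 1, size=(100, 32))              # clean inputs
delta = rng.uniform(-8/255, 8/255, size=x.shape)   # l_inf-bounded perturbation
x_adv = np.clip(x + delta, 0, 1)

clean_pred = rng.integers(0, 10, size=100)         # stand-in predictions
adv_pred = clean_pred.copy()
adv_pred[:90] = (adv_pred[:90] + 1) % 10           # pretend 90 attacks succeed

# Attack Success Rate: fraction of inputs whose prediction changed.
asr = np.mean(adv_pred != clean_pred)

# Perturbation norms over the batch.
linf = np.abs(x_adv - x).max()                        # worst-case l_inf
l2 = np.linalg.norm(x_adv - x, axis=1).mean()         # mean l_2 per sample
l0 = np.count_nonzero(x_adv - x, axis=1).mean()       # mean changed coords

print(f"ASR={asr:.0%}, l_inf={linf:.4f}, mean l_2={l2:.3f}, mean l_0={l0:.1f}")
```

Note that clipping to the valid input range can only shrink the perturbation, so the reported $\ell_\infty$ never exceeds the $8/255$ budget.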

These metrics demonstrate that deep learning-based attacks remain potent even under severe information restrictions, minimal access, and strong resource constraints.

5. Defense Strategies and Mitigation

Countermeasures span both proactive and reactive strategies:

  • Adversarial training: Augmenting training with adversarial examples (e.g., PGD-generated) to harden the model against evasion.
  • Certified defenses: Providing provable robustness guarantees within a bounded perturbation region.
  • Input transformations: Denoising, quantization, or randomized preprocessing to disrupt adversarial perturbations before inference.
  • Reactive detection: Identifying adversarial or poisoned inputs and anomalous model behavior at deployment time.

No single defense eliminates all deep learning-based attacks, and the most effective approaches combine multiple mechanisms matched to the threat model and operational environment.

6. Open Challenges and Research Directions

  • Transferability and black-box potency: Modern attacks exploit transferability, universal perturbations, and proxy models to bypass limited access restrictions in practical deployments (Cao et al., 2021, Huang et al., 2022, Ma et al., 12 Dec 2025).
  • Real-World Deployment Gaps: Standard academic models are less indicative of real-system vulnerability; practical attacks require adaptation to quantization, hidden I/O, and proprietary frameworks (Deng et al., 2022).
  • Physical-World/Protocol Attacks: Empirical evidence highlights that attacks can be realized under real environmental constraints (lighting, air transmission, device synchronization) (Chang et al., 2023, Ma et al., 12 Dec 2025, Luo et al., 2021).
  • Hybrid, multi-layer, and explanation attacks: Emerging research targets the interpretability stack, federated learning, and automation of defense strategies, with questions about the robustness–accuracy trade-off and interpretability under adversarial manipulation (Abdukhamidov et al., 2022, Wang et al., 2024).
  • Automated and scalable defenses: The field is moving toward integrated, automated security architectures, zero-trust methodologies, and formal certification of deep models, especially as models and applications scale (Wang et al., 2024, Deng et al., 2021).

A persistent research focus remains on quantifying human-perceptual imperceptibility, lowering the cost of robust training and detection, and securing models across both digital and physical threat surfaces.

7. Representative Case Studies

  • Black-box Transfer Attacks on Mobile Apps: Substitute models trained via API logging and public datasets yield an average ASR of 66.6% on diverse, real-world apps, markedly exceeding prior methods (Cao et al., 2021).
  • Decision-based Sparse Attacks: The SparseEvo algorithm achieves 99% untargeted ASR on ImageNet within 5,000 queries, with highly sparse perturbations (~0.08% pixel change), demonstrating that even decision-only black-box access is not a sufficient defense (Vo et al., 2022).
  • Practical Physical Attacks on Video Compression: NetFlick adversarial flicker attacks achieve 92-98% ASR digitally and up to 86% with universal, online physical perturbations, drastically degrading PSNR and bitrates (Chang et al., 2023).
  • Trojan Injection by Live Weight Patching: Trojaned DNNs can be realized in-memory at runtime, with minimal patch size and high trigger stealthiness, showcasing feasibility for on-device and cloud deployments (Costales et al., 2020).
  • Attack on Quantum Key Distribution: A deep RNN-based continuous measurement scheme allows a spy to infer ~86% of the sifted key in BB84 QKD, for only a 2.6%-point QBER penalty, rivaling the optimal quantum cloner (Lejeune et al., 2024).

These examples underscore both the efficacy and the adaptability of deep learning-based attacks across technical domains.


References:

(Costales et al., 2020, Cao et al., 2021, Vo et al., 2022, Chang et al., 2023, Sadeghi et al., 2018, Lejeune et al., 2024, Mathews et al., 2022, Huang et al., 2020, Abdukhamidov et al., 2022, Ma et al., 12 Dec 2025, Luo et al., 2021, Deng et al., 2021, Nguyen et al., 2017, Huang et al., 2021, Wang et al., 2024, Deng et al., 2022, Huang et al., 2022, He et al., 2019).
