Targeted Feature-Dependent Noise
- Targeted feature-dependent noise is systematic corruption driven by spatial, semantic, and learned feature activations in machine learning datasets.
- It challenges standard models across supervised learning, image restoration, reinforcement learning, and adversarial robustness by violating uniform noise assumptions.
- Empirical and theoretical analyses reveal that specialized architectures and noise-aware methods are essential for mitigating feature-dependent noise impacts.
Targeted feature-dependent noise refers to the systematic introduction or occurrence of noise in samples whose corruption probability depends explicitly on specific feature representations, spatial locations, semantic attributes, or learned model activations. This concept has emerged as a central challenge in supervised learning, image restoration, adversarial robustness, reinforcement learning from preferences, and other settings where the assumption of independent or class-conditional noise fails to capture the complexity of real-world data corruptions. Recent research demonstrates that targeted, feature-dependent noise severely undermines the effectiveness of methods designed for uniform or class-conditional noise and requires domain-adaptive architecture modifications, specialized learning objectives, and rethinking foundational theoretical guarantees.
1. Formal Models of Feature-Dependent Noise
Feature-dependent noise generalizes the classical label-noise and data-degradation models by allowing the corruption rate and pattern to vary as a function of the input's feature vector, its extracted representation, or its latent trajectory in RL. In the supervised setting, label noise is governed by instance-dependent transition kernels (Tjandra et al., 2023), per-class/instance flip rates (Im et al., 2023), or noise transition matrices (Cheng et al., 2020). For preference-based RL, the observed preference label for a trajectory pair is flipped with a probability that is a function of trajectory feature similarity, magnitude, model uncertainty, or hybrid behavioral/uncertainty scores (Li et al., 5 Jan 2026).
In image restoration and denoising, targeted noise denotes spatially or spectrally localized degradation of intermediate feature maps, often detected by frequency-domain filters and classified at feature-map scale (Wang et al., 18 Sep 2025). Generative approaches synthesize feature-dependent noise by conditioning diffusion models or feature-wise affine modulations on exemplar noisy-clean pairs (Kim et al., 4 Dec 2025).
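As a concrete illustration, the supervised-setting model above can be instantiated as a flip probability that rises near a decision boundary. The sketch below is a minimal numpy simulation under assumed names and rates (`boundary_normal`, `base_rate`, `max_rate`, and the exponential decay are illustrative choices, not taken from the cited papers):

```python
import numpy as np

def flip_probability(X, boundary_normal, base_rate=0.05, max_rate=0.4, scale=1.0):
    """Instance-dependent flip rate: highest near the decision boundary,
    decaying exponentially with the margin |x . w| (illustrative model)."""
    margin = np.abs(X @ boundary_normal)
    return base_rate + (max_rate - base_rate) * np.exp(-margin / scale)

def inject_feature_dependent_noise(X, y, boundary_normal, rng):
    """Flip each binary label independently with its feature-dependent rate."""
    p = flip_probability(X, boundary_normal)
    flips = rng.random(len(y)) < p
    return np.where(flips, 1 - y, y), p

# Usage: labels near the separating hyperplane are corrupted most often.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
w = np.ones(5) / np.sqrt(5)
y = (X @ w > 0).astype(int)
y_noisy, p = inject_feature_dependent_noise(X, y, w, rng)
```

Unlike uniform flipping, the corrupted set here is concentrated in a specific feature-space region, which is exactly what defeats loss-based early-stopping heuristics.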
2. Theoretical Guarantees and Learning Bounds
Theoretical analysis of feature-dependent noise departs from the classical setting where noise is assumed i.i.d. or class-conditional (Oyen et al., 2022). Under instance- and label-dependent noise, sharp excess-risk bounds reveal an irreducible term scaling with the average corruption rate: for binary classification with instance-dependent flip rate $\tau(x)$, empirical risk minimization achieves excess risk of order $\bar{\tau}$, where $\bar{\tau}$ is the uniform bound on $\tau(x)$ (Im et al., 2023). Minimax lower bounds show that no estimator can drive the excess $0$-$1$ risk below this order without clean samples or strong assumptions, even if anchor points or margin conditions are imposed (Im et al., 2023, Tjandra et al., 2023).
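Written out schematically (the notation $\tau(x)$, $\bar{\tau}$ is introduced here for exposition; the exact constants and complexity terms are in the cited papers), the structure of these bounds is:

```latex
% Instance-dependent flip rate and its uniform bound
\tau(x) \;=\; \Pr\!\left[\tilde{y} \neq y \mid x\right],
\qquad \bar{\tau} \;=\; \sup_{x} \tau(x)

% Upper bound: ERM on noisy labels pays an irreducible \bar{\tau} term
R(\hat{f}) - R(f^{*}) \;\lesssim\; \bar{\tau}
  \;+\; \sqrt{\frac{\mathrm{comp}(\mathcal{F})}{n}}

% Minimax lower bound: no estimator escapes the \bar{\tau} floor
\inf_{\hat{f}} \; \sup_{P} \; \mathbb{E}\!\left[R(\hat{f}) - R(f^{*})\right]
  \;\gtrsim\; \bar{\tau}
```

The statistical term shrinks with more noisy data, but the $\bar{\tau}$ term does not, which is why clean samples or structural assumptions are required.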
For progressive label correction algorithms under Polynomial-Margin Diminishing (PMD) noise, convergence to the Bayes optimal classifier occurs in pure regions of the feature space, provided flip probabilities decay away from the decision boundary and regularity assumptions on the hypothesis class and feature density hold (Zhang et al., 2021). In preference-based RL, feature-dependent noise disrupts denoising and learning even at low noise fractions, especially when the noise overlaps regions of high model uncertainty or semantic similarity (Li et al., 5 Jan 2026).
3. Algorithmic Frameworks and Architectural Designs
Approaches to handling feature-dependent noise fall into several families, each employing distinct strategies to cope with the challenges of targeted corruption:
- Sample sieve and confidence regularization: CORES uses a confidence-regularized loss and dynamic sample sieving to separate clean and noisy examples without explicitly estimating the noise transition matrix, achieving high precision in corrupt-example filtering and provable robustness to arbitrary instance-dependent noise (Cheng et al., 2020).
- Anchor point alignment sets: A two-stage process leverages a small set of anchor points (alignment set) with known true/noisy pairs to model noise confidence and subgroup-dependent clean rates, enabling robust performance and fairness in the presence of systematic, instance-dependent noise (Tjandra et al., 2023).
- Progressive label correction: Iteratively refines labels by trusting only high-confidence predictions in pure regions, gradually incorporating more samples as the confidence margin shrinks, and guaranteeing Bayes consistency under PMD noise patterns (Zhang et al., 2021).
- Guided feature modulation for generative noise synthesis: GuidNoise implements guidance-aware affine feature modification (GAFM) with a noisy-clean reference pair, enabling feature-level control of synthetic noise via modulation at each layer during sampling and training (Kim et al., 4 Dec 2025).
- Feature denoising for restoration and SR: Targeted feature denoising (TFD) integrates frequency-domain noise detection, spatial/frequency attention-based denoising modules, and gating so only contaminated feature slices are processed, yielding substantial PSNR/LPIPS improvements under real and synthetic noise (Wang et al., 18 Sep 2025).
- Feature-space mixup for adversarial robustness: Clean Feature Mixup and Feature Tuning Mixup inject random and optimizable perturbations into deep feature representations, simulating adversarial and friendly competitors to bridge decision boundaries and enhance attack transferability (Byun et al., 2023, Liang et al., 2024).
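Of these families, the sample-sieve idea is compact enough to sketch directly. The following is a simplified numpy rendering of confidence-regularized sieving; the fixed zero threshold and plain regularizer are stand-ins for the dynamic schedule used by the actual CORES method:

```python
import numpy as np

def confidence_regularized_loss(probs, labels, beta=1.0):
    """Per-sample cross-entropy on the given label, minus beta times the
    mean cross-entropy over all classes; confident fits score negative."""
    eps = 1e-12
    ce = -np.log(probs[np.arange(len(labels)), labels] + eps)
    mean_ce = -np.log(probs + eps).mean(axis=1)
    return ce - beta * mean_ce

def sample_sieve(probs, labels, beta=1.0, threshold=0.0):
    """Boolean mask of samples kept as 'likely clean': those the model fits
    on their given label more confidently than on an average class."""
    return confidence_regularized_loss(probs, labels, beta) <= threshold
```

In the full method the threshold evolves over training epochs and the regularizer is derived from the confidence-regularization analysis; here both are fixed for clarity.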
4. Empirical Phenomena and Experimental Findings
Targeted feature-dependent noise exhibits distinct empirical behavior compared to random or class-conditional noise:
- Learning dynamics: Noisy samples affected by feature-dependent noise are often memorized as easily as clean samples, violating the assumption that noisy examples yield high early loss. Label recall curves for pseudo-noisy datasets show synchronized learning of clean and noisy data, while randomized noise leads to memorization collapse (Kamabattula et al., 2021).
- Hardness in RL and restoration: In PbRL, performance under feature-dependent noise can deteriorate more sharply than under uniform noise at equal corruption rates, with uncertainty-aware and hybrid noise models producing dramatic drops in episodic return (Table 1 and Table 6 in (Li et al., 5 Jan 2026)). In restoration/SR, models overfit noise and require feature-localized denoising for generalizability (Wang et al., 18 Sep 2025).
- Robustness limitations: Most robust learning methods tuned for random flips (loss correction, robust losses, early stopping) fail to generalize under feature-dependent noise; sample selection and per-feature monitoring provide partial remedy (Kamabattula et al., 2021, Cheng et al., 2020).
- Adversarial settings: Simulating feature-space competition via mixup approaches increases attack transferability (e.g., CFM raising targeted success to 74.6% vs. 49.4% for baselines (Byun et al., 2023), and FTM/FTM-E further improving success to 77.4%/79.5% (Liang et al., 2024)).
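The competition-simulating perturbation behind these transferability gains can be sketched in a few lines. The numpy version below is a simplified stand-in for CFM-style mixup, pairing each sample's deep-feature vector with another in-batch sample's features rather than with a stored clean-feature bank as in the original method:

```python
import numpy as np

def feature_mixup(features, rng, max_lam=0.2, p=0.5):
    """With probability p per call, mix each row of a (batch, dim) deep-feature
    matrix with a randomly paired row, using a small random coefficient.
    Simulates 'competitor' features that blur decision boundaries."""
    if rng.random() > p:
        return features
    n = features.shape[0]
    idx = rng.permutation(n)                     # random pairing of samples
    lam = rng.uniform(0.0, max_lam, size=(n, 1)) # small mixing coefficients
    return (1.0 - lam) * features + lam * features[idx]
```

Because the coefficient stays small, each perturbed vector remains close to its original while still injecting feature-level diversity during attack optimization.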
5. Domains of Application
Feature-dependent noise arises across multiple domains:
- Supervised learning with human-annotated datasets: Instance bias, systematic subgroup mislabeling, and task difficulty drive feature-related corruption in medical, vision, and text applications (Tjandra et al., 2023, Kamabattula et al., 2021).
- Image restoration and super-resolution: Overfitting to noise rather than blur or JPEG artifacts requires targeted detection and denoising at intermediate feature stages (Wang et al., 18 Sep 2025).
- Adversarial and transfer attacks: Feature mixup and tuning methods explicitly craft adversarial perturbations conditioned on underlying model features rather than raw pixel-space (Byun et al., 2023, Liang et al., 2024).
- Preference-based RL: Teacher and LLM noise frequently depend on trajectory features, model uncertainty, and visual similarity, producing failure modes unique to FDN (Li et al., 5 Jan 2026).
6. Limitations and Future Directions
Current methods for uniform or class-dependent noise do not yield satisfactory performance under feature-dependent noise. Key open problems and research directions include:
- Structure modeling: Developing models that accurately capture the mapping from input features to noise probabilities, possibly via auxiliary networks or expectation-maximization over noise latents (Tjandra et al., 2023, Li et al., 5 Jan 2026).
- Feature-aware denoising and filtering: Using nearest-neighbour, contrastive, or clustering approaches in feature space to detect correlated regions of high-noise and adapt sample weights (Li et al., 5 Jan 2026, Cheng et al., 2020).
- Active and robust querying: Avoiding ambiguous regions in RL and active learning, and incorporating psychological priors to simulate realistic teacher biases (Li et al., 5 Jan 2026).
- Generalizable architectures: Modular integration of noise detection/denoising (TFD), guidance modulation (GuidNoise), or competitive mixup (CFM, FTM) as plug-in components for models in vision, reinforcement learning, and NLP (Kim et al., 4 Dec 2025, Wang et al., 18 Sep 2025).
- Empirical protocols: Datasets with reproducible, ground-truth-equipped feature-dependent noise (e.g., pseudo-noisy frameworks) are essential for benchmarking future robust algorithms (Kamabattula et al., 2021).
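As one generic example of the feature-aware filtering direction, a nearest-neighbour agreement score in feature space can surface candidate noisy samples; this is an illustrative sketch of the idea, not a method from the cited papers:

```python
import numpy as np

def knn_label_agreement(features, labels, k=5):
    """Fraction of each sample's k nearest feature-space neighbours that
    share its label; low agreement flags candidate feature-dependent noise."""
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)        # exclude self-matches
    nn = np.argsort(d, axis=1)[:, :k]  # indices of the k nearest neighbours
    return (labels[nn] == labels[:, None]).mean(axis=1)
```

Agreement scores can then drive sample reweighting or abstention; for large datasets the pairwise distance matrix would be replaced by an approximate nearest-neighbour index.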
7. Summary Table: Key Properties Across Domains
| Setting | Noise dependency model | Key algorithms/components | Impact of targeted FDN |
|---|---|---|---|
| Supervised learning (classification) | Instance- and label-dependent flip rates | Sample sieve, alignment set, PLC | Irreducible excess risk; fairness issues |
| Image restoration/SR | Spatially/spectrally localized feature-map corruption | TFD module, GuidNoise, ND gating | Overfitting to noise; content preservation |
| Adversarial transfer attacks | Deep features | CFM, FTM, mixup in feature space | Boosted transferability rates |
| Preference-based RL | Trajectory features | RIME denoising, feature-aware filtering | Algorithm collapse at low corruption rates |
Empirical and theoretical advances reveal that targeted feature-dependent noise constitutes a fundamentally harder problem, reshaping assumptions about robustness, sample selection, and generalization in modern machine learning. Progress in this area will depend on feature-conditioned modeling, domain-adaptive correction frameworks, and evolving empirical protocols for real-world noise analysis.