FaultDiffusion: Few-Shot Fault Generation

Updated 26 December 2025
  • FaultDiffusion is a generative modeling framework that synthesizes fault time-series by leveraging abundant normal data and a few fault examples.
  • It uses a two-stage training process with a diffusion backbone pre-trained on normal data followed by fine-tuning a positive–negative difference adapter for fault adaptation.
  • The framework incorporates a diversity regularizer to prevent mode collapse, achieving strong, benchmarked performance on industrial fault datasets.

FaultDiffusion is a generative modeling framework for few-shot fault time-series generation developed for scenarios with abundant normal (healthy) multivariate sensor data and scarce annotated fault (anomalous) data. Addressing the challenge of generating diverse, realistic synthetic fault samples despite limited faulty examples, FaultDiffusion leverages a diffusion model backbone pre-trained on normal series, then fine-tunes a lightweight positive-negative difference adapter on a small set of fault traces while preventing mode collapse with a novel diversity regularizer. The approach is distinguished by its two-stage training, architectural innovations for domain adaptation, and strong empirical performance across industrial benchmark datasets (Xu et al., 19 Nov 2025).

1. Problem Formulation and Motivating Context

The fundamental problem addressed by FaultDiffusion is the reliable synthesis of fault time-series under an extreme data imbalance regime, typical in industrial equipment monitoring. The multivariate time-series is denoted $X_{1:\tau} = (x_1, \ldots, x_\tau) \in \mathbb{R}^{\tau \times d}$, with a large normal dataset $\mathcal{D}_N = \{ X^n_{1:\tau,i} \}_{i=1}^{N_n}$ and a small $K$-shot fault set $\mathcal{D}_F = \{ X^f_{1:\tau,j} \}_{j=1}^{K}$, where $K \ll N_n$. The objective is to train a generator $G$ such that its output distribution $\hat{p}_f(x)$ closely matches the true fault distribution $p_f(x)$. This is formalized as

$$p_f(x) = p_n(x) + \Delta_\theta(x),$$

where $p_n$ is the normal data distribution and $\Delta_\theta(x)$ is a learned correction capturing the domain shift (Xu et al., 19 Nov 2025).
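
For concreteness, the following is a minimal sketch of this data regime; the channel count and dataset sizes are illustrative assumptions (only the window length of 24 matches the benchmark setting reported later), not values prescribed by the paper.

```python
# Minimal sketch of the few-shot data regime: abundant normal windows D_N and a
# K-shot fault set D_F. Shapes are illustrative assumptions, not the paper's.
import numpy as np

tau, d = 24, 8          # window length and number of sensor channels (d assumed)
N_n, K = 10_000, 5      # abundant normal windows vs. a K-shot fault set

D_N = np.random.randn(N_n, tau, d)   # normal (healthy) multivariate windows
D_F = np.random.randn(K, tau, d)     # scarce annotated fault windows, K << N_n
```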

2. Diffusion Model Backbone

FaultDiffusion builds on the Denoising Diffusion Probabilistic Model (DDPM) framework, which comprises a forward noising process and a learned reverse process. The forward process evolves as

$$q(x_t \mid x_{t-1}) = \mathcal{N}\bigl(x_t;\, \sqrt{1 - \beta_t}\, x_{t-1},\, \beta_t I\bigr),$$

with $t = 1, \ldots, T$ and variance schedule $\{\beta_t\}$. Marginalizing over the intermediate steps gives the closed-form noising distribution

$$q(x_t \mid x_0) = \mathcal{N}\bigl(x_t;\, \sqrt{\bar\alpha_t}\, x_0,\, (1-\bar\alpha_t) I\bigr),$$

where $\bar\alpha_t = \prod_{s=1}^{t} (1 - \beta_s)$. The reverse process is parameterized as

$$p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\bigl(x_{t-1};\, \mu_\theta(x_t, t),\, \Sigma_\theta(x_t, t)\bigr).$$

Training proceeds by minimizing a noise prediction loss derived from the ELBO:

$$\mathcal{L}_{\text{diff}} = \mathbb{E}_{t, x_0, \epsilon}\bigl[\,\| \epsilon - \epsilon_\theta(x_t, t) \|_2^2\,\bigr],$$

where $x_t = \sqrt{\bar\alpha_t}\, x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$ (Xu et al., 19 Nov 2025).
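
A minimal PyTorch-style sketch of this noise-prediction objective is given below. The linear variance schedule, $T = 1000$, and the callable `eps_model` are assumptions standing in for the paper's actual backbone and schedule.

```python
# Sketch of the DDPM noise-prediction loss L_diff, assuming eps_model(x_t, t)
# returns the predicted noise for a batch of time-series windows.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # linear variance schedule (assumed)
alpha_bars = torch.cumprod(1.0 - betas, dim=0)  # \bar{alpha}_t = prod_s (1 - beta_s)

def diffusion_loss(eps_model, x0):
    """L_diff = E_{t, x0, eps} || eps - eps_theta(x_t, t) ||_2^2."""
    b = x0.shape[0]
    t = torch.randint(0, T, (b,), device=x0.device)        # uniform random timestep
    eps = torch.randn_like(x0)                             # target noise
    a_bar = alpha_bars.to(x0.device)[t].view(b, *([1] * (x0.dim() - 1)))
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps   # forward noising q(x_t | x_0)
    return ((eps - eps_model(x_t, t)) ** 2).mean()         # noise-prediction MSE
```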

3. Positive–Negative Difference Adapter

To enable few-shot adaptation from normal to fault domains, FaultDiffusion introduces a positive–negative difference adapter. During fine-tuning on $\mathcal{D}_F$, only the adapter’s parameters are updated while the diffusion backbone remains frozen. At each network layer $t$, the adapter receives as input the backbone hidden activations $h^{(t)}_{\mathrm{back}}$ and an accumulation of prior adapter outputs:

$$h^{(t)}_{\mathrm{in}} = h^{(t)}_{\mathrm{back}} + \sum_{k=1}^{t-1} h^{(k)}_{\mathrm{loc}}.$$

The adapter mechanism employs sliding-window multi-head attention, conferring sensitivity to local temporal anomalies. Its output $h^{(t)}_{\mathrm{loc}}$ is fused through a residual connection:

$$h^{(t+1)}_{\mathrm{back}} = h^{(t)}_{\mathrm{back}} + \alpha\, h^{(t)}_{\mathrm{loc}},$$

where $\alpha$ is a learnable scaling factor. This enables effective modeling of $\Delta_\theta(x) \approx p_f(x) - p_n(x)$, conditioning the generation process on learned fault signatures (Xu et al., 19 Nov 2025).
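
The following is a minimal sketch of one such adapter layer, assuming backbone hidden states of shape (batch, $\tau$, hidden); the window size, head count, and exact fusion points are assumptions rather than the authors' implementation.

```python
# Sketch of a positive-negative difference adapter layer: sliding-window
# multi-head attention over the accumulated adapter stream, fused back into the
# frozen backbone stream via a learnable residual scale alpha.
import torch
import torch.nn as nn

class DifferenceAdapterLayer(nn.Module):
    def __init__(self, hidden: int, window: int = 5, n_heads: int = 4):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(hidden, n_heads, batch_first=True)
        self.alpha = nn.Parameter(torch.zeros(1))  # learnable residual scaling factor

    def _local_mask(self, tau: int, device) -> torch.Tensor:
        # Boolean mask: position i may only attend to positions within +/- window.
        idx = torch.arange(tau, device=device)
        return (idx[None, :] - idx[:, None]).abs() > self.window

    def forward(self, h_back: torch.Tensor, h_loc_sum: torch.Tensor):
        # h_in^(t) = h_back^(t) + sum_{k<t} h_loc^(k)   (accumulated adapter outputs)
        h_in = h_back + h_loc_sum
        mask = self._local_mask(h_in.shape[1], h_in.device)
        h_loc, _ = self.attn(h_in, h_in, h_in, attn_mask=mask)
        # h_back^(t+1) = h_back^(t) + alpha * h_loc^(t)   (residual fusion)
        return h_back + self.alpha * h_loc, h_loc
```

During fine-tuning, only the parameters of such adapter layers (including $\alpha$) would be trainable, with the backbone weights kept frozen.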

4. Diversity Regularization and Training Procedure

To prevent mode collapse in the low-data regime, FaultDiffusion utilizes an inter-sample diversity loss:

$$\mathcal{L}_{\mathrm{div}} = \mathbb{E}\bigl[\,\| s_1 - s_2 \|_2^2\,\bigr],$$

where $s_1, s_2$ are independent noise-prediction samples from the network for the same input. The final fine-tuning loss is

$$\mathcal{L} = \underbrace{\mathcal{L}_{\mathrm{diff}}}_{\text{denoising}} + \lambda_{\mathrm{adv}}\, \underbrace{\mathcal{L}_{\mathrm{adapter}}}_{\text{(optional) direct supervision of } \Delta} + \lambda_{\mathrm{div}}\, \underbrace{\mathcal{L}_{\mathrm{div}}}_{\text{diversity}},$$

with $\mathcal{L}_{\mathrm{adapter}}$ optionally enforcing an $\ell_2$ penalty on the adapter output, and $\lambda_{\mathrm{adv}}, \lambda_{\mathrm{div}}$ balancing the terms.
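
A minimal sketch of the diversity term and the combined fine-tuning objective follows, assuming the pair $(s_1, s_2)$ comes from two independent forward-noisings of the same clean window; the exact pair construction and the weights $\lambda_{\mathrm{adv}}, \lambda_{\mathrm{div}}$ are assumptions, not the paper's settings.

```python
# Sketch of the inter-sample diversity term and the combined fine-tuning loss.
import torch

def diversity_term(eps_model, x0, t, a_bar):
    # Two independent noise draws for the same clean window x0 (an assumption
    # about how (s1, s2) are formed; the paper's construction may differ).
    eps1, eps2 = torch.randn_like(x0), torch.randn_like(x0)
    s1 = eps_model(a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * eps1, t)
    s2 = eps_model(a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * eps2, t)
    return ((s1 - s2) ** 2).mean()  # E[ || s1 - s2 ||_2^2 ]

def finetune_loss(l_diff, l_adapter, l_div, lam_adv=0.1, lam_div=0.01):
    # L = L_diff + lambda_adv * L_adapter + lambda_div * L_div
    # (the lambda values here are placeholders, not the paper's settings)
    return l_diff + lam_adv * l_adapter + lam_div * l_div
```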

The algorithm consists of:

  • Fine-tuning: sampling mini-batches from $\mathcal{D}_F$, adding noise, passing through the frozen backbone and trainable adapter, evaluating the losses, and updating the adapter parameters.
  • Sampling: iterative denoising from $x_T \sim \mathcal{N}(0, I)$, with the adapter correcting the backbone outputs at each step (Xu et al., 19 Nov 2025).
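
A minimal sketch of the sampling step is shown below, using standard DDPM ancestral sampling with the adapter-corrected noise predictor; the choice of posterior variance ($\beta_t$) is an assumption.

```python
# Sketch of iterative denoising from x_T ~ N(0, I); eps_model is the frozen
# backbone with the trained adapter correcting its outputs at each step.
import torch

@torch.no_grad()
def sample(eps_model, shape, betas, device="cpu"):
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    x = torch.randn(shape, device=device)                  # x_T ~ N(0, I)
    for t in reversed(range(len(betas))):
        t_batch = torch.full((shape[0],), t, device=device, dtype=torch.long)
        eps = eps_model(x, t_batch)                        # adapter-corrected noise estimate
        mean = (x - betas[t] / (1 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt()
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + betas[t].sqrt() * noise                 # posterior variance beta_t (assumed)
    return x
```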

5. Experimental Protocols and Evaluation Metrics

FaultDiffusion is evaluated on a custom industrial dataset (15 fault types), the Tennessee Eastman Process (TEP; 6 faults), and DAMADICS (4 valve-fault types), following a strict few-shot protocol with $K \in \{1, 5\}$. Baselines include TimeGAN, TimeVAE, Cot-GAN, and Diffusion-TS (the latter trained jointly on normal and fault examples).

The evaluation framework comprises:

  • Context-FID: Local contextual Fréchet distance measuring authenticity.
  • Correlational Score: Error on cross-correlation matrices.
  • Discriminative Score: AUC of a classifier trained to distinguish real from synthetic samples (a minimal sketch follows this list).
  • Predictive Score: TSTR (train-on-synthetic, test-on-real) forecasting error.
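
As referenced in the list above, the following is a generic sketch of a discriminative score, assuming flattened windows and a logistic-regression discriminator; the paper's actual classifier and any score normalization are not reproduced here.

```python
# Sketch of a real-vs-synthetic discriminative score reported as an AUC.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def discriminative_score(real: np.ndarray, synth: np.ndarray) -> float:
    # Label real windows 1 and synthetic windows 0; flatten each window to a vector.
    X = np.concatenate([real, synth]).reshape(len(real) + len(synth), -1)
    y = np.concatenate([np.ones(len(real)), np.zeros(len(synth))])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
```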

The following table summarizes results on the industrial dataset (sequence length 24, lower is better):

| Method | Context-FID | Corr. Score | Disc. Score | Pred. Score |
|---|---|---|---|---|
| Cot-GAN | 6.336 | 142.72 | 0.436 | 0.133 |
| TimeGAN | 7.025 | 137.12 | 0.438 | 0.137 |
| TimeVAE | 5.990 | 134.21 | 0.438 | 0.115 |
| Diffusion-TS | 6.728 | 117.44 | 0.465 | 0.135 |
| FaultDiffusion (Ours) | 6.081 | 127.37 | 0.415 | 0.131 |

On public TEP/DAMADICS datasets, FaultDiffusion achieves the lowest context-FID and predictive errors in 8/10 fault settings and provides competitive correlational and discriminative scores (Xu et al., 19 Nov 2025).

6. Ablation Studies and Downstream Utility

Ablation results on the industrial set indicate the criticality of both the adapter and the diversity loss. Removing both yields a context-FID of $10.47$; the adapter only, $7.35$; the diversity loss only, $8.21$; and the full model, $5.12$. For downstream time-series classification, training a 15-way classifier on synthetic data gives $0.8933$ accuracy ($\sim 20\%$ higher than Diffusion-TS at $0.7413$) (Xu et al., 19 Nov 2025).

This suggests FaultDiffusion’s generative outputs are both diverse and task-relevant for subsequent fault diagnosis pipelines.

7. Significance and Architectural Insights

FaultDiffusion demonstrates that pre-training a diffusion backbone on abundant normal data, then fine-tuning a compact, residual adapter to model the fault-normal domain gap, is highly effective in few-shot settings. The lightweight fine-tuning is particularly advantageous when fault annotations are rare and costly, which is common in industrial contexts. The explicit diversity regularizer addresses a major limitation of conventional generative methods when data is scarce, promoting better intra-class variability in generated samples.

A plausible implication is that such adapter-based approaches, combined with self-supervised pretraining on normal data and explicit diversity constraints, may generalize to other rare-event synthesis tasks beyond fault diagnosis (Xu et al., 19 Nov 2025).
