Diffusion Bridge Models for 3D Medical Image Translation

Published 21 Apr 2025 in cs.CV | (2504.15267v1)

Abstract: Diffusion tensor imaging (DTI) provides crucial insights into the microstructure of the human brain, but it can be time-consuming to acquire compared to more readily available T1-weighted (T1w) magnetic resonance imaging (MRI). To address this challenge, we propose a diffusion bridge model for 3D brain image translation between T1w MRI and DTI modalities. Our model learns to generate high-quality DTI fractional anisotropy (FA) images from T1w images and vice versa, enabling cross-modality data augmentation and reducing the need for extensive DTI acquisition. We evaluate our approach using perceptual similarity, pixel-level agreement, and distributional consistency metrics, demonstrating strong performance in capturing anatomical structures and preserving information on white matter integrity. The practical utility of the synthetic data is validated through sex classification and Alzheimer's disease classification tasks, where the generated images achieve comparable performance to real data. Our diffusion bridge model offers a promising solution for improving neuroimaging datasets and supporting clinical decision-making, with the potential to significantly impact neuroimaging research and clinical practice.

Summary

The paper presents a novel diffusion bridge model that effectively translates 3D T1w MRI to DTI-derived FA images with high anatomical fidelity.
The methodology leverages a modified UNet architecture with scale-shift normalization and tailored attention mechanisms to process 3D data.
Experiments report strong image-quality scores between real and synthetic FA images (MS-SSIM, PSNR, MMD), and Alzheimer's disease classification on synthetic data performs comparably to real data.

Introduction

The paper "Diffusion Bridge Models for 3D Medical Image Translation" (2504.15267) introduces a novel approach to bridge the modality gap between T1-weighted (T1w) MRI and diffusion tensor imaging (DTI). DTI is valued for its insights into the microstructure of the human brain but takes longer to acquire than T1w MRI. The proposed diffusion bridge model generates DTI fractional anisotropy (FA) images from T1w images and vice versa, which reduces the need for extensive DTI acquisition. The model is validated with image-similarity metrics and with two downstream tasks, sex classification and Alzheimer's disease classification, which together indicate its effectiveness and potential clinical utility.

Figure 1: Overall framework of diffusion bridge models for 3D medical image translation.

Methodology

Problem Formulation

The task is to translate paired 3D medical images across modalities. The paper formulates this as learning a conditional distribution that maps source images (T1w MRI) to target images (DTI-derived FA maps). Given a dataset of paired images $\{x_1^i, x_0^i\}$, where $x_1$ is the source and $x_0$ the target, the training goal is to model the distribution $\pi_{0|1}(x_0|x_1)$ so that target images can be predicted for previously unseen source images.

Diffusion Bridge Models

The diffusion bridge model constructs a stochastic process $p_t$ connecting source and target distributions using a Gaussian transition kernel:

$p_{t|0,1}(x_t|x_0, x_1) = \mathcal{N}(\alpha_t x_0 + \beta_t x_1, \gamma_t^2 \mathbb{I}),$

where the coefficients $\alpha_t$, $\beta_t$, and $\gamma_t$ are chosen so that the process matches the target and source distributions at its endpoints (e.g., $\alpha_0 = \beta_1 = 1$).
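The Gaussian kernel above can be sampled directly once a schedule is fixed. The following is a minimal sketch, assuming a Brownian-bridge style schedule ($\alpha_t = 1 - t$, $\beta_t = t$, $\gamma_t = \sigma\sqrt{t(1-t)}$); this schedule, the function names, and the `sigma` parameter are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def bridge_coefficients(t, sigma=1.0):
    """Assumed Brownian-bridge schedule (not the paper's): returns (alpha_t, beta_t, gamma_t)."""
    alpha_t = 1.0 - t                         # weight on the target x_0
    beta_t = t                                # weight on the source x_1
    gamma_t = sigma * np.sqrt(t * (1.0 - t))  # noise scale, zero at both endpoints
    return alpha_t, beta_t, gamma_t

def sample_bridge(x0, x1, t, rng, sigma=1.0):
    """Draw x_t ~ N(alpha_t * x0 + beta_t * x1, gamma_t^2 * I)."""
    alpha_t, beta_t, gamma_t = bridge_coefficients(t, sigma)
    noise = rng.standard_normal(x0.shape)
    return alpha_t * x0 + beta_t * x1 + gamma_t * noise

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 4, 4))  # stand-in for a target FA volume
x1 = rng.standard_normal((4, 4, 4))  # stand-in for a source T1w volume
x_mid = sample_bridge(x0, x1, 0.5, rng)
```

With this schedule, `sample_bridge` returns `x0` exactly at `t = 0` and `x1` exactly at `t = 1`, which is the boundary behavior the kernel is meant to guarantee.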

Training approximates the unknown ideal denoiser $\hat{x}_0^*(t, x_t, x_1)$ with a learned denoiser $\hat{x}_0^{\theta}$. The paper uses a UNet architecture modified to handle 3D data. The modifications include enhanced conditioning, scale-shift normalization, and attention mechanisms tailored to medical imaging tasks.
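To illustrate what scale-shift normalization usually means in diffusion UNets, here is a minimal NumPy sketch: a conditioning embedding is linearly projected to a per-channel scale and shift, which then modulate the group-normalized features. The shapes, the function names, and the `(1 + scale)` convention are assumptions based on common practice; the paper's exact layer may differ.

```python
import numpy as np

def group_norm(h, num_groups, eps=1e-5):
    """Normalize a (C, D, H, W) feature volume within groups of channels."""
    grouped = h.reshape(num_groups, -1)  # C must be divisible by num_groups
    mean = grouped.mean(axis=1, keepdims=True)
    var = grouped.var(axis=1, keepdims=True)
    return ((grouped - mean) / np.sqrt(var + eps)).reshape(h.shape)

def scale_shift_norm(h, emb, w, b, num_groups=2):
    """Modulate normalized features with a scale and shift predicted from emb.

    h: (C, D, H, W) features; emb: (E,) conditioning embedding;
    w: (E, 2C) and b: (2C,) form a hypothetical linear projection.
    """
    c = h.shape[0]
    scale_shift = emb @ w + b
    scale = scale_shift[:c].reshape(c, 1, 1, 1)
    shift = scale_shift[c:].reshape(c, 1, 1, 1)
    return group_norm(h, num_groups) * (1.0 + scale) + shift

rng = np.random.default_rng(0)
h = rng.standard_normal((4, 3, 3, 3))
emb = rng.standard_normal(8)
w = np.zeros((8, 8))  # zero projection: the layer reduces to plain group norm
b = np.zeros(8)
out = scale_shift_norm(h, emb, w, b)
```

The design choice here is that the embedding (for example, of the timestep) controls feature statistics multiplicatively and additively, rather than only being added to the features.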

Experiments

Data and Preprocessing

The experiments use data from 1,114 participants of the Alzheimer's Disease Neuroimaging Initiative (ADNI). Preprocessing included skull stripping and registration, and image intensities were normalized so that the dataset was standardized for training and evaluating the diffusion bridge models.
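The normalization scheme is not specified here. As one plausible sketch, the following clips a volume to robust percentiles and rescales it to [-1, 1]; the percentile bounds, the output range, and the function name are assumptions for illustration only.

```python
import numpy as np

def normalize_intensity(volume, low_pct=0.5, high_pct=99.5):
    """Clip to [low_pct, high_pct] percentiles, then rescale linearly to [-1, 1]."""
    lo, hi = np.percentile(volume, [low_pct, high_pct])
    if hi <= lo:
        # Constant volume: nothing to rescale.
        return np.zeros_like(volume, dtype=np.float32)
    clipped = np.clip(volume, lo, hi)
    return (2.0 * (clipped - lo) / (hi - lo) - 1.0).astype(np.float32)

rng = np.random.default_rng(0)
volume = rng.uniform(0.0, 4000.0, size=(8, 8, 8))  # stand-in for a skull-stripped T1w volume
normalized = normalize_intensity(volume)
```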

Evaluation Metrics

The evaluation metrics included:

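MS-SSIM (multi-scale structural similarity): Measures perceptual similarity between generated and reference images; higher is better.
PSNR (peak signal-to-noise ratio): Quantifies pixel-wise agreement; higher is better.
MMD (maximum mean discrepancy): Measures the distance between the distributions of real and generated image sets; lower is better.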
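Two of these metrics are short enough to sketch directly. The code below is a minimal illustration of PSNR and of a biased squared-MMD estimate with an RBF kernel. The kernel choice, the `bandwidth` value, and the use of flattened feature vectors are assumptions; the paper's exact settings are not given here.

```python
import numpy as np

def psnr(reference, generated, data_range=1.0):
    """Peak signal-to-noise ratio in dB; data_range is the maximum possible intensity span."""
    mse = np.mean((reference - generated) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)

def rbf_mmd2(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD between sample sets x (n, d) and y (m, d)."""
    def kernel(a, b):
        sq_dist = np.sum(a ** 2, axis=1)[:, None] + np.sum(b ** 2, axis=1)[None, :] - 2.0 * a @ b.T
        return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth ** 2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()

rng = np.random.default_rng(0)
real = rng.standard_normal((16, 5))         # stand-in feature vectors of real images
fake_close = rng.standard_normal((16, 5))   # same distribution as `real`
fake_far = fake_close + 3.0                 # shifted distribution
mmd_close = rbf_mmd2(real, fake_close)
mmd_far = rbf_mmd2(real, fake_far)
```

In this toy setup the shifted set yields the larger MMD, which matches the reading that a lower MMD indicates closer distributions.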

Consistently strong scores on these metrics support the model's ability to preserve anatomical fidelity and structural integrity across neuroimaging modalities.

Figure 2: Image translation from T1 to FA with 3 subjects representing healthy controls, MCI, and Alzheimer's cases.

Results

Achievements in Image Translation

The diffusion bridge models exhibit high MS-SSIM scores (> 0.9) across different anatomical views (axial, sagittal, coronal), affirming strong anatomical detail retention.

Figure 3: 2D MS-SSIM between real and synthetic FA images across different views.

Figure 4: 3D MS-SSIM, PSNR and MMD evaluation between real and synthetic FA images across 167 subjects.

Downstream Tasks

On downstream tasks, synthetic data generated with the SDE sampler achieved Alzheimer's disease classification accuracy comparable to that of real data. Sex classification performance was slightly lower, which suggests that some anatomical detail is not fully preserved.

ODE sampling is deterministic, which supports reproducibility, whereas stochastic (SDE) sampling yields better reconstruction quality.

Conclusion

The diffusion bridge model is a potent tool for cross-modality image translation, generating anatomically consistent images by preserving white matter pathway integrity. Its potential applications span neuroimaging research and clinical practice, offering a viable alternative in settings constrained by time or resource limitations.

Future directions include expanding the dataset size and diversity, enhancing model capacity, and exploring synthesis across other pathologies and modalities. The integration of the diffusion bridge framework with advanced imaging paradigms, such as pan-contrast MRI synthesis or ODF generation, could further refine neuroimaging solutions impacting clinical and research domains.
