Tada-DIP: Adaptive 3D Reconstruction
- Tada-DIP is an input-adaptive deep image prior framework for one-shot 3D image reconstruction, leveraging denoising regularization and an adaptive input mechanism to improve reconstruction quality.
- It employs a 3D U-Net with exponential moving average updates to stabilize optimization and mitigate overfitting in sparse-view CT setups.
- Quantitative evaluations show that Tada-DIP achieves reconstruction quality on par with supervised methods while avoiding artifacts common in traditional iterative approaches.
Tada-DIP refers to the “Input-adaptive Deep Image Prior” framework for one-shot three-dimensional (3D) image reconstruction from sparse or incomplete data, specifically tailored for inverse problems such as sparse-view computed tomography (CT). Developed as an advance over standard Deep Image Prior (DIP) methods, Tada-DIP achieves high-fidelity 3D reconstruction by augmenting the DIP paradigm with input adaptation and denoising-based regularization. These innovations yield reconstructions matching supervised deep networks, while retaining the training-data-free, zero-shot nature of classical DIP approaches (Bell et al., 3 Dec 2025).
1. The Inverse Problem and Baseline DIP Limitations
Tada-DIP addresses 3D linear inverse problems of the form
$$y = Ax + \varepsilon,$$
where $A \in \mathbb{R}^{m \times n}$ is the forward operator (e.g., parallel-beam CT projection), $x \in \mathbb{R}^n$ is an unknown 3D volume, and $m \ll n$ in sparse-view scenarios. The task is to recover $x$ from $y$ without access to a large corpus of ground-truth 3D scans. The baseline DIP approach leverages an untrained convolutional neural network (CNN), $f_\theta$, mapping a fixed noise vector $z$ to a candidate image: $x = f_\theta(z)$. Reconstruction proceeds via
$$\theta^* = \arg\min_\theta \, \|A f_\theta(z) - y\|_2^2,$$
with the final estimate $\hat{x} = f_{\theta^*}(z)$. However, this setup is prone to overfitting, especially with high-dimensional $x$ and limited measurements, and it fails to match in 3D the state-of-the-art reconstruction quality that DIP attains in 2D settings (Bell et al., 3 Dec 2025).
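As an illustration of this baseline objective, the sketch below runs the DIP optimization on a toy problem. A linear map `W` stands in for the CNN $f_\theta$, and all sizes, the learning rate, and the step count are illustrative assumptions, not the paper's setup:

```python
import numpy as np

# Toy stand-in for baseline DIP: a linear map W plays the role of f_theta,
# z is a fixed random input, and A is an underdetermined forward operator
# (m << n, mimicking sparse-view CT).
rng = np.random.default_rng(0)
n, m = 64, 16
A = rng.standard_normal((m, n)) / np.sqrt(m)   # forward operator
x_true = rng.standard_normal(n)                # unknown "volume"
y = A @ x_true                                 # noiseless measurements

z = rng.standard_normal(n)
z /= np.linalg.norm(z)                         # fixed unit-norm input
W = np.zeros((n, n))                           # "network" parameters theta

lr = 0.05
losses = []
for _ in range(500):
    r = A @ (W @ z) - y                        # data-fidelity residual
    # Gradient of ||A W z - y||_2^2 w.r.t. W is 2 A^T r z^T
    W -= lr * 2.0 * np.outer(A.T @ r, z)
    losses.append(float(r @ r))

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.2e}")
```

With $m < n$ the measurements can be fit exactly, so the data-fidelity loss is driven toward zero; the real 3D U-Net setting behaves analogously but with far more parameters, which is precisely why regularization (next section) becomes necessary.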
2. Loss Function and Regularization
Tada-DIP enhances the DIP loss by introducing two key modifications:
- Generalized data-fidelity: the loss uses an $\ell_p$ data term $\|A f_\theta(z) - y\|_p^p$ with $p \in \{1, 2\}$, with $p = 1$ empirically preferred for artifact suppression.
- Denoising regularization: a “denoising” penalty encourages the network to map noisy inputs onto stable outputs, mitigating overfitting.
The iterative training objective at step $t$ is
$$\theta_{t+1} = \arg\min_\theta \, \|A f_\theta(z_t) - y\|_p^p + \lambda \, \|f_\theta(z_t + \eta_t) - f_\theta(z_t)\|_2^2,$$
where
- $z_t$ is the input at iteration $t$,
- $\eta_t$ is injected input noise,
- $\lambda$ is a tunable regularization parameter.
This regularization counteracts the noise-fitting behavior of vanilla DIP, instead biasing reconstructions toward stable, denoised solutions (Bell et al., 3 Dec 2025).
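A minimal sketch of evaluating such an objective follows, assuming the denoising term penalizes the change in output under input noise (the exact regularizer form is an assumption, and the linear "network", sizes, and default weights are illustrative, not from the paper):

```python
import numpy as np

def tada_dip_loss(f, theta, z, eta, A, y, p=1, lam=0.1):
    """Evaluate a Tada-DIP-style objective: an ell_p data-fidelity term
    plus a denoising penalty on the output change under input noise.
    (Assumed form; defaults are illustrative.)"""
    x_clean = f(theta, z)            # f_theta(z_t)
    x_noisy = f(theta, z + eta)      # f_theta(z_t + eta_t)
    fidelity = float(np.sum(np.abs(A @ x_clean - y) ** p))
    denoise = float(np.sum((x_noisy - x_clean) ** 2))
    return fidelity + lam * denoise

# Toy check with a linear "network" f(theta, z) = theta @ z.
rng = np.random.default_rng(1)
A = rng.standard_normal((8, 32))
y = rng.standard_normal(8)
W = rng.standard_normal((32, 16))
z = rng.standard_normal(16)
eta = 0.05 * rng.standard_normal(16)
f = lambda theta, zz: theta @ zz

print(tada_dip_loss(f, W, z, eta, A, y, p=1, lam=0.1))
```

With zero injected noise the penalty vanishes and the objective reduces to the plain data-fidelity term, which is one way to sanity-check an implementation.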
3. Input Adaptation Mechanism
In contrast to traditional DIP, which fixes $z$, Tada-DIP implements an adaptive input scheme
$$z_{t+1} = (1 - \gamma)\, z_t + \gamma\, f_{\theta_t}(z_t),$$
where typically $\gamma \ll 1$. This exponential moving average update propagates reconstruction improvements into the latent code, stabilizing optimization trajectories and accelerating convergence. The input adaptation synergizes with denoising regularization to suppress noise amplification and “null-space” artifacts commonly plaguing DIP in severely ill-posed regimes (Bell et al., 3 Dec 2025).
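The update itself is a one-line EMA step; `gamma` below is an illustrative value, since the method only requires it to be small:

```python
import numpy as np

def adapt_input(z, x_hat, gamma=0.05):
    """EMA input-adaptation step: blend the current reconstruction x_hat
    back into the latent input z (gamma is an illustrative small value)."""
    return (1.0 - gamma) * z + gamma * x_hat

# Repeated application with a fixed reconstruction drives z toward it.
z = np.ones(4)
for _ in range(200):
    z = adapt_input(z, np.full(4, 2.0))
print(z)
```

Because each step moves `z` only a fraction `gamma` of the way, the latent code changes slowly relative to the network weights, which is what stabilizes the optimization trajectory.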
4. Algorithmic and Architectural Details
Tada-DIP employs a 3D U-Net as $f_\theta$, featuring:
- Five encoder–decoder levels
- Double 3×3×3 convolutional layers per level (ReLU activations)
- Channel widths doubling per spatial downsampling (32–512)
- 2×2×2 max-pooling for encoding, mirrored by transposed convolutions and skip connections in decoding
- Final 1×1×1 convolution with sigmoid output
- Omission of batch normalization to ensure broad applicability across datasets
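Under the assumption of a hypothetical $d_0^3$ input volume, the encoder's per-level spatial sizes and channel widths implied by the bullets above can be tabulated:

```python
def unet_shapes(d0=64, levels=5, base=32):
    """Per-level (spatial size, channels) of the encoder described above,
    assuming a hypothetical d0^3 input; 2x2x2 max-pooling halves the
    spatial size while the channel width doubles at each level."""
    shapes, d, c = [], d0, base
    for _ in range(levels):
        shapes.append((d, c))
        d //= 2
        c *= 2
    return shapes

print(unet_shapes())  # channel widths run 32 -> 512 across five levels
```

The decoder mirrors this schedule with transposed convolutions and skip connections, so the same table read bottom-up describes the upsampling path.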
The optimization runs for 30,000–50,000 Adam steps at a fixed learning rate. Input noise $\eta_t$ is drawn afresh at each iteration with a standard deviation $\sigma$ tuned per modality. The loss balances fidelity (to the measured data) against denoising, with the weight $\lambda$ set empirically. The final reconstruction is produced via an exponential moving average of the last several outputs (Bell et al., 3 Dec 2025).
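The output-averaging step can be sketched as a small accumulator; the decay `beta`, stream length, and noise level below are illustrative assumptions:

```python
import numpy as np

class OutputEMA:
    """Running exponential moving average of network outputs, used to form
    the final reconstruction (beta is an illustrative decay)."""
    def __init__(self, beta=0.9):
        self.beta = beta
        self.avg = None

    def update(self, x):
        x = np.asarray(x, dtype=float)
        if self.avg is None:
            self.avg = x.copy()
        else:
            self.avg = self.beta * self.avg + (1.0 - self.beta) * x
        return self.avg

# Averaging a stream of noisy outputs around a fixed image suppresses
# the per-iteration noise in the final estimate.
rng = np.random.default_rng(3)
target = rng.standard_normal(100)
ema = OutputEMA(beta=0.9)
for _ in range(500):
    ema.update(target + 0.1 * rng.standard_normal(100))
err = np.linalg.norm(ema.avg - target) / np.linalg.norm(target)
print(f"relative error of EMA reconstruction: {err:.4f}")
```

The steady-state variance of such an average is reduced by a factor of roughly $(1-\beta)/(1+\beta)$ relative to a single output, which is why averaging the last several iterates yields a cleaner final volume.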
| Method | Data-Fidelity Norm | Denoising Reg. | Input Adaptation | Network |
|---|---|---|---|---|
| Vanilla DIP | $\ell_2$ | None | None | 3D U-Net |
| Tada-DIP | $\ell_1$ or $\ell_2$ | Yes | Yes | 3D U-Net |
5. Quantitative and Qualitative Evaluation
Tada-DIP was evaluated on sparse-view volumetric CT, specifically the Mayo Clinic LDCT dataset (209 volumes). On held-out scans with 30-view acquisition:
- Filtered Backprojection (FBP): 29.09 dB PSNR / 0.623 SSIM
- TV-based ASD-POCS: 35.02 dB / 0.906
- Vanilla DIP (3D): 37.74 dB / 0.923
- Supervised 2D U-Net: 39.72 dB / 0.955
- Tada-DIP: 39.73 dB / 0.955
For only 15 views:
- FBP: 24.40 dB / 0.446
- TV: 30.58 dB / 0.809
- Vanilla DIP: 32.40 dB / 0.815
- Supervised U-Net: 35.66 dB / 0.918
- Tada-DIP: 35.63 dB / 0.906
Tada-DIP consistently surpasses all training-free baselines by ≥2 dB and demonstrates parity with supervised methods trained on thousands of 2D slices. Reconstructions are largely artifact-free and preserve fine anatomical structure, while training loss curves show no overfitting over 50,000 optimization steps, a divergence from usual DIP tendencies (Bell et al., 3 Dec 2025).
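The PSNR figures above follow directly from mean squared error; a minimal helper, with `data_range` assumed to be the peak intensity of the images being compared:

```python
import numpy as np

def psnr(x, ref, data_range=1.0):
    """Peak signal-to-noise ratio in dB against a reference image."""
    x = np.asarray(x, dtype=float)
    ref = np.asarray(ref, dtype=float)
    mse = np.mean((x - ref) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

# A uniform error of 0.1 on a [0, 1]-range image gives ~20 dB.
ref = np.zeros((4, 4))
print(psnr(ref + 0.1, ref))
```

On this scale, the ~2 dB gap over vanilla DIP at 30 views corresponds to roughly a 37% reduction in mean squared error.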
6. Failure Cases, Generalizations, and Practical Usage
Tada-DIP reconstructions avoid the typical failures of both FBP (streaking) and total variation (blockiness) as well as the overfitting and noise amplification of DIP under extended training. The integration of denoising regularization and adaptive inputs is central to its robustness. Algorithmic extensions and recommendations for other 3D inverse problems include:
- Substitute the forward operator $A$ as needed (e.g., Fourier sampling for MRI, attenuated Radon transform for PET)
- Adjust the input noise scale $\sigma$ and regularization weight $\lambda$ for new modalities
- Use a patch-based strategy for extremely high-resolution volumes to manage GPU memory
- For multi-coil MRI setups, initialize the input to coil-combined zero-filled images
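The patch-based suggestion can be sketched as a simple non-overlapping tiling (a real implementation would likely overlap and blend patches to avoid seams; sizes here are illustrative):

```python
import numpy as np

def iter_patches(shape, patch=32):
    """Yield slice-tuples tiling a 3D volume of the given shape:
    a non-overlapping sketch of the patch-based memory strategy."""
    for i in range(0, shape[0], patch):
        for j in range(0, shape[1], patch):
            for k in range(0, shape[2], patch):
                yield (slice(i, i + patch),
                       slice(j, j + patch),
                       slice(k, k + patch))

# A 64^3 volume splits into 8 non-overlapping 32^3 patches.
tiles = list(iter_patches((64, 64, 64)))
print(len(tiles))
```

Each patch can then be reconstructed (or forward-projected) independently, bounding peak GPU memory by the patch size rather than the full volume.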
A plausible implication is that further integration of dataset-specific regularizers or learned forward operators could improve generalization beyond the settings explored in (Bell et al., 3 Dec 2025).
7. Representative Significance and Future Prospects
Tada-DIP exemplifies a new regime for zero-shot 3D image reconstruction in which the network architecture itself encodes a powerful prior, reinforced by explicit denoising pressure and a dynamic input strategy. The methodology achieves supervised-level reconstruction quality without access to external training data, setting a benchmark for the next generation of training-free approaches to volumetric inverse problems. Given its modular loss structure and architecture-agnostic formulation, Tada-DIP is readily extensible to MRI, PET, and beyond, contingent on appropriate adaptation of the forward operator $A$ and the hyperparameters. Its design defines the state of the art for training-free 3D inverse solvers (Bell et al., 3 Dec 2025).