USFetal Compounding Toolbox
- USFetal Compounding Toolbox is a modular, open-source suite of algorithms designed for multi-view ultrasound compounding in fetal brain imaging.
- It integrates multi-scale, transformation-based, variational, and deep learning approaches to fuse 3D ultrasound volumes and suppress view-dependent artifacts.
- The toolbox standardizes preprocessing, evaluation, and benchmarking protocols, offering reproducible workflows for robust anatomical reconstruction.
The USFetal Compounding Toolbox is a modular, open-source suite of algorithms and utilities for multi-view ultrasound compounding, designed specifically for fetal brain imaging. It implements and benchmarks a comprehensive set of computational strategies—including multi-scale, transformation-based, variational, and deep learning approaches—to solve the core challenge of integrating multiple, registered 3D ultrasound volumes into a single, artifact-suppressed, anatomically faithful volumetric representation. This toolbox is motivated by the inherent limitations of ultrasound, such as view-dependent artifacts and restricted field of view, and by the absence of artifact-free ground truth data for supervised learning. It supports both classical image fusion and modern self-supervised and unsupervised deep learning paradigms and exposes a standardized workflow and extensible evaluation protocol (Khateri et al., 11 Jan 2026).
1. Algorithmic Strategies for Compounding
The toolbox encapsulates four conceptually distinct families of compounding algorithms, each defined by its mathematical formulation, data fidelity and feature consistency objectives, and regularization mechanisms:
- Multi-scale Fusion: Each input volume is decomposed into spatial frequency bands using Difference-of-Gaussians (DoG) or Laplacian/Gaussian pyramids. Fusion is performed bandwise via averaging or maximum selection, followed by reconstruction. This approach emphasizes local edge information but is susceptible to amplifying noise and does not guarantee global anatomical consistency. The default configuration uses three detail bands and a band gain of $3.0$.
- Transformation-based Fusion: At every voxel, the registered intensities across views form a vector. Principal Component Analysis (PCA) is applied to decorrelate and concentrate shared information, which is then fused (typically by retaining the top components that explain most of the variance) and reconstructed. This method is globally efficient but linear and can attenuate fine structure.
- Variational Fusion: Compounding is posed as an energy minimization problem, seeking a compound $u$ that balances voxel-level data fidelity, feature-level consistency (using a differentiable operator $\Phi$, such as DoG), and a regularization term (total variation or sparsity):

$$E(u) = \lambda \sum_{i=1}^{N} \| u - v_i \|_2^2 + \mu \sum_{i=1}^{N} \| \Phi(u) - \Phi(v_i) \|_2^2 + \gamma\,\mathrm{TV}(u),$$

where $v_1, \dots, v_N$ are the registered input volumes. The weights $\lambda$, $\mu$, $\gamma$ are exposed via the configuration, with optimization performed by AdamW over 100 iterations.
- Deep Learning Approaches:
The toolbox implements two unsupervised/self-supervised approaches:
  - Self-Supervised Multi-view U-Net: A multi-view 3D U-Net, sharing encoder weights across inputs, uses Restormer-style fusion (Mixture-of-Experts with gating and self-attention) and a DoG edge-extractor skip connection. The network is trained to minimize the sum of voxelwise differences and DoG-based feature consistency across all registered views.
  - Unsupervised Plug-and-Play (PnP) Priors: Each cross-sectional slice of the voxelwise mean is regularized using a pretrained 2D Gaussian DRUNet denoiser, iteratively alternating between fidelity and denoising steps across the three anatomical planes. No supervised learning is required.
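A minimal sketch of the multi-scale family above, assuming NumPy/SciPy only; the function name, scale values, and fusion rule shown here are illustrative, not the toolbox's actual API or defaults (only the three-band count and band gain of $3.0$ come from the text):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_fuse(volumes, sigmas=(1.0, 2.0, 4.0), band_gain=3.0):
    """Fuse registered volumes via a Difference-of-Gaussians decomposition.

    Each volume is split into len(sigmas) detail bands plus a low-pass
    residual; detail bands are fused by maximum-magnitude selection and
    amplified by `band_gain`, the residuals by averaging. The sigma values
    are hypothetical stand-ins for the toolbox's configured scales.
    """
    fused_bands = None
    residuals = []
    for vol in volumes:
        prev = vol.astype(np.float64)
        bands = []
        for s in sigmas:
            smoothed = gaussian_filter(prev, sigma=s)
            bands.append(prev - smoothed)   # detail band at this scale
            prev = smoothed
        residuals.append(prev)              # final low-pass residual
        if fused_bands is None:
            fused_bands = bands
        else:
            # max-magnitude selection per voxel, per band
            fused_bands = [np.where(np.abs(b) > np.abs(f), b, f)
                           for f, b in zip(fused_bands, bands)]
    low = np.mean(residuals, axis=0)        # average the low-pass residuals
    return low + band_gain * np.sum(fused_bands, axis=0)

# toy usage: two random 3D "views"
rng = np.random.default_rng(0)
v1, v2 = rng.random((16, 16, 16)), rng.random((16, 16, 16))
fused = dog_fuse([v1, v2])
```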
These strategies are benchmarked and can be selected flexibly for specific imaging scenarios (Khateri et al., 11 Jan 2026).
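The transformation-based (PCA) strategy admits an equally compact sketch; the function name and the fixed `n_keep` component count are hypothetical simplifications of the toolbox's variance-based component selection:

```python
import numpy as np

def pca_fuse(volumes, n_keep=1):
    """PCA-based fusion sketch: decorrelate voxelwise intensity vectors.

    Stacks N registered views into an (n_voxels, N) matrix, projects onto
    the top `n_keep` principal components, reconstructs, and averages the
    reconstructed views. A fixed `n_keep` stands in for retaining the
    components that explain most of the variance.
    """
    X = np.stack([v.ravel() for v in volumes], axis=1)   # (n_voxels, N)
    mean = X.mean(axis=0, keepdims=True)
    Xc = X - mean
    # eigen-decomposition of the small N x N covariance gives the directions
    cov = Xc.T @ Xc / (Xc.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    top = eigvecs[:, -n_keep:]                           # top components
    Xrec = Xc @ top @ top.T + mean                       # project + reconstruct
    return Xrec.mean(axis=1).reshape(volumes[0].shape)

rng = np.random.default_rng(5)
views = [rng.random((8, 8, 8)) for _ in range(4)]
pca_result = pca_fuse(views)
```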
2. Mathematical Foundations and Loss Formulations
Precise mathematical formulations underpin all compounding modes:
- Variational Objective: Formally combines voxelwise fidelity, feature-level consistency, and regularization (typically total variation), allowing direct encoding of prior anatomical information or smoothness assumptions.
- Self-Supervised Loss (SSL): For a set of registered inputs $v_1, \dots, v_N$, the compounder output $\hat{u}$ is optimized via

$$\mathcal{L}_{\mathrm{SSL}} = \sum_{i=1}^{N} \| \hat{u} - v_i \| + \lambda \sum_{i=1}^{N} \| \Phi(\hat{u}) - \Phi(v_i) \|,$$

with $\lambda$ weighting the feature-consistency term and $\Phi$ chosen as a DoG operator.
- PnP Plug-and-Play Scheme: Iteratively alternates a fidelity step (blending with the slice-wise mean) and Gaussian denoising via DRUNet, applied slice-by-slice along all anatomical planes and aggregated.
- Multi-scale Pyramid Compounding (Hung & Galeotti): Utilizes Laplacian and Gaussian pyramids for multi-level, per-pixel fusion, guided by intensity and structural confidence maps, maximizing boundary contrast while suppressing artifacts (Hung et al., 2020).
These objectives enable unsupervised learning and artifact suppression in the absence of a clean reference compound.
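The PnP scheme above can be illustrated with a short NumPy sketch. The pretrained 2D DRUNet denoiser is replaced here by a slice-wise Gaussian filter as a stand-in, and the blend weight, iteration count, and smoothing sigma are illustrative assumptions rather than the toolbox's settings:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def pnp_compound(volumes, iters=10, blend=0.5, sigma=1.0):
    """Plug-and-play style compounding sketch.

    Alternates a fidelity step (blending the estimate with the voxelwise
    mean of the registered views) and a denoising step applied slice-wise
    along each of the three anatomical axes, then aggregates. A Gaussian
    filter stands in for the DRUNet denoiser described in the text.
    """
    mean_vol = np.mean(volumes, axis=0)
    u = mean_vol.copy()
    for _ in range(iters):
        u = (1 - blend) * u + blend * mean_vol          # fidelity step
        denoised_axes = []
        for axis in range(3):                            # three planes
            sig = [sigma] * 3
            sig[axis] = 0.0                              # 2D slice-wise smoothing
            denoised_axes.append(gaussian_filter(u, sigma=sig))
        u = np.mean(denoised_axes, axis=0)               # aggregate planes
    return u

rng = np.random.default_rng(1)
views = [rng.random((12, 12, 12)) for _ in range(3)]
compound = pnp_compound(views)
```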
3. Detailed Workflow and Implementation
All methods require as input a set of $2$–$8$ pre-aligned, brain-masked 3D ultrasound volumes of identical shape, optionally with associated confidence maps or edge masks.
- Pre-processing:
- Rigid/non-rigid registration to a common template (MRI or ultrasound).
- Brain masking.
- Optional intensity normalization (zero-mean, unit variance).
- Compounding Pipeline:
- Multi-scale/Pyramid-based: Decompose (DoG or Laplacian/Gaussian), fuse bands, reconstruct.
- Variational: Initialize, then minimize energy functional via AdamW.
- Deep learning (SSL): Sliding-window inference over 3D patches (overlap of 48 voxels), batch size 6, 3D U-Net inference with patch-wise tiling; training up to 100 epochs.
- PnP: 10 iterative fidelity-denoise cycles per anatomical plane.
- Post-processing:
- Intensity rescaling.
- Optional median filtering.
- Evaluation using mutual information (MI), SSIM, entropy, and correlation coefficient (CC).
- Runtime:
- Multi-scale/PCA: on the order of seconds per subject.
- Variational: 1–2 min (100 iters).
- PnP: 1 min.
- SSL inference: 30 s; training: 1–2 h.
- All methods are patch-parallelizable and CPU/GPU capable; GPU inference is roughly 3–5× faster.
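The patch-wise tiling used throughout the pipeline can be sketched as follows. Only the 48-voxel overlap comes from the text; the patch size of 64 and the uniform overlap-averaging are illustrative assumptions:

```python
import numpy as np

def tile_starts(length, patch, overlap):
    """Start indices for a 1D sliding window covering [0, length)."""
    step = patch - overlap
    starts = list(range(0, max(length - patch, 0) + 1, step))
    if starts[-1] + patch < length:          # ensure the end is covered
        starts.append(length - patch)
    return starts

def tiled_apply(volume, fn, patch=64, overlap=48):
    """Apply `fn` patch-wise with overlap and average overlapping outputs.

    `patch=64` is an illustrative window size; the toolbox's actual
    window is configurable.
    """
    out = np.zeros_like(volume, dtype=np.float64)
    weight = np.zeros_like(volume, dtype=np.float64)
    zs = tile_starts(volume.shape[0], patch, overlap)
    ys = tile_starts(volume.shape[1], patch, overlap)
    xs = tile_starts(volume.shape[2], patch, overlap)
    for z in zs:
        for y in ys:
            for x in xs:
                sl = (slice(z, z + patch), slice(y, y + patch), slice(x, x + patch))
                out[sl] += fn(volume[sl])
                weight[sl] += 1.0
    return out / weight

vol = np.random.default_rng(2).random((80, 80, 80))
tiled = tiled_apply(vol, lambda p: p)   # identity "network" for testing
```

With the identity function in place of the network, overlap-averaging reconstructs the input exactly, which makes the tiling logic easy to verify.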
4. Artifact Suppression and Boundary Preservation
Artifact reduction and preservation of anatomical boundaries are central design criteria:
- Hung & Galeotti’s PyramidFuse Module:
- Constructs Gaussian and Laplacian pyramids per view with associated intensity and structural confidence maps.
- At each pyramid level, fusion employs local contrast and confidence-driven selection (using a threshold to discriminate artifact-suspect regions). A mid-level-specific Gaussian weighting function determines the blend between maximum-contrast and confidence-weighted average pyramids.
- Boundary enhancement at a designated pyramid level leverages binary masks of “good” anatomical boundaries, refined through horizontal edge detection, clustering, and flood-fill, to enforce high-fidelity boundary reconstruction.
- Final reconstruction sums expanded levels of the fused Laplacian pyramid.
- Quantitative evaluation demonstrates improved artifact suppression (lower AMR/AVR in artifact patches) and enhanced boundary clarity (higher boundary AVR, Dice coefficient) compared to both classical and prior state-of-the-art compounding (Hung et al., 2020).
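The pyramid scaffolding underlying PyramidFuse can be sketched in 2D. The confidence maps, contrast-driven selection, and boundary masks described above are omitted; only the decomposition/reconstruction machinery is shown, with hypothetical function names and defaults:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def build_pyramids(img, levels=3):
    """Gaussian and Laplacian pyramids for one view (2D sketch)."""
    gauss = [img.astype(np.float64)]
    for _ in range(levels - 1):
        blurred = gaussian_filter(gauss[-1], sigma=1.0)
        gauss.append(blurred[::2, ::2])              # reduce: blur + downsample
    lap = []
    for lvl in range(levels - 1):
        up = zoom(gauss[lvl + 1], 2, order=1)[:gauss[lvl].shape[0],
                                              :gauss[lvl].shape[1]]
        lap.append(gauss[lvl] - up)                  # band-pass detail level
    lap.append(gauss[-1])                            # coarsest level kept as-is
    return gauss, lap

def reconstruct(lap):
    """Sum expanded levels of a (fused) Laplacian pyramid."""
    out = lap[-1]
    for lvl in range(len(lap) - 2, -1, -1):
        up = zoom(out, 2, order=1)[:lap[lvl].shape[0], :lap[lvl].shape[1]]
        out = up + lap[lvl]
    return out

img = np.random.default_rng(3).random((32, 32))
_, lap = build_pyramids(img)
rec = reconstruct(img_lap := lap)
```

Because the same upsampling is used in both directions, reconstructing the unfused pyramid recovers the input, so the fusion step can be validated in isolation.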
5. Usage Protocols, Practical Guidance, and Toolbox Design
Data Preparation and Method Selection
- Inputs must be brain-masked, registered volumes in NIfTI or NumPy format.
- Consistent gain and depth settings across views and a sufficient angular difference between acquisition orientations are recommended.
- Parameter regimes for each method are exposed via a single configuration file.
Extensibility and Modularity
- The toolbox is organized as independent Python modules per compounding strategy:
`multi_scale/`, `transformation/`, `variational/`, `dl/pnp/`, `dl/ssl/`, each with a standard `compound()` API.
- A YAML/JSON configuration scheme and command-line driver script facilitate reproducible experimentation and easy integration of new approaches.
- Evaluation scripts enable standardized quantitative and expert qualitative assessment.
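The per-strategy `compound()` API might look as follows. The module names above come from the toolbox layout, but this interface, its class names, and the configuration keys are hypothetical illustrations, not the published code:

```python
import numpy as np

class Compounder:
    """Hypothetical base interface mirroring the per-strategy compound() API.

    Each strategy module would subclass this and implement compound();
    the config dict stands in for the toolbox's YAML/JSON configuration.
    """
    def __init__(self, config=None):
        self.config = dict(config or {})

    def compound(self, volumes):
        raise NotImplementedError

class MeanCompounder(Compounder):
    """Trivial strategy: voxelwise mean, the common baseline."""
    def compound(self, volumes):
        return np.mean(volumes, axis=0)

# a YAML-like dict standing in for the single configuration file
cfg = {"method": "mean", "normalize": False}
result = MeanCompounder(cfg).compound(
    [np.ones((4, 4, 4)), 3 * np.ones((4, 4, 4))]
)
```

Custom algorithms would then plug in by subclassing `Compounder` and registering their configuration block, in line with the extensibility goals described above.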
Runtime and Hardware
- Designed for operation on single-GPU laptops, with fallback CPU support.
- Memory requirements are modest for classical methods; deep learning methods require standard modern GPU memory.
6. Evaluation Metrics and Expert Scoring
- Quantitative Image Quality:
- Mutual Information (MI), SSIM, entropy, and correlation coefficient (CC) computed between compound and each input, then averaged.
- Specialized artifact and boundary metrics: AMR, AVR for artifact and boundary patches (as defined in (Hung et al., 2020)).
- Dice coefficient for segmentation overlap in vessel patches.
- Expert Assessment:
- Three-point ordinal scoring (good/fair/poor) across key neuroanatomical structures and motion artifacts.
- Within-subject ranking of methods.
These multi-faceted metrics support comprehensive benchmarking for algorithm development and clinical validation.
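The reference-free metrics above follow standard histogram-based definitions; the sketch below gives those definitions in NumPy, with the caveat that bin counts and implementation details are illustrative rather than the toolbox's exact choices:

```python
import numpy as np

def correlation_coefficient(a, b):
    """Pearson correlation coefficient (CC) between two volumes."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

def shannon_entropy(vol, bins=64):
    """Histogram-based Shannon entropy in bits."""
    hist, _ = np.histogram(vol, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(a, b, bins=64):
    """Histogram-based MI between compound and one input, in bits."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)      # marginal of a
    py = pxy.sum(axis=0, keepdims=True)      # marginal of b
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(4)
v = rng.random((16, 16, 16))
cc_self = correlation_coefficient(v, v)      # identical volumes: CC of 1
```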
7. Integration and Extension Considerations
- Each strategy is implemented as a modular Python component with consistent I/O and configuration handling.
- Custom algorithms can be added by subclassing the existing interface structures.
- The open-source codebase includes scripts for tiling/patching, data I/O, preprocessing, and evaluation, facilitating adoption and further method development.
- This modularity and openness are intended to support ongoing benchmarking and innovation in fetal ultrasound reconstruction (Khateri et al., 11 Jan 2026).
References:
- "USFetal: Tools for Fetal Brain Ultrasound Compounding" (Khateri et al., 11 Jan 2026)
- "Good and Bad Boundaries in Ultrasound Compounding: Preserving Anatomic Boundaries While Suppressing Artifacts" (Hung et al., 2020)