3D-SPACE-MRC & 3D-REAL-IR for Inner Ear Imaging
- The paper demonstrates that 3D-SPACE-MRC and 3D-REAL-IR achieve high-resolution volumetric imaging of inner ear fluid compartments through an automated deep learning pipeline.
- It details how complementary contrasts and precise segmentation metrics, including Dice scores and ELR values, validate these protocols against clinical standards.
- This approach minimizes manual intervention and supports recalibration of clinical thresholds with physiologically plausible, reproducible volumetric measurements.
3D-SPACE-MRC (Sampling Perfection with Application optimized Contrasts using different flip angle Evolutions – Magnetic Resonance Cisternography) and 3D-REAL-IR (3D-Relaxation-Enhanced Anatomical Layer Imaging - Inversion Recovery) are high-resolution MRI modalities optimized for non-invasive volumetric visualization and quantification of fluid compartments within the inner ear, most prominently in the context of endolymphatic hydrops (EH). These sequences serve as the backbone of automated deep learning pipelines such as OREHAS, enabling reliable measurement of clinically meaningful metrics, most notably the endolymphatic-to-vestibular volume ratio (ELR), directly from routine whole-volume MRI acquisitions with minimal manual intervention (Fuster-Barceló et al., 26 Jan 2026).
1. Imaging Protocols and Technical Specifications
3D-SPACE-MRC and 3D-REAL-IR MRI are 3D acquisition protocols that provide complementary contrasts for resolving the vestibular and endolymphatic compartments, respectively. In the clinical implementation as referenced in OREHAS, SPACE-MRC volumes use an isotropic voxel size of mm³, while REAL-IR volumes are acquired with mm³ resolution. These volumetric acquisitions cover the full inner ear and facilitate slice selection, precise segmentation, and quantification workflows. No geometric resampling is required post-acquisition for volumetric analysis due to the isotropic or nearly isotropic voxel dimensions.
2. Clinical Motivation and the ELR Metric
The central diagnostic parameter derived from these imaging modalities is the ELR, defined as the ratio (in percent) of the volume of endolymphatic fluid to the volume of the vestibular cavity after 3D segmentation:
$$\mathrm{ELR} = \frac{V_{\mathrm{endo}}}{V_{\mathrm{vest}}} \times 100\%$$
with
$$V_{s} = N_{s} \cdot v_{\mathrm{voxel}}, \quad s \in \{\mathrm{endo}, \mathrm{vest}\}$$
where $N_{s}$ is the number of voxels in the corresponding segmentation mask and $v_{\mathrm{voxel}}$ is the physical voxel size (in mm³). Volumes are converted to cm³ by dividing by 1000; no further normalization is used. Clinical thresholds established in the literature (derived from syngo.via software) treat ELR > 60% as “significant hydrops” and ELR < 30% as “non-significant hydrops.” OREHAS-generated ELR values cluster lower (mean ~20–25%) and never exceed 100%, which keeps them within the physiologically possible range and suggests that clinical cut-offs should be recalibrated to the 30–35% range when using these pipelines (Fuster-Barceló et al., 26 Jan 2026).
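The ELR definition above can be illustrated with a minimal NumPy sketch. The function names, the toy masks, and the voxel spacings are hypothetical and chosen only for the example; they are not the protocol values or the OREHAS implementation.

```python
import numpy as np

def mask_volume_mm3(mask, voxel_size_mm):
    """Volume of a binary mask: voxel count times physical voxel volume (mm^3)."""
    voxel_volume = float(np.prod(voxel_size_mm))
    return int(np.count_nonzero(mask)) * voxel_volume

def elr_percent(endolymph_mm3, vestibule_mm3):
    """Endolymphatic-to-vestibular volume ratio in percent."""
    if vestibule_mm3 <= 0:
        raise ValueError("vestibular volume must be positive")
    return 100.0 * endolymph_mm3 / vestibule_mm3

# Toy masks standing in for the SPACE (vestibule) and REAL (endolymph) segmentations.
vest_mask = np.zeros((10, 10, 10), dtype=bool)
vest_mask[2:8, 2:8, 2:8] = True      # 216 voxels
endo_mask = np.zeros((10, 10, 10), dtype=bool)
endo_mask[3:6, 3:6, 3:6] = True      # 27 voxels

# Hypothetical voxel spacings in mm; each sequence uses its own spacing.
space_voxel = (0.5, 0.5, 0.5)
real_voxel = (0.5, 0.5, 1.0)

v_vest = mask_volume_mm3(vest_mask, space_voxel)   # 27.0 mm^3
v_endo = mask_volume_mm3(endo_mask, real_voxel)    # 6.75 mm^3
elr = elr_percent(v_endo, v_vest)                  # 25.0 %
v_vest_cm3 = v_vest / 1000.0                       # mm^3 to cm^3
```

Because the two sequences are acquired at different resolutions, each mask is converted to a physical volume with its own voxel size before the ratio is taken.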
3. Workflow: OREHAS Deep Learning Pipeline
The OREHAS framework operationalizes end-to-end volumetric quantification from routine 3D-SPACE-MRC and 3D-REAL-IR MRI in three modular stages:
- Slice Classification (EarGate): A CNN (either a custom 5-layer architecture or a ResNet-50 fine-tuned from ImageNet weights) discards slices that contain no inner-ear anatomy, so that later stages process only relevant slices. Training uses patient-wise five-fold cross-validation, early stopping, and learning-rate reduction on plateau. Post-processing enforces a single continuous region per ear. Slice-level accuracy reaches 93–96% (F1=0.85 for SPACE, F1≈0.58 for REAL).
- Localization (AuriBox): A YOLOv5su-based object detector identifies left and right vestibular compartments, guided by clinician-annotated cochlear centroids and fixed bounding-boxes. Training utilizes SGD with extensive data augmentation (Mosaic, random perspective), yielding spatial accuracy of 100%/98.7% (SPACE) and 100%/98.9% (REAL) at IoU≥0.5.
- Sequence-Specific Segmentation (EHMasker): A 2D U-Net (encoder-decoder, skip connections, GroupNorm, LeakyReLU, and Kaiming initialization) is trained on 96×96 pixel patches, centered on detected ROIs, with only 3–6 expert-annotated slices per ear required. The optimal loss is a BCE+Dice hybrid, and threshold binarization is set to (SPACE) and (REAL). Post-processing stacks slices into 3D masks, removes small components, and retains the largest structure.
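A hybrid BCE+Dice loss of the kind named for EHMasker can be sketched in plain NumPy as follows. The equal weighting (`bce_weight=0.5`) and the smoothing constant `eps` are assumptions made for illustration; the source does not state how the two terms are combined.

```python
import numpy as np

def bce_dice_loss(prob, target, bce_weight=0.5, eps=1e-7):
    """Weighted sum of binary cross-entropy and (1 - soft Dice).

    prob:   predicted foreground probabilities in [0, 1]
    target: binary ground-truth mask of the same shape
    bce_weight and eps are illustrative assumptions, not values from the source.
    """
    prob = np.clip(np.asarray(prob, dtype=np.float64), eps, 1.0 - eps)
    target = np.asarray(target, dtype=np.float64)

    # Pixel-wise binary cross-entropy, averaged over the patch.
    bce = -np.mean(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))

    # Soft Dice coefficient computed on probabilities.
    intersection = np.sum(prob * target)
    dice = (2.0 * intersection + eps) / (np.sum(prob) + np.sum(target) + eps)

    return bce_weight * bce + (1.0 - bce_weight) * (1.0 - dice)

target = np.array([[0, 1], [1, 0]])
loss_good = bce_dice_loss(target, target)       # near zero for a perfect prediction
loss_bad = bce_dice_loss(1 - target, target)    # large for an inverted prediction
```

The cross-entropy term penalizes every pixel individually, while the Dice term rewards overlap of the whole foreground region, which helps when the structure occupies only a small part of the 96×96 patch.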
The modular architecture allows components to be independently retrained or replaced as advances occur. The pipeline is compatible with open-source deployment strategies and is designed for methodological transparency and efficiency (2 min/patient on a single GPU).
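The segmentation post-processing described above (stacking slices into a 3D mask and retaining the largest structure) might look like the following sketch, which uses `scipy.ndimage.label` for connected-component analysis. The function name, the `min_voxels` parameter, and the threshold of 0.5 in the example are hypothetical; the sequence-specific thresholds are not reproduced here.

```python
import numpy as np
from scipy import ndimage

def stack_and_clean(prob_slices, threshold, min_voxels=0):
    """Stack 2D probability maps, binarize, and keep the largest 3D component.

    threshold must be supplied by the caller; min_voxels is an illustrative
    lower bound on the size of an acceptable structure.
    """
    volume = np.stack(prob_slices, axis=0) >= threshold   # shape: (slices, H, W)

    # Default structuring element gives face connectivity (6-neighbourhood in 3D).
    labels, n_components = ndimage.label(volume)
    if n_components == 0:
        return np.zeros(volume.shape, dtype=bool)

    sizes = np.bincount(labels.ravel())
    sizes[0] = 0                                           # ignore background
    largest = int(np.argmax(sizes))
    if sizes[largest] < min_voxels:
        return np.zeros(volume.shape, dtype=bool)
    return labels == largest

# Toy example: a 2x2 blob present in all three slices plus one stray pixel.
slices = [np.zeros((6, 6)) for _ in range(3)]
for s in slices:
    s[1:3, 1:3] = 0.9
slices[0][5, 5] = 0.8

mask = stack_and_clean(slices, threshold=0.5)
n_kept = int(mask.sum())                                   # 12 voxels in the blob
```

Keeping only the largest component also removes all smaller ones, so a separate small-component filter is needed only if a minimum size should be enforced on the surviving structure.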
4. Performance Metrics and Comparative Analysis
Quantitative performance for 3D-SPACE-MRC and 3D-REAL-IR in the OREHAS pipeline is established via:
- 2D Segmentation Overlap:
  - Dice (SPACE): 0.91 (best)
  - Dice (REAL): 0.75 (best)
  - IoU (SPACE): 0.83; IoU (REAL): 0.56
  - Recall (SPACE): 0.92; Recall (REAL): 0.74
- Volumetric Comparison (N=19 patients, mean ± std):
| Quantity (sequence) | syngo.via | OREHAS |
|---|---|---|
| Vestibule (SPACE) | 0.084 ± 0.013 cm³ | 0.084 ± 0.010 cm³ |
| Endolymph (REAL) | 0.043 ± 0.020 cm³ | 0.019 ± 0.008 cm³ |
| ELR (Right ear) | 46.0 % ± 22.2 % | 20.2 % ± 9.0 % |
| ELR (Left ear) | 57.6 % ± 26.1 % | 25.7 % ± 11.6 % |
- External Validation (N=5, Volumetric Similarity Index VSI):
| Structure / Modality | OREHAS prediction | syngo.via | Ground truth (GT) | VSI OREHAS | VSI syngo.via |
|---|---|---|---|---|---|
| Vestibule (SPACE, mm³) | 92.5 ± 16.9 | 87.0 ± 11.6 | 69.5 ± 9.7 | 86.1 % ± 4 % | 88.8 % ± 4 % |
| Endolymph (REAL, mm³) | 22.9 ± 18.4 | 51.4 ± 32.3 | 16.3 ± 12.6 | 74.3 % ± 22 % | 42.5 % ± 17 % |
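OREHAS outperforms the widely used syngo.via software in ELR quantification accuracy, particularly for endolymphatic volume (external-validation VSI 74.3% vs. 42.5%), whereas vestibular agreement is comparable between the two (86.1% vs. 88.8%). Syngo.via systematically overestimates the endolymphatic volume and at times yields physiologically impossible ELR values (>100%).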
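The overlap and volumetric metrics reported in this section can be computed as in the sketch below. The volumetric similarity formula used here, 1 − |V_a − V_b| / (V_a + V_b), is the common definition and is assumed rather than quoted from the source; function names are illustrative.

```python
import numpy as np

def overlap_metrics(pred, truth):
    """Dice, IoU and recall for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.count_nonzero(pred & truth)
    fp = np.count_nonzero(pred & ~truth)
    fn = np.count_nonzero(~pred & truth)
    union = tp + fp + fn
    return {
        "dice": 2.0 * tp / (2.0 * tp + fp + fn) if union else 1.0,
        "iou": tp / union if union else 1.0,
        "recall": tp / (tp + fn) if (tp + fn) else 1.0,
    }

def volumetric_similarity(v_a, v_b):
    """Assumed definition: 1 - |V_a - V_b| / (V_a + V_b)."""
    total = v_a + v_b
    return 1.0 if total == 0 else 1.0 - abs(v_a - v_b) / total

pred = np.array([[1, 1, 0], [0, 1, 0]])
truth = np.array([[1, 1, 0], [0, 0, 1]])
m = overlap_metrics(pred, truth)      # tp=2, fp=1, fn=1 -> dice 2/3, iou 1/2, recall 2/3

# Applied to the group-mean vestibular volumes from the external validation table.
vsi_vest = volumetric_similarity(92.5, 69.5)   # about 0.858
```

Applying the formula to group means only approximates the tabulated VSI, since averaging a per-patient metric generally differs from computing the metric on averaged volumes; for the vestibule the two happen to be close (about 85.8% versus the reported 86.1%).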
5. Data and Training Requirements
OREHAS achieves generalizable segmentation and robust ELR quantification with minimal annotation, requiring only 3–6 expertly labeled slices per ear (total training set: 437 SPACE, 409 REAL patches). The framework generalizes to full 3D volumes without additional retraining on unannotated data. Data-intensive manual workflows employing slice-by-slice annotation (∼8 slices/ear) and proprietary interpolation, as in syngo.via, are thus rendered unnecessary, providing speed and reproducibility advantages.
6. Practical Implications and Methodological Considerations
The ability to fully automate EH quantification from 3D-SPACE-MRC and 3D-REAL-IR MRI yields methodological consistency and reduced operator dependence. The robust, physiologically plausible volumetric outputs align with clinical practice and established imaging protocols. Recommendations derived from OREHAS findings include:
- Recalibration of clinical ELR thresholds when using deep-learning-based segmentation, due to systematically lower and more plausible ELR values.
- Modular pipeline design enabling targeted retraining as imaging protocols or segmentation architectures evolve.
- Compatibility with large-scale studies for establishing new diagnostic standards and thresholds, based on accurate volumetric measurements of the inner ear (Fuster-Barceló et al., 26 Jan 2026).
A plausible implication is that general adoption of automated, open-source segmentation and quantification from 3D-SPACE-MRC and 3D-REAL-IR may set new clinical baselines and facilitate large, cross-institutional analyses of inner ear fluid pathologies. Further multi-center validation and clinical threshold adjustment are areas of ongoing development.