Instantiation-Net: 3D Mesh Reconstruction from Single 2D Image for Right Ventricle

Published 16 Sep 2019 in eess.IV, cs.CV, cs.LG, and stat.ML | (1909.08986v1)

Abstract: 3D shape instantiation which reconstructs the 3D shape of a target from limited 2D images or projections is an emerging technique for surgical intervention. It improves the currently less-informative and insufficient 2D navigation schemes for robot-assisted Minimally Invasive Surgery (MIS) to 3D navigation. Previously, a general and registration-free framework was proposed for 3D shape instantiation based on Kernel Partial Least Square Regression (KPLSR), requiring manually segmented anatomical structures as the pre-requisite. Two hyper-parameters including the Gaussian width and component number also need to be carefully adjusted. Deep Convolutional Neural Network (DCNN) based framework has also been proposed to reconstruct a 3D point cloud from a single 2D image, with end-to-end and fully automatic learning. In this paper, an Instantiation-Net is proposed to reconstruct the 3D mesh of a target from its a single 2D image, by using DCNN to extract features from the 2D image and Graph Convolutional Network (GCN) to reconstruct the 3D mesh, and using Fully Connected (FC) layers to connect the DCNN to GCN. Detailed validation was performed to demonstrate the practical strength of the method and its potential clinical use.

Abstract PDF Upgrade to Chat

Citations (9)

View on Semantic Scholar

Summary

The paper introduces an end-to-end Instantiation-Net that reconstructs high-resolution 3D meshes of the right ventricle from a single 2D MRI image using a DCNN and GCN architecture.
It achieves competitive performance with a mean 3D error near 2.2 mm, eliminating manual segmentation and complex parameter tuning.
The study emphasizes real-time feasibilty for intra-operative use and suggests potential extensions to broader 2D-to-3D reconstructions.

Instantiation-Net: 3D Mesh Reconstruction from Single 2D Image for Right Ventricle

Introduction and Problem Statement

This work addresses the challenge of reconstructing detailed 3D cardiac anatomy from a single 2D projection, focusing on the right ventricle (RV) as captured by MRI. Traditional navigation in robot-assisted Minimally Invasive Surgery (MIS) relies on limited 2D images, complicating intra-operative 3D interpretation—especially for real-time dynamic navigation. Previous approaches, such as registration-based 3D-2D alignment and KPLSR-based shape instantiation, suffer from high computational cost, static modeling, and dependency on manual segmentation and parameter tuning. Recently, deep models with end-to-end learning for point cloud recovery were introduced, but lacked mesh connectivity, limiting their downstream clinical utility.

Instantiation-Net Architecture

This paper proposes Instantiation-Net, a fully automatic end-to-end system mapping a single 2D MRI image to a high-resolution, anatomically accurate 3D mesh suitable for surgical guidance. The framework integrates:

DCNN feature extraction: A DenseNet-121 backbone (pre-trained on ImageNet) processes the 2D image and outputs a compact deep feature representation.
Fully Connected transition: FC layers project and transform the DCNN features for mesh decoding.
Graph Convolutional Network (GCN) mesh decoder: Four GCN layers perform hierarchical up-sampling and mesh vertex regression, preserving topological consistency and anatomical detail.

The workflow is visualized in (Figure 1).

Figure 1: Intuitive block-wise architecture of Instantiation-Net, depicting DCNN, FC, and GCN components for end-to-end mesh reconstruction.

The structure enables learning both appearance-based features from images and graph-structured spatial transformations for mesh generation. Chebyshev polynomial-based GCNs enable computationally efficient spectral filtering on mesh topologies.

Experimental Design

Data Acquisition

The dataset comprises short-axis and long-axis MRI scans from 27 subjects (18 normal, 9 HCM), each sampled at 19–25 cardiac phases, yielding 609 paired 2D images and 3D meshes. Ground-truth meshes are obtained via manual semi-automated segmentation and meshing routines (e.g., Meshlab), ensuring vertex-level correspondence and consistent connectivity.

Training Protocol

The framework is evaluated using strict patient-specific leave-one-out cross-validation. The loss is defined as mean per-vertex L1 error, and optimization employs SGD with momentum and decaying learning rate. Inference time is minimized to approximately 0.5 seconds per case, fulfilling practical requirements for intra-operative deployment, although total training time and resource usage are higher than less expressive models.

Results

Vertex-wise Error Analysis

Vertex-level reconstruction errors are consistently distributed, with slightly higher error at the mesh apex (base of RV), attribut-able to topological sparsity and limited observational data for these regions in MR slices.

Figure 2: Vertex-level spatial error maps for four mesh reconstructions; intensity encodes mm error, highlighting spatially homogeneous performance except at mesh extremities.

Temporal Error Robustness

Frame-wise analysis over 12 patients demonstrates that Instantiation-Net maintains mean errors near 2 mm throughout the cardiac cycle. Outlier time-points (e.g., end-systole/diastole) exhibit higher error due to underrepresentation in training and boundary effects in temporal dynamics.

Figure 3: Mean 3D distance error per cardiac frame across 12 subjects, demonstrating stability and highlighting frames affected by "boundary" phenomena.

Comparison with Baseline Methods

Instantiation-Net is directly compared with previous KPLSR and PLSR frameworks. On the full patient cohort, mean 3D error for Instantiation-Net is 2.21 mm, outperforming PLSR (2.38 mm) and closely approaching the performance of KPLSR (2.01 mm). The model yields consistent results across all subjects—without need for hand-crafted features or manual segmentation at inference, unlike the baselines.

Figure 4: Subject-wise mean 3D mesh reconstruction error for Instantiation-Net, PLSR, and KPLSR approaches; Instantiation-Net exhibits competitive accuracy and lower variance.

Discussion

The findings underscore several practical and theoretical implications:

End-to-end trainability and automation: Instantiation-Net eliminates dependency on manual segmentation and parameter tuning, streamlining deployment and reducing operator bias.
High-resolution mesh output: In contrast to models limited to unstructured point clouds or low-res volumes, the output is a topologically consistent mesh with vertex correspondence—vital for intra-operative navigation and simulation.
Computational/Resource Trade-offs: The main limitation lies in training resource demands (GPU memory and runtime), a trade-off for higher model expressivity and automation. However, inference speed is compatible with intra-operative requirements.
Boundary and sparsity effects: Performance is attenuated in regions with limited imaging support (apical/basal mesh), necessitating either more comprehensive imaging or auxiliary regularization in these regimes.
Generalizability: While validated on right ventricle MRI, the DCNN+GCN architecture is potentially extensible to 3D reconstruction tasks from limited 2D data in other anatomical or even non-medical settings—where mesh correspondence is crucial.

Future Directions

Key extensions include multi-view/multi-modal integration, domain adaptation for robustness across scanner protocols, and scaling to whole-organ reconstructions in complex anatomical regions. Incorporation of temporal priors or shape constraints may further ameliorate boundary effects and enable temporally consistent mesh sequences.

Conclusion

Instantiation-Net advances the state of the art for 3D mesh instantiation from minimal 2D imaging in cardiac applications, merging DCNN-based image feature extraction with GCN-based mesh decoding. Yielding mean errors on par with prior semi-automatic methods—and achieving full end-to-end differentiability and automation—this architecture facilitates accurate, efficient, and robust 3D reconstruction for clinical navigation and surgical planning. Its design paradigm is likely generalizable to broader 2D-to-3D instantiation challenges within medical AI and beyond.

Markdown Report Issue