3DTeethSeg'22 Challenge Overview
- The 3DTeethSeg'22 Challenge is a global event that benchmarks automated 3D dental scan analysis tasks including tooth localization, segmentation, and labeling.
- It leverages the extensive Teeth3DS+ dataset with 1,800 high-fidelity intraoral scans and detailed hybrid human–machine annotations for robust evaluation.
- State-of-the-art algorithms, evaluated with metrics like IoU, Dice, and localization accuracy, have advanced computer-aided dentistry and informed clinical translation.
The 3DTeethSeg'22 Challenge is a prominent international benchmarking event focused on the automatic analysis of intraoral 3D scans for dental applications, particularly targeting teeth localization, segmentation, and labeling. Initiated in conjunction with MICCAI 2022, it established a standardized dataset, rigorous evaluation protocols, and comparative benchmarks for data-driven advances in computer-aided dentistry, fostering progress in both technical methodology and clinical translation (Ben-Hamadou et al., 2022, Ben-Hamadou et al., 2023).
1. Dataset Construction and Properties
The challenge centers on Teeth3DS+, a large-scale public dataset designed to address a longstanding limitation in the field: the absence of high-quality, consistently annotated 3D dental scan resources. The dataset comprises 1,800 intraoral 3D scans collected from 900 anonymized patients, with each patient contributing two scans (upper and lower jaw). Data were acquired using industry-standard intraoral scanners: Primescan (Dentsply), Trios 3 (3Shape), and iTero Element 2 Plus. These devices deliver high fidelity (10–90 μm accuracy) and variable point densities (30–80 points/mm²). The cohort covers clinically realistic anatomical variability, a broad age distribution (70% under 16, 27% aged 16–59, 3% over 60), and a balanced gender split (50% male, 50% female) (Ben-Hamadou et al., 2022, Ben-Hamadou et al., 2023).
Annotation was executed via a hybrid human–machine approach, involving an eight-step pipeline:
- Mesh cleanup—elimination of degenerate faces and vertex duplicates.
- Pose normalization by PCA alignment to the occlusal plane.
- Cropping and harmonic parameterization for UV mapping of tooth regions.
- Manual boundary annotation on UV images by trained experts.
- Back-propagation of these 2D labels onto original 3D meshes.
- Assembly into a unified 3D coordinate frame for each jaw.
- Labeling using the FDI World Dental Federation code for permanent dentition (labels 11–48; with 0 assigned to gingiva).
- Clinical review and iterative correction by orthodontists or dental surgeons (>5 years’ experience), ensuring verification and quality control at every level.
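The pose-normalization step in the pipeline above can be illustrated with a minimal sketch. It assumes NumPy, treats the scan as a bare array of vertex positions, and assumes that the direction of least variance approximates the occlusal-plane normal. The function name `pca_align` is hypothetical, and this is not the organizers' implementation.

```python
import numpy as np

def pca_align(vertices):
    """Center the vertices and rotate them onto their principal axes.

    Assumption: the axis of least variance approximates the occlusal-plane
    normal, so it ends up along z after alignment.
    """
    v = np.asarray(vertices, dtype=float)
    centered = v - v.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    # Flip one axis if needed so the transform is a rotation, not a reflection.
    if np.linalg.det(vt) < 0:
        vt[2] *= -1
    return centered @ vt.T
```

PCA leaves the sign of each axis undetermined, so a real pipeline would likely need an extra rule (for example, based on the arch opening) to fix the orientation consistently across scans.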
Each scan is provided in OBJ mesh format (vertices and faces), supplemented with a JSON annotation file including both per-vertex “labels” (FDI code or 0) and “instances” (tooth instance IDs), with separate splits for training (1,200 scans; 16,004 teeth) and test (600 scans; 7,995 teeth) (Ben-Hamadou et al., 2022).
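As a minimal sketch of how one scan and its annotation might be read, the following uses only the Python standard library. It assumes triangular faces and a JSON object with top-level `labels` and `instances` arrays as described above; the function names and file paths are hypothetical, and any other fields in the real files are ignored.

```python
import json

def load_obj(path):
    """Read vertex positions and triangular faces from a Wavefront OBJ file."""
    vertices, faces = [], []
    with open(path) as fh:
        for line in fh:
            parts = line.split()
            if not parts:
                continue
            if parts[0] == "v":
                # Keep x, y, z only; extra per-vertex values are ignored.
                vertices.append(tuple(float(x) for x in parts[1:4]))
            elif parts[0] == "f":
                # OBJ indices are 1-based and may carry /vt/vn suffixes.
                faces.append(tuple(int(p.split("/")[0]) - 1 for p in parts[1:4]))
    return vertices, faces

def load_annotation(path, n_vertices):
    """Read the per-vertex 'labels' and 'instances' arrays and check their length."""
    with open(path) as fh:
        ann = json.load(fh)
    labels, instances = ann["labels"], ann["instances"]
    if len(labels) != n_vertices or len(instances) != n_vertices:
        raise ValueError("annotation length does not match vertex count")
    return labels, instances
```

The length check matters because both arrays are indexed by vertex: a mismatch usually means the annotation belongs to a different mesh or the mesh was resampled.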
2. Challenge Tasks and Definitions
The 3DTeethSeg'22 Challenge formalized three main subtasks:
- Tooth Localization: Detection of the 3D position (centroid) of each visible tooth.
- Tooth Segmentation: Assignment of each mesh vertex to a semantic class (32 tooth labels, following FDI notation, plus gingiva). This constitutes joint semantic and instance segmentation.
- Tooth Identification (Labeling): Assignment of the correct FDI label to each segmented tooth instance.
Auxiliary but non-scored objectives included gingiva boundary refinement, detection of dental landmarks (roots, cusps), and shape modeling from partial or noisy input data. All outputs were standardized as per-vertex annotations (labels/instances arrays), supporting direct comparison and downstream analysis (Ben-Hamadou et al., 2022, Ben-Hamadou et al., 2023).
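Because all outputs are per-vertex arrays, a tooth centroid for the localization task can be derived directly from them. The sketch below is one hedged way to do so: it assumes the centroid is the unweighted mean of a tooth's vertices (an area-weighted mean is an equally plausible choice) and takes the majority FDI label per instance. The function name is hypothetical.

```python
from collections import Counter, defaultdict

def instance_centroids(vertices, labels, instances):
    """Map each tooth instance id to (majority FDI label, centroid)."""
    points = defaultdict(list)
    votes = defaultdict(Counter)
    for xyz, lab, inst in zip(vertices, labels, instances):
        if lab == 0:  # label 0 marks gingiva in the annotation scheme
            continue
        points[inst].append(xyz)
        votes[inst][lab] += 1
    result = {}
    for inst, pts in points.items():
        n = len(pts)
        # Assumption: unweighted mean of vertex positions.
        centroid = tuple(sum(p[k] for p in pts) / n for k in range(3))
        result[inst] = (votes[inst].most_common(1)[0][0], centroid)
    return result
```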
3. Evaluation Metrics and Protocols
Quantitative performance evaluation leveraged a multifaceted metric suite, including:
- Intersection over Union (IoU): for comparing predicted and ground-truth sets.
- Dice (F1) Coefficient: as a harmonic mean of precision and recall.
- Teeth Localization Accuracy (TLA): Normalized Euclidean distance between predicted and ground-truth centroids, reported as exp(-TLA) for challenge ranking so that higher values are better.
- Teeth Segmentation Accuracy (TSA): Mean F1-score per tooth.
- Teeth Identification Rate (TIR): Fraction of correctly labeled teeth, where success requires both the correct FDI assignment and a predicted center lying within a set distance of the annotated center.
- Aggregation: Per-tooth performance averaged across all teeth and all scans.
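The set-based metrics above can be sketched in a few lines of plain Python. This is an illustrative reading of the definitions, not the official evaluation script: the normalizing `tooth_size` is an assumed parameter (the text only says the distance is normalized), the exponential transform follows the "Exp(-TLA)" column of the results table, and the handling of teeth missing from a prediction is left out.

```python
import math

def iou(pred, gt):
    """Intersection over Union of two collections of vertex indices."""
    pred, gt = set(pred), set(gt)
    union = pred | gt
    return len(pred & gt) / len(union) if union else 1.0

def dice(pred, gt):
    """Dice / F1: harmonic mean of precision and recall on vertex sets."""
    pred, gt = set(pred), set(gt)
    total = len(pred) + len(gt)
    return 2 * len(pred & gt) / total if total else 1.0

def mean_per_tooth_f1(pred_labels, gt_labels):
    """TSA-style score: Dice averaged over every tooth label in the ground truth."""
    teeth = sorted(set(gt_labels) - {0})  # 0 is gingiva
    scores = []
    for t in teeth:
        p = [i for i, lab in enumerate(pred_labels) if lab == t]
        g = [i for i, lab in enumerate(gt_labels) if lab == t]
        scores.append(dice(p, g))
    return sum(scores) / len(scores) if scores else 0.0

def localization_score(pred_centroid, gt_centroid, tooth_size):
    """exp(-d / tooth_size); the normalizer is an assumption of this sketch."""
    d = math.dist(pred_centroid, gt_centroid)
    return math.exp(-d / tooth_size)
```

For a single pair of sets the two overlap measures are related by Dice = 2·IoU / (1 + IoU), so they rank individual predictions identically and differ only in scale and in how they average across teeth.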
Additional metrics such as precision, recall, Hausdorff distance, and boundary smoothness (e.g., average surface normal deviation) were referenced for method development and qualitative assessment (Ben-Hamadou et al., 2023, Ben-Hamadou et al., 2022, Li et al., 29 Dec 2025).
4. Algorithmic Approaches and Baselines
The challenge attracted six international teams with diverse methodological innovations:
- CGIP: Two-stage Point Transformer pipeline with semantic grouping, DBSCAN clustering, and contrastive boundary learning; achieved the highest segmentation accuracy (TSA).
- FiboSeg: Multi-view 2D rendering (RGB depth & normals), 2D ResU-Net segmentation, and mesh-face label projection via weighted voting; achieved the best localization score.
- IGIP: Multi-stage point cloud pipeline (tooth/gingiva binary classification, centroid regression, patchwise segmentation/classification, and dental arch post-processing for FDI consistency).
- TeethSeg: 3D U-Net semantic segmentation on voxelized models, refined by Random Walker mesh-based instance boundary estimation.
- OS: HRNet-based centroid prediction from top-view rasterizations, graph-constrained CNN, and graph-cut refinement.
- Champers: Stratified Transformer for centroid prediction and proposal segmentation with k-NN and cascaded transformer blocks.
Public baselines provided by the organizers employed multi-scale mesh CNNs (MeshSNet-like architectures), various point cloud networks (PointNet++, PointTransformer), and 2D U-Nets on parameterized UV maps, with a shared pipeline of preprocessing (PCA, downsampling), segmentation, and postprocessing (connected component analysis, embedding-based FDI assignment) (Ben-Hamadou et al., 2023, Ben-Hamadou et al., 2022).
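The connected-component postprocessing mentioned above can be sketched with a union-find over mesh edges. This is a generic illustration rather than the organizers' baseline code: it assumes triangular faces given as vertex-index triples, merges vertices only when they share the same non-gingiva label, and numbers the resulting instances from 1. The function name is hypothetical.

```python
def connected_components(n_vertices, faces, labels):
    """Give each connected same-label tooth region its own instance id (0 = gingiva)."""
    parent = list(range(n_vertices))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for face in faces:
        # Walk the edges of the face: (a, b), (b, c), (c, a).
        for a, b in zip(face, face[1:] + face[:1]):
            if labels[a] != 0 and labels[a] == labels[b]:
                ra, rb = find(a), find(b)
                if ra != rb:
                    parent[rb] = ra

    ids, instances = {}, [0] * n_vertices
    for v in range(n_vertices):
        if labels[v] != 0:
            root = find(v)
            instances[v] = ids.setdefault(root, len(ids) + 1)
    return instances
```

A common follow-up, not shown here, is to discard or relabel very small components, since isolated misclassified vertices otherwise become spurious tooth instances.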
Later methods evaluated on the benchmark have extended these pipelines. SOFTooth, for example, fuses frozen Segment Anything Model (SAM) 2D semantics via point-wise residual gating, combined with center-guided mask refinement and order-aware Hungarian matching. It reports the best overall accuracy among compared methods (OA 96.12%, mIoU 88.99%) and performs well on rare or difficult cases such as third molars (Li et al., 29 Dec 2025).
Table: Summary of Top 3DTeethSeg'22 Results (Test Set)
| Team/Method | Segmentation Accuracy (TSA) | Localization (exp(-TLA)) | Identification Rate (TIR) | Final Score |
|---|---|---|---|---|
| CGIP | 0.9859 | 0.9658 | 0.9100 | 0.9539 |
| FiboSeg | 0.9293 | 0.9924 | 0.9223 | 0.9480 |
| IGIP | 0.9750 | 0.9244 | 0.9289 | 0.9427 |
| SOFTooth (post-challenge) | - (reports mIoU 88.99%, not TSA) | - | - | - |
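The final scores of the three challenge rows in the table are numerically consistent with an unweighted mean of TSA, exp(-TLA), and TIR. The short check below reproduces them from the tabulated values; the averaging rule is an inference from these numbers, and the function name is hypothetical.

```python
def final_score(tsa, exp_neg_tla, tir):
    """Unweighted mean of the three component scores (inferred from the table)."""
    return (tsa + exp_neg_tla + tir) / 3

# (TSA, exp(-TLA), TIR, tabulated final score), copied from the table above.
rows = {
    "CGIP": (0.9859, 0.9658, 0.9100, 0.9539),
    "FiboSeg": (0.9293, 0.9924, 0.9223, 0.9480),
    "IGIP": (0.9750, 0.9244, 0.9289, 0.9427),
}

for team, (tsa, loc, tir, reported) in rows.items():
    print(team, round(final_score(tsa, loc, tir), 5), reported)
```

All three recomputed means agree with the tabulated final scores to within 1e-4. The small residual for IGIP is what one would expect if the published components were themselves rounded.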
5. Technical Challenges, Insights, and Failure Modes
Segmentation and labeling from intraoral 3D scans remain challenging due to:
- High inter-patient variation in dental arch morphology, tooth size/shape, and eruption patterns.
- Frequent cases of crowded, overlapping, or missing teeth, which induce merged or fragmented instances.
- Ambiguous tooth–gingiva boundaries, exacerbated in the presence of appliances or partial scans.
- Symmetry in tooth anatomy (e.g., left/right molars, incisors), leading to label confusion.
Methodological insights include the efficacy of multi-stage (localize→crop→segment→label) pipelines over monolithic networks, explicit modeling of boundary regions (contrastive loss, curvature features, Random Walker), and the incorporation of geometric priors (arch fitting) for improved FDI labeling. Representative failure modes observed were missing/undetected small teeth, over-segmentation at inter-dental contacts, and mislabeling of symmetric pairs (Ben-Hamadou et al., 2023, Ben-Hamadou et al., 2022, Li et al., 29 Dec 2025).
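The symmetric-pair failure mode can be quantified with a small helper. In FDI notation the tens digit is the quadrant and the units digit is the tooth position, so the left/right mirror of a permanent tooth swaps quadrants 1 and 2 or 3 and 4 while keeping the position. The sketch below counts predictions that land exactly on the mirrored code; the function names are hypothetical, and the inputs are assumed to be per-tooth label lists already matched between prediction and ground truth.

```python
MIRROR_QUADRANT = {1: 2, 2: 1, 3: 4, 4: 3}

def mirror_fdi(code):
    """Return the left/right mirrored FDI code of a permanent tooth (11-48)."""
    quadrant, position = divmod(code, 10)
    return MIRROR_QUADRANT[quadrant] * 10 + position

def symmetric_confusions(pred, gt):
    """Count teeth whose predicted label is the mirror image of the true label."""
    return sum(
        1
        for p, g in zip(pred, gt)
        if g != 0 and p != g and p == mirror_fdi(g)
    )
```

Separating mirror confusions from other labeling errors helps show whether a method's mistakes come from local shape ambiguity or from a lack of global arch context.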
6. Advances and Future Directions
The introduction of 3DTeethSeg'22 catalyzed several key advances:
- Establishment of the first large-scale, public, high-quality 3D dental benchmark, enabling reproducible algorithmic comparison.
- Acceleration of next-generation deep learning techniques for dental CAD—particularly mesh-based, point cloud-based, transformer, and semantics-fusion models.
- Identification of open technical challenges, including robust handling of missing/supernumerary teeth, fine-scale boundary segmentation (especially the root-gingiva interface), and temporal tracking across treatment sequences.
Future benchmarks (e.g., 3DTeethLand'24) will target automatic detection of dental landmarks, fine-grained caries/enamel defect classification, and integration of multimodal data (combining mesh, CBCT, radiographs, and photography). Methodological frontiers include efficient self-/semi-supervised learning, foundation-model adaptation for 3D, and robust real-time deployment in clinical CAD environments (Ben-Hamadou et al., 2022, Li et al., 29 Dec 2025, Ben-Hamadou et al., 2023).
7. Impact on the Field
By standardizing task definitions, metrics, and public datasets, the 3DTeethSeg'22 Challenge and its associated Teeth3DS+ resource filled a foundational gap in dental CAD research. They enable transparent, reproducible evaluation, foster broad methodological innovation, and indirectly promote clinical adoption of automated, highly accurate 3D dental scan analysis. The challenge also highlighted the limitations of current algorithms in handling real-world clinical diversity, motivating ongoing efforts to expand datasets, incorporate more challenging anatomical and pathological cases, and refine both evaluation protocols and clinical integration pathways (Ben-Hamadou et al., 2022, Ben-Hamadou et al., 2023, Li et al., 29 Dec 2025).