Fair-Eye Net in Glaucoma AI
- Fair-Eye Net is an end-to-end multimodal AI system that integrates fundus images, OCT metrics, and clinical data to enable early glaucoma risk alerting.
- It employs dual-stream heterogeneous fusion and uncertainty-aware hierarchical gating to enhance prediction reliability and ensure clinical safety.
- The system achieves robust screening performance with high AUC values and reduces false-negative rate disparities by 73.4% through subgroup-specific calibration.
Fair-Eye Net is an end-to-end multimodal artificial intelligence system designed to advance the clinical continuum for glaucoma care, incorporating fair prediction, longitudinal prognosis, and early risk alerting. Developed to address the challenges of fragmented care and inequity from traditional screening—where methods rely on single or disconnected modalities interpreted subjectively—Fair-Eye Net integrates physical images, structural and functional measurements, and demographic data through heterogeneous fusion and principled uncertainty-aware decision making. Its architecture and workflow explicitly optimize for subgroup fairness, clinical reliability, and scalability, offering a path toward reproducible, large-scale deployment in ophthalmic practice (Wei et al., 26 Jan 2026).
1. System Motivation and Workflow
Glaucoma, a leading cause of irreversible blindness, requires early and accurate detection to prevent permanent optic nerve injury. Standard clinical workflows rely on isolated interpretation of fundus photography, optical coherence tomography (OCT) for retinal nerve fiber layer thickness, and visual field (VF) testing, yielding subjective grading and fragmented records. These limitations are exacerbated by uneven access to devices and specialists. Fair-Eye Net operationalizes a unified, automated pipeline composed of six sequential steps: multimodal data acquisition (fundus photos with RNFL heatmaps, OCT metrics, VF indices, demographic/risk factors), dual-stream feature extraction and fusion, hierarchical uncertainty-aware gating for prediction and referral, multitask outputs (binary screening and progression regression), dynamic risk alerting over follow-up intervals, and explicit fairness calibration through subgroup-specific thresholds.
2. Dual-Stream Heterogeneous Fusion Architecture
The system ingests two primary data modalities: an image input (a fundus photograph augmented with an OCT RNFL-thickness heatmap, encoded as a pseudo-RGB image) and a tabular input (clinical metadata: OCT metrics, VF indices, and demographic factors). Each modality is processed by a dedicated feature pipeline:
- Visual Stream: A ResNet-50 backbone, pretrained on 12,449 fundus images (SMDG-19), yields a 2048-dimensional latent vector.
- Clinical Stream: A Densely Connected Clinical Encoder (DCCE), a DenseNet-inspired MLP built from fully connected layers with a fixed growth rate, produces the clinical feature embedding.
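The dense connectivity behind the clinical encoder can be illustrated with a minimal sketch. It assumes the usual DenseNet pattern, in which each layer receives the concatenation of the original input and all earlier layer outputs and contributes a fixed number of new features (the growth rate). The layer count, growth rate, input size, random weights, and the function names `init_weights` and `dense_mlp_encode` are hypothetical choices for illustration; biases and normalization are omitted, and none of this is taken from the Fair-Eye Net implementation.

```python
import random


def init_weights(in_dim, num_layers, growth_rate, seed=0):
    """Random weights for a densely connected MLP (illustrative only)."""
    rng = random.Random(seed)
    weights = []
    width = in_dim
    for _ in range(num_layers):
        # Each layer maps the current concatenated width to `growth_rate` features.
        layer = [[rng.uniform(-0.5, 0.5) for _ in range(width)]
                 for _ in range(growth_rate)]
        weights.append(layer)
        width += growth_rate  # the next layer also sees this layer's output
    return weights


def dense_mlp_encode(features, weights):
    """Forward pass: every layer reads the input plus all earlier layer outputs."""
    state = list(features)
    for layer in weights:
        # Linear map followed by ReLU.
        new = [max(0.0, sum(w * x for w, x in zip(row, state))) for row in layer]
        state = state + new  # dense connectivity via concatenation
    return state


# Hypothetical sizes: 4 clinical inputs, 3 layers, growth rate 8.
clinical = [0.3, -1.2, 0.8, 0.05]
weights = init_weights(in_dim=4, num_layers=3, growth_rate=8)
embedding = dense_mlp_encode(clinical, weights)
print(len(embedding))  # 4 + 3 * 8 = 28
```

With this pattern the embedding width grows linearly with depth, which keeps the raw clinical inputs directly visible to every layer.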
Fusion is enacted at the decision level, combining the outputs of the visual and clinical streams.
3. Uncertainty-Aware Hierarchical Gating for Safe AI Referral
Prediction confidence is managed through a two-stage rejection protocol:
- Stage 1: Physical Quality Control (QC): The Laplacian variance is computed for every input image. Images whose variance falls below a preset threshold are excluded as “too blurry.”
- Stage 2: Cognitive Reliability: Monte Carlo dropout and test-time augmentation (horizontal/vertical flips) enable epistemic uncertainty estimation. Repeated stochastic forward passes generate a set of probability samples, from which the mean prediction and an uncertainty estimate are computed.
A threshold on this uncertainty then gates the decision, so that only sufficiently confident predictions are issued automatically.
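The two-stage protocol can be sketched as follows. This is a minimal illustration under stated assumptions: the uncertainty measure is taken to be the standard deviation of the sampled probabilities (the source does not preserve the exact formula), the probability samples are supplied directly rather than produced by real dropout or augmentation passes, and both thresholds, the `gate` function, and the decision labels are hypothetical.

```python
from statistics import mean, pstdev, pvariance

# 3x3 discrete Laplacian kernel.
LAPLACIAN = ((0, 1, 0), (1, -4, 1), (0, 1, 0))


def laplacian_variance(image):
    """Variance of the Laplacian response over interior pixels.

    `image` is a 2D list of grayscale intensities, at least 3x3.
    Low values indicate few sharp edges, i.e. a likely blurry image.
    """
    height, width = len(image), len(image[0])
    responses = [
        sum(LAPLACIAN[i][j] * image[r - 1 + i][c - 1 + j]
            for i in range(3) for j in range(3))
        for r in range(1, height - 1)
        for c in range(1, width - 1)
    ]
    return pvariance(responses)


def gate(image, prob_samples, blur_threshold, uncertainty_threshold):
    """Two-stage gate: quality control first, then uncertainty check."""
    # Stage 1: reject images that fail the sharpness check.
    if laplacian_variance(image) < blur_threshold:
        return {"decision": "reject_quality"}
    # Stage 2: summarize the stochastic passes (assumed measure: std. dev.).
    p_mean = mean(prob_samples)
    p_std = pstdev(prob_samples)
    decision = "refer" if p_std > uncertainty_threshold else "predict"
    return {"decision": decision, "probability": p_mean, "uncertainty": p_std}


# Toy inputs: a flat (blurry-looking) image and a high-contrast checkerboard.
flat = [[100.0] * 5 for _ in range(5)]
sharp = [[255.0 if (r + c) % 2 else 0.0 for c in range(5)] for r in range(5)]

print(gate(flat, [0.9, 0.9], 10.0, 0.05)["decision"])                  # reject_quality
print(gate(sharp, [0.90, 0.92, 0.88, 0.91], 10.0, 0.05)["decision"])   # predict
print(gate(sharp, [0.2, 0.8, 0.5, 0.6], 10.0, 0.05)["decision"])       # refer
```

Running the quality check first avoids spending repeated forward passes on images that would be rejected anyway.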
A plausible implication is that uncertainty gating supports safe clinical integration through selective referral of ambiguous cases.
4. Explicit Fairness Constraint and Calibration
Fair-Eye Net incorporates fairness constraints to minimize missed diagnoses in disadvantaged subgroups, measured by the false-negative rate (FNR) disparity across racial cohorts. The disparity metric is the gap between the highest and lowest subgroup FNR:
$$\Delta_{\mathrm{FNR}} = \max_{g} \mathrm{FNR}_g - \min_{g} \mathrm{FNR}_g,$$
with $g$ ranging over {White, Black, Asian}. Before calibration, $\Delta_{\mathrm{FNR}} \approx 0.123$; after race-specific thresholding, the disparity is reduced to $0.0328$, a 73.4% reduction in the gap, while ROC-AUC is maintained.
Unlike post hoc or loss-function regularization approaches, Fair-Eye Net’s fairness optimization operates via primary calibration, directly equalizing FNR across subgroups. This design pursues clinical reliability and fairness together, advancing global eye health equity.
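Subgroup-specific thresholding can be sketched as below. The source does not say how the per-group thresholds are selected, so this example assumes one plausible scheme: for each group, pick the largest decision threshold whose FNR on held-out data stays at or below a common target. The groups `A` and `B`, the toy scores, the target value, and the function names are all hypothetical.

```python
def fnr(scores, labels, threshold):
    """False-negative rate: share of true positives scored below the threshold."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    missed = sum(1 for s in positives if s < threshold)
    return missed / len(positives)


def calibrate_threshold(scores, labels, target_fnr):
    """Largest observed score usable as a threshold with FNR <= target_fnr."""
    candidates = sorted(set(scores))
    best = candidates[0]  # at the lowest score no positive is missed
    for t in candidates:
        if fnr(scores, labels, t) <= target_fnr:
            best = t
    return best


def fnr_disparity(fnr_by_group):
    """Gap between the highest and lowest subgroup FNR."""
    return max(fnr_by_group.values()) - min(fnr_by_group.values())


# Toy validation data for two hypothetical groups.
data = {
    "A": ([0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1], [1, 1, 1, 1, 0, 1, 0, 0]),
    "B": ([0.9, 0.6, 0.45, 0.4, 0.35, 0.3, 0.2, 0.1], [1, 1, 1, 1, 1, 0, 0, 0]),
}

# One global threshold for everyone.
global_fnr = {g: fnr(s, y, 0.5) for g, (s, y) in data.items()}
before = fnr_disparity(global_fnr)

# Group-specific thresholds targeting the same FNR.
thresholds = {g: calibrate_threshold(s, y, 0.2) for g, (s, y) in data.items()}
group_fnr = {g: fnr(s, y, thresholds[g]) for g, (s, y) in data.items()}
after = fnr_disparity(group_fnr)

print(thresholds)
print(before, after)
```

In this toy case the global threshold misses far more positives in group `B` than in group `A`, while the group-specific thresholds bring both groups to the same FNR. Lowering a threshold typically costs specificity, so such a calibration would normally be checked against overall discrimination as well.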
Fairness Metrics Summary Table
| Stage | FNR disparity (max − min) | FNRs of the three racial subgroups |
|---|---|---|
| Global model | 0.123 | 0.254, 0.320, 0.197 |
| After calibration | 0.0328 | 0.262, 0.272, 0.239 |
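As a consistency check on the tabulated figures, the arithmetic can be written out. It assumes that the disparity is the gap between the largest and smallest subgroup FNR, and it reads the first numeric entry of each table row as that disparity, since this is the reading that matches the reported reduction:

$$
\begin{aligned}
\text{Global model:}\quad & 0.320 - 0.197 = 0.123 \\
\text{After calibration:}\quad & 0.272 - 0.239 = 0.033 \approx 0.0328 \\
\text{Relative reduction:}\quad & \frac{0.123 - 0.0328}{0.123} \approx 0.733
\end{aligned}
$$

The result, about 73%, agrees with the reported 73.4% up to rounding of the tabulated values.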
5. Quantitative Evaluation and Robustness
Fair-Eye Net demonstrates strong screening and prognostic performance:
- High-confidence screening (top 30%): performance is reported as AUC, sensitivity, specificity, and FNR on this subset.
- Ablation findings: Dropping test-time augmentation (TTA) lowers AUC to $0.902$ and narrows the FNR; removing MC dropout increases the FNR.
- Static screening (full test set): performance is reported as ROC-AUC, together with specificity for top-confidence cases.
- Domain generalization: On unseen device brands (GRAPE dataset), the coverage–accuracy curve is stable, reaching $0.89$ at its upper end.
- Early warning capability: Dynamic risk alerting forecasts glaucoma risk $3$–$12$ months in advance, evaluated by sensitivity and specificity.
These metrics highlight both discrimination and stability across domains, supporting practical deployment for screening, risk stratification, and longitudinal monitoring.
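The screening metrics and the coverage–accuracy curve mentioned above can be computed as in the following sketch. The data are invented for illustration, the 0.5 decision threshold is an assumption, and the function names are hypothetical; only the definitions of sensitivity, specificity, FNR, and selective coverage are standard.

```python
def screening_metrics(probs, labels, threshold=0.5):
    """Sensitivity, specificity, and FNR from thresholded probabilities."""
    preds = [p >= threshold for p in probs]
    tp = sum(1 for p, y in zip(preds, labels) if p and y == 1)
    fn = sum(1 for p, y in zip(preds, labels) if not p and y == 1)
    tn = sum(1 for p, y in zip(preds, labels) if not p and y == 0)
    fp = sum(1 for p, y in zip(preds, labels) if p and y == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "fnr": fn / (tp + fn),
    }


def coverage_accuracy_curve(probs, uncertainties, labels, coverages, threshold=0.5):
    """Accuracy on the most confident fraction of cases, for each coverage level."""
    # Rank cases from least to most uncertain.
    order = sorted(range(len(probs)), key=lambda i: uncertainties[i])
    curve = []
    for cov in coverages:
        k = max(1, round(cov * len(order)))  # number of cases retained
        kept = order[:k]
        correct = sum(1 for i in kept if (probs[i] >= threshold) == (labels[i] == 1))
        curve.append((cov, correct / k))
    return curve


# Invented example: 10 cases with predicted probability, uncertainty, and label.
probs = [0.95, 0.05, 0.9, 0.6, 0.4, 0.55, 0.45, 0.7, 0.3, 0.52]
uncertainties = [0.01, 0.02, 0.03, 0.10, 0.20, 0.25, 0.12, 0.08, 0.09, 0.30]
labels = [1, 0, 1, 1, 1, 0, 0, 1, 0, 0]

metrics = screening_metrics(probs, labels)
curve = coverage_accuracy_curve(probs, uncertainties, labels, [0.3, 0.6, 1.0])
print(metrics)
print(curve)  # [(0.3, 1.0), (0.6, 1.0), (1.0, 0.7)]
```

In this toy example the errors are concentrated in the most uncertain cases, so accuracy is highest at low coverage. That is the behavior a "top 30%" high-confidence subset relies on.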
6. Clinical Translation, Deployment, and Reproducibility
Fair-Eye Net uses standard deep learning architectures (ResNet-50, Dense MLP), conventional training protocols (AdamW optimizer, cross-entropy plus SmoothL1 losses), and publicly available datasets (SMDG-19, Harvard-GDP, GRAPE, Harvard-GF). The modular workflow—encompassing fusion, gating, and fairness calibration—facilitates adaptation to diverse clinical centers and imaging devices, supporting multi-center scalability.
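The combination of cross-entropy and SmoothL1 losses for the two tasks can be sketched as below. This is a per-sample illustration, assuming a binary cross-entropy term for screening and a SmoothL1 term for progression regression. The weight `reg_weight`, the function names, and the example values are hypothetical, and the source does not state how the two terms are balanced. The SmoothL1 definition used here is the standard piecewise form with `beta = 1.0`.

```python
import math


def binary_cross_entropy(p, y, eps=1e-7):
    """Cross-entropy for one binary label y in {0, 1} and predicted probability p."""
    p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def smooth_l1(pred, target, beta=1.0):
    """Quadratic near zero, linear for large errors (less sensitive to outliers)."""
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta


def multitask_loss(p_screen, y_screen, prog_pred, prog_true, reg_weight=1.0):
    """Screening loss plus weighted progression-regression loss (assumed weighting)."""
    return binary_cross_entropy(p_screen, y_screen) + reg_weight * smooth_l1(prog_pred, prog_true)


# Example: an uncertain screening prediction and a small regression error.
loss = multitask_loss(p_screen=0.5, y_screen=1, prog_pred=0.5, prog_true=0.0)
print(round(loss, 4))  # 0.8181
```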
High specificity reduces unnecessary referrals, while the uncertainty-aware gate ensures expert oversight of ambiguous cases. Subgroup fairness calibration operationalizes equity, directly addressing health disparities. A plausible implication is that this modular, reproducible design accelerates clinical translation and deployment toward population-scale impact in glaucoma prevention (Wei et al., 26 Jan 2026).