
WMH and ISL Segmentation Models

Updated 4 February 2026
  • The paper introduces a unified model combining CNN and transformer architectures for joint WMH/ISL segmentation with robust clinical metrics.
  • The approach employs lesion-centric patch sampling and modality-specific augmentation to address class imbalance and subtle lesion detection.
  • Quantitative evaluations using DSC, AP, and HD95 metrics demonstrate the model's superior performance across diverse MRI datasets.

White matter hyperintensities (WMH) and ischemic stroke lesions (ISL) are radiological entities visible on brain MRI and are critical imaging biomarkers of cerebral small vessel disease. Accurate segmentation of these lesions enables robust quantification for clinical, epidemiological, and research applications. Joint or differential WMH/ISL segmentation presents challenges due to their visual confounding on FLAIR MRI, mutual exclusivity, class imbalance, and the prevalence of partially labeled data. Technical advances from convolutional neural networks (CNNs) to hybrid transformer-attention architectures underpin the recent progress in developing unified models that address these challenges.

1. Problem Definition and Clinical Context

WMH are regions of increased signal intensity on T2-weighted and FLAIR MRIs, reflecting small vessel disease, aging, and neurodegeneration. ISLs manifest as focal hyperintensities and signify acute or chronic ischemic injury. Both can co-occur, overlap spatially, or mimic each other on FLAIR, yet distinction is clinically and biologically critical. Early works targeted only WMH, but Guerrero et al. (Guerrero et al., 2017) introduced explicit multiclass CNN-based segmentation to distinguish WMH from various stroke subtypes within a single model.

The diagnostic value of segmentation is further enhanced by volumetric concordance with visual scores (e.g., Fazekas) and consistency in epidemiologic associations (e.g., diabetes, atrophy), as demonstrated by uResNet-derived lesion masks (Guerrero et al., 2017). Accurate, automated, and generalizable segmentation frameworks thus serve as a foundation for biomarker development and clinical studies.

2. Model Architectures for Joint WMH and ISL Segmentation

2.1 Fully-Convolutional Residual Networks

The uResNet model (Guerrero et al., 2017) is a fully-convolutional residual U-shaped network tailored for three-class segmentation (background, WMH, stroke). It employs 3×3 convolutions, ReLU activations, residual elements (linear skip + single conv branch), and trainable skip connections. Inputs are typically FLAIR patches (optionally multispectral with T₁ or probability maps), and only patches centered near lesions are sampled to combat severe class imbalance.

uResNet's output is voxel-wise softmax across the three classes. Its architectural simplicity—absence of batch normalization, ≈1 million parameters, and patch-based training—makes it efficient and robust in practice.
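The per-voxel three-class decision can be sketched as follows. This is a minimal NumPy illustration of a voxel-wise softmax over background/WMH/stroke logits, not uResNet's actual implementation; the toy logits are invented for the example.

```python
import numpy as np

def voxelwise_softmax(logits: np.ndarray) -> np.ndarray:
    """Convert per-voxel logits of shape (C, H, W) into class probabilities.

    C = 3 here: background, WMH, stroke. Numerically stabilised by
    subtracting the per-voxel maximum before exponentiating.
    """
    z = logits - logits.max(axis=0, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)

# Toy 2x2 patch with 3 class logits per voxel (illustrative values).
logits = np.array([[[2.0, 0.0], [0.0, 1.0]],   # background logits
                   [[0.5, 3.0], [0.0, 0.0]],   # WMH logits
                   [[0.0, 0.0], [4.0, 0.0]]])  # stroke logits
probs = voxelwise_softmax(logits)
labels = probs.argmax(axis=0)  # final three-class segmentation map
```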

2.2 Unified Transformer-Attention Architectures

SYNAPSE-Net (Hassan et al., 30 Oct 2025) exemplifies modern unified WMH/ISL segmentation, integrating:

  • Multi-stream CNN encoders: Each modality (e.g., T1w, FLAIR, DWI, ADC) is processed through a dedicated encoder stream. Feature maps at equivalent resolutions are independently produced, concatenated, and channel-projected.
  • Convolutional Block Attention Module (CBAM): Channel- and spatial-wise attention refines the projected feature maps.
  • Swin Transformer Bottleneck: Deep features are reformulated as non-overlapping windows, passed through L layers of Swin Transformer for global context via local-window and shifted-window attention.
  • Dynamic Cross-Modal Attention Fusion (CMAF): Biologically motivated modality pairs (e.g., [T1w, FLAIR]) interact via bidirectional multi-head cross-attention, yielding contextually-enriched tokens for bottleneck fusion.
  • Hierarchical Lesion-Gated Decoder: Dense UNet++-style decoding, with spatial gates modulating skip connections at each hierarchical level, conditioned on lesion- or context-derived guidance.

This architecture supports input heterogeneity (N modalities), robust generalization across pathologies, and high-fidelity boundary reconstructions, facilitating application to both WMH and ISL without framework change (Hassan et al., 30 Oct 2025).
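The Swin Transformer bottleneck above operates on non-overlapping windows of the deep feature map. The partitioning step can be sketched in NumPy as below; the attention computation itself, window size, and feature shapes are omitted or assumed for illustration.

```python
import numpy as np

def window_partition(feat: np.ndarray, win: int = 4) -> np.ndarray:
    """Reshape a (H, W, C) feature map into non-overlapping windows.

    Returns (num_windows, win*win, C) token groups: the layout on which
    Swin-style local-window attention operates. Only the partitioning
    step is shown; attention and window shifting are omitted.
    """
    H, W, C = feat.shape
    assert H % win == 0 and W % win == 0, "feature map must tile evenly"
    x = feat.reshape(H // win, win, W // win, win, C)
    x = x.transpose(0, 2, 1, 3, 4)  # group each window's tokens together
    return x.reshape(-1, win * win, C)

# Toy 8x8 feature map with 2 channels -> four 4x4 windows of 16 tokens.
feat = np.arange(8 * 8 * 2, dtype=float).reshape(8, 8, 2)
tokens = window_partition(feat)
```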

2.3 3D U-Net Variants and Partial Supervision

Recent large-scale multi-dataset studies (Phitidis et al., 28 Jan 2026) employ DynUNet, a 3D residual U-Net variant with deep supervision and instance normalization. Training leverages 3D brain-extracted FLAIR patches and rich augmentation pipelines (affine transforms, rotations, Gaussian noise, artifact simulation). Output layers flexibly support joint multiclass, multi-head binary, or semi-supervised objectives.

3. Training Strategies in Partially Labeled and Multi-Source Regimes

3.1 Classical Lesion-Centric Sampling

To address the gross class imbalance intrinsic to WMH/ISL segmentation, lesion-centric patch sampling is employed (Guerrero et al., 2017). In uResNet, 20% of patches are centered on WMH regions, 80% on stroke regions, with random shifts to avoid positional bias. The majority of patch voxels remain background, but the sample distribution ensures sufficient lesion representation for effective learning.
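A sketch of this sampling scheme is given below. The 20/80 split and random shift follow the description above; the 2-D patch centres, shift radius, and toy label map are illustrative assumptions, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_patch_centres(label_map, n_patches, wmh_frac=0.2, max_shift=8):
    """Pick patch centres biased toward lesions (lesion-centric sampling).

    label_map: int array (H, W) with 0=background, 1=WMH, 2=stroke.
    A wmh_frac share of centres is drawn from WMH voxels, the rest from
    stroke voxels; each centre gets a random shift to avoid positional bias.
    """
    wmh = np.argwhere(label_map == 1)
    isl = np.argwhere(label_map == 2)
    n_wmh = int(round(n_patches * wmh_frac))
    picks = []
    for pool, n in ((wmh, n_wmh), (isl, n_patches - n_wmh)):
        if len(pool) == 0:
            continue
        idx = rng.integers(0, len(pool), size=n)
        shift = rng.integers(-max_shift, max_shift + 1, size=(n, 2))
        centres = np.clip(pool[idx] + shift, 0, np.array(label_map.shape) - 1)
        picks.append(centres)
    return np.concatenate(picks) if picks else np.empty((0, 2), int)

# Toy label map with one WMH blob and one stroke blob.
lab = np.zeros((64, 64), int)
lab[10:14, 10:14] = 1   # WMH
lab[40:50, 40:50] = 2   # stroke
centres = sample_patch_centres(lab, n_patches=10)
```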

3.2 Variance Reduction and Data Augmentation

SYNAPSE-Net utilizes a two-fold variance-reduction regime (Hassan et al., 30 Oct 2025):

  • Difficulty-aware sampling: Actively oversamples slices or patches where lesion volume falls below a percentile threshold, forcing the model to learn from subtle or small lesions.
  • Pathology-specific augmentations: Separate augmentation probability and magnitude for WMH (strongest; rotation ±20°, elastic deformation 50%) and ISL (medium; rotation ±15°, elastic deformation 30%), addressing differences in lesion morphology and scanner variability.
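The difficulty-aware sampling idea can be sketched as a weighting scheme over slices. The percentile threshold and boost factor below are illustrative assumptions; the paper's exact sampling rule may differ.

```python
import numpy as np

rng = np.random.default_rng(1)

def difficulty_weights(lesion_volumes, percentile=25, boost=3.0):
    """Sampling weights that oversample low-lesion-volume slices.

    Slices whose (nonzero) lesion volume falls below the given percentile
    get their sampling weight multiplied by `boost`; weights are then
    normalised to a probability distribution.
    """
    vols = np.asarray(lesion_volumes, float)
    thresh = np.percentile(vols[vols > 0], percentile)  # over lesion slices only
    w = np.ones_like(vols)
    w[(vols > 0) & (vols < thresh)] *= boost            # boost subtle lesions
    return w / w.sum()

vols = [0, 5, 12, 400, 800, 3, 1500]         # lesion voxels per slice (toy)
w = difficulty_weights(vols)
batch = rng.choice(len(vols), size=32, p=w)  # oversampled mini-batch indices
```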

3.3 Partial Supervision and Pseudolabeling

Large-scale WMH/ISL segmentation often involves combining fully labeled, partially labeled (WMH-only or ISL-only), and unlabeled datasets. Six effective training paradigms have been benchmarked (Phitidis et al., 28 Jan 2026):

  • Multiclass baseline: Trained only on fully labeled data.
  • Multi-model binary: Separate WMH and ISL binary classifiers, probabilities fused for final segmentation.
  • Class-conditional (two-head): Shared encoder with task-specific decoders, selectively supervised per annotation availability.
  • Phased training: Foreground/background pretraining on all data, followed by multiclass fine-tuning on complete labels.
  • Class-adaptive and Marginal-Loss: For each sample, compute supervision only for present label classes, adapting the loss objective accordingly.
  • Pseudolabel approach: Missing class labels filled with argmax-predictions from a strong marginal-loss model, and model retrained on the combined ground-truth/pseudolabel dataset, achieving top performance.
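The marginal-loss idea above can be sketched as follows: for a partially labeled sample, the probability mass of unannotated lesion classes is folded into background, so the model is never penalised for predicting a class that was simply not labeled. This is a minimal NumPy sketch of the concept, not the benchmarked implementation.

```python
import numpy as np

def marginal_nll(probs, gt, present):
    """Per-voxel negative log-likelihood under a marginal loss.

    probs:   (3, N) softmax probabilities for classes BG, WMH, ISL.
    gt:      (N,) labels using only the classes in `present`; unannotated
             lesion classes appear as background (0) in the ground truth.
    present: set of annotated lesion classes, e.g. {1} for WMH-only data.
    Probabilities of absent classes are added to background before taking
    the log-likelihood of the observed label.
    """
    absent = [c for c in (1, 2) if c not in present]
    marg = probs.copy()
    marg[0] += marg[absent].sum(axis=0)  # fold unannotated classes into BG
    eps = 1e-12
    return -np.log(marg[gt, np.arange(gt.size)] + eps).mean()

# Two voxels, WMH-only annotation: ISL probability counts as background.
probs = np.array([[0.5, 0.2],   # BG
                  [0.3, 0.1],   # WMH
                  [0.2, 0.7]])  # ISL (unannotated here)
gt = np.array([0, 0])
loss = marginal_nll(probs, gt, present={1})
```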

The key algorithmic step for pseudolabeling is

y_i^{\mathrm{pseudo}} = \arg\max_{c \in \{\mathrm{BG},\, \mathrm{WMH},\, \mathrm{ISL}\}} \hat{y}_i^{(c)},

with subsequent retraining using the union of ground-truth and predicted labels (Phitidis et al., 28 Jan 2026).
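The argmax fill-in step can be sketched as below. This is a minimal illustration of the rule above, assuming the missing class is absent from the ground truth and a trained marginal-loss model supplies the probabilities; it is not the paper's code.

```python
import numpy as np

# Classes: 0 = BG, 1 = WMH, 2 = ISL.
def fill_pseudolabels(gt, probs, missing_class):
    """Fill a missing annotation channel with argmax predictions.

    gt:    int array (H, W); voxels of `missing_class` were never annotated
           and appear as background in the ground truth.
    probs: float array (3, H, W) of softmax outputs from a strong model.
    Background voxels that the model assigns to the missing class are
    filled in; annotated voxels are kept unchanged.
    """
    pred = probs.argmax(axis=0)
    out = gt.copy()
    fill = (gt == 0) & (pred == missing_class)
    out[fill] = missing_class
    return out

# Toy example: WMH annotated, ISL (class 2) missing from the ground truth.
gt = np.array([[1, 0], [0, 0]])
probs = np.zeros((3, 2, 2))
probs[1, 0, 0] = probs[2, 0, 1] = probs[0, 1, 0] = probs[2, 1, 1] = 1.0
combined = fill_pseudolabels(gt, probs, missing_class=2)
```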

4. Quantitative Evaluation and Comparative Analysis

4.1 Voxel-wise Metrics

Segmentation accuracy is typically evaluated with Dice similarity coefficient (DSC), absolute volume difference (AVD), Hausdorff distance (HD95), and Average Precision (AP, area under the precision-recall curve). Lesion-level precision, recall, and Distance Dice (2-mm tolerance) are common secondary metrics (Guerrero et al., 2017, Hassan et al., 30 Oct 2025, Phitidis et al., 28 Jan 2026).
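Two of these metrics can be computed directly from binary masks; a minimal NumPy sketch (with invented toy masks) is:

```python
import numpy as np

def dice(pred, gt):
    """Dice similarity coefficient for binary masks (1.0 = perfect overlap)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    return 2.0 * np.logical_and(pred, gt).sum() / denom if denom else 1.0

def avd(pred, gt):
    """Absolute volume difference as a percentage of the reference volume."""
    return 100.0 * abs(int(pred.sum()) - int(gt.sum())) / gt.sum()

gt = np.zeros((8, 8), int); gt[2:6, 2:6] = 1        # 16-voxel lesion
pred = np.zeros((8, 8), int); pred[3:7, 2:6] = 1    # prediction shifted by one row
```

HD95 and AP require boundary distance transforms and a full precision-recall sweep, respectively, and are usually computed with library implementations rather than by hand.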

| Method/Model | WMH DSC (%) | ISL DSC (%) | WMH AP (%) | ISL AP (%) |
|---|---|---|---|---|
| SYNAPSE-Net (Hassan et al., 30 Oct 2025) | 83.1 | 76.3 | n/a | n/a |
| uResNet (Guerrero et al., 2017) | 69.5 | 40.0 | n/a | n/a |
| DynUNet-Pseudolabel (Phitidis et al., 28 Jan 2026) | 67.2 | 47.0 | 76.0 | 55.2 |
| Multi-model (Phitidis et al., 28 Jan 2026) | 67.9 | 42.7 | 75.7 | 52.1 |

Notably, SYNAPSE-Net achieved state-of-the-art DSC and HD95 on both MICCAI 2017 WMH (DSC=0.831, HD95=3.03 mm) and ISLES 2022 ISL (DSC=0.7632, HD95=9.69 mm) (Hassan et al., 30 Oct 2025). The DynUNet pseudolabel model had the highest ISL AP (55.2%) and competitive WMH AP (76.0%) based on a held-out test set of >1000 scans (Phitidis et al., 28 Jan 2026).

4.2 Ablation and Robustness

Ablation studies with SYNAPSE-Net confirm that both cross-modal attention fusion (CMAF) and lesion-aware hierarchical gating are essential for low boundary error and improved performance, especially for small or low-contrast lesions (Hassan et al., 30 Oct 2025). Difficulty-aware sampling and pathology-specific augmentations further decrease test-time variance (e.g., σ_DSC ≈ 0.09 for WMH). In large multi-site datasets, class-adaptive and marginal-loss methods yield strong competitive results while being simpler to implement than pseudolabeling pipelines (Phitidis et al., 28 Jan 2026).

5. Clinical Validity and Biomarker Correlation

Model-derived WMH volumes correlate closely with visual rating scores (Fazekas) and established risk factors. For uResNet, Pearson correlation r = 0.824 (combined) for volume vs. Fazekas closely matches manual labeling (r = 0.819), and associations with basal ganglia perivascular space, deep atrophy, and diabetes are statistically significant (p < 0.05) (Guerrero et al., 2017). This demonstrates the biomarker utility and generalizability of automated lesion segmentation approaches.

6. Data, Preprocessing, and Generalization

WMH/ISL segmentation studies draw on both public challenges (MICCAI WMH, ISLES) and private multi-site cohorts, yielding large, heterogeneous datasets (Phitidis et al., 28 Jan 2026, Hassan et al., 30 Oct 2025). Preprocessing invariants include:

  • Coregistration (rigid FSL-FLIRT) to a standard space (FLAIR for WMH; DWI for ISL).
  • N4 bias-field correction.
  • Brain extraction (BET, SynthStrip).
  • Z-score normalization and cropping/padding to standardized sizes (e.g., 208×208 per slice, or 160×160×160 per patch).
  • Data harmonization via resampling to 1 mm³ isotropic resolution.
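The intensity-normalization and size-standardization steps above can be sketched in NumPy. The target size of 208×208 follows the text; the toy image, mask, and symmetric-padding choice are illustrative assumptions.

```python
import numpy as np

def zscore_brain(img, mask):
    """Z-score normalise intensities using brain-mask voxels only."""
    vals = img[mask > 0]
    return (img - vals.mean()) / (vals.std() + 1e-8)

def crop_or_pad(img, target=(208, 208)):
    """Centre-crop or zero-pad a 2-D slice to a fixed size (e.g. 208x208)."""
    out = np.zeros(target, dtype=img.dtype)
    src, dst = [], []
    for size, tgt in zip(img.shape, target):
        if size >= tgt:                      # crop centrally
            start = (size - tgt) // 2
            src.append(slice(start, start + tgt))
            dst.append(slice(0, tgt))
        else:                                # pad symmetrically with zeros
            start = (tgt - size) // 2
            src.append(slice(0, size))
            dst.append(slice(start, start + size))
    out[tuple(dst)] = img[tuple(src)]
    return out

# Toy slice: 230x180 "FLAIR" with a whole-image brain mask.
img = np.arange(230 * 180, dtype=float).reshape(230, 180)
mask = np.ones_like(img)
norm = zscore_brain(img, mask)
fixed = crop_or_pad(norm)
```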

Consistency in these steps is vital to enabling cross-cohort generalization, as demonstrated by direct model transfer between WMH and ISL tasks in SYNAPSE-Net (Hassan et al., 30 Oct 2025).

7. Best Practices and Methodological Recommendations

  • Employ class-adaptive or marginal-loss objectives to maximally utilize partially annotated data in WMH/ISL segmentation (Phitidis et al., 28 Jan 2026).
  • Use pseudolabeling to fill in missing annotation channels for the highest ISL performance, if additional computation is feasible.
  • Model architectures with dedicated modality streams, global context modules (Swin Transformer), and attention-based fusion demonstrate superior results across tasks (Hassan et al., 30 Oct 2025).
  • Variance-reduction strategies, including difficulty-aware sampling and tailored augmentation, further stabilize performance, particularly on small or subtle lesions.
  • In practice, ensemble inference, test-time augmentation, and data harmonization are essential to optimize performance across datasets.

In conclusion, unified WMH and ISL segmentation models have advanced significantly through architectural innovations and training strategies for partial labels. Modern frameworks are now capable of robust, generalizable, and accurate lesion segmentation, supporting both clinical research and broad biomarker development (Hassan et al., 30 Oct 2025, Guerrero et al., 2017, Phitidis et al., 28 Jan 2026).
