
Radiomics AutoML Frameworks

Updated 20 January 2026
  • Radiomics-specific AutoML frameworks are specialized systems that automate the construction and optimization of machine learning pipelines tailored to radiomics data, reducing the need for manual feature engineering.
  • They employ modular architectures integrating preprocessing, feature extraction, selection, and advanced search strategies like Bayesian and evolutionary optimization to enhance model performance.
  • By embedding domain-specific imaging knowledge, these frameworks address reproducibility and heterogeneity challenges, enabling robust, clinically applicable model development.

Radiomics-specific automated machine learning (AutoML) frameworks constitute a class of software and algorithmic systems that automate the construction, selection, and optimization of machine learning pipelines tailored to radiomics data. In contrast to general-purpose AutoML platforms, these frameworks integrate medical imaging domain knowledge (feature extraction, normalization, reproducibility constraints, harmonization, segmentation, etc.), and provide specialized workflow modules to support radiomics-centric research—from image and region-of-interest (ROI) processing, feature extraction and selection, to model training, validation, and interpretability (Shafiee et al., 2015, Starmans et al., 2021, Lozano-Montoya et al., 13 Jan 2026, Tzanis et al., 30 Apr 2025). They address the unique methodological, computational, and reproducibility challenges of medical imaging analysis, particularly in heterogeneous, high-dimensional, and often small-sample-size settings.

1. Paradigm Shift: From Manual Feature Engineering to Automated Workflow Construction

Traditional radiomics studies rely on hand-crafted feature sets (intensity histograms, GLCM textures, wavelets, morphological/shape descriptors) and heuristic pipeline construction, requiring extensive manual design and domain expertise. Radiomics-specific AutoML frameworks automate this process by embedding feature engineering, algorithm selection, and hyperparameter optimization inside tunable workflow modules, formalizing the full model-building process as a combined algorithm selection and hyperparameter (CASH) optimization problem (Starmans et al., 2021).

Recent advances include deep discovery radiomics, where neural architectures (such as randomized CNNs or evolved sequencers) replace explicit, pre-defined feature families, learning data-driven latent representations from images without hand-specification of texture, shape, or intensity operators (Shafiee et al., 2015, Shafiee et al., 2017). This shift enables direct, end-to-end optimization and supports scalable, adaptive workflows across imaging modalities and clinical indications.

2. Modular Architectures and Formal Workflow Optimization

State-of-the-art radiomics-specific AutoML frameworks adopt a modular design. Each pipeline stage is defined as an independent, hyperparameterized module—commonly including preprocessing, feature extraction, feature selection, sample balancing, classifier selection, and ensembling. The optimal configuration is found by searching over both algorithm choices and their internal hyperparameters (Starmans et al., 2021).

Example: WORC Framework Modules

| Pipeline Component | Algorithmic Options / HPs | Selection Mechanism |
|---|---|---|
| Image/ROI Preprocessing | Intensity normalization, anisotropy mode | Categorical/activator HP |
| Feature Extraction | 564 features: shape, texture, wavelets | Fixed + optional families |
| Feature Preprocessing | Group-wise drop, imputation (mean, KNN), scaling | Activator + selector HP |
| Feature Selection | RELIEF, LASSO, RF, PCA, univariate tests | Activator + selector HP |
| Resampling | SMOTE, ADASYN, under/oversampling | Activator + selector HP |
| Classification | SVM, RF, LR, LDA, QDA, AdaBoost, XGBoost | Selector HP |
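The activator/selector idea in the table above can be illustrated with a small sampler. This is a minimal sketch, not WORC's actual code or configuration schema: the `SEARCH_SPACE` dictionary, its key names, and `sample_configuration` are hypothetical, and the space is heavily reduced (algorithm-internal hyperparameters are omitted).

```python
import random

# Hypothetical, heavily reduced search space loosely mirroring the table above.
# An "activator" switches a module on or off; a "selector" picks one algorithm.
SEARCH_SPACE = {
    "feature_selection": {
        "activator": [True, False],
        "selector": ["RELIEF", "LASSO", "RF", "PCA", "univariate"],
    },
    "resampling": {
        "activator": [True, False],
        "selector": ["SMOTE", "ADASYN", "undersampling", "oversampling"],
    },
    "classification": {
        # Classification is always part of the pipeline, so it has no activator.
        "selector": ["SVM", "RF", "LR", "LDA", "QDA", "AdaBoost", "XGBoost"],
    },
}


def sample_configuration(space, rng):
    """Draw one pipeline configuration; inactive modules are stored as None."""
    config = {}
    for module, hps in space.items():
        active = rng.choice(hps["activator"]) if "activator" in hps else True
        config[module] = rng.choice(hps["selector"]) if active else None
    return config


rng = random.Random(0)
configs = [sample_configuration(SEARCH_SPACE, rng) for _ in range(3)]
for config in configs:
    print(config)
```

Because the selector is only drawn when the activator is on, the space is conditional, which is one reason the number of distinct pipelines grows so quickly.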

The pipeline search space is combinatorially large, and the workflow search is cast as:

$$\lambda^*_C = \arg\min_{\lambda_C\in\Delta_C} \frac{1}{k_{\rm train}} \sum_{i=1}^{k_{\rm train}} \mathcal{L}\left(\text{train}=D^{(i)}_{\rm train}(\lambda_C),\ \text{valid}=D^{(i)}_{\rm valid}(\lambda_C)\right)$$

where $\lambda_C$ encodes algorithm and hyperparameter configurations (Starmans et al., 2021).
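To make the objective concrete, the sketch below averages a validation loss over k train/validation splits and keeps the configuration with the lowest average, using plain random search. Everything here is an illustrative assumption: the toy one-feature threshold "pipeline", the hyperparameter names `direction` and `quantile`, and the synthetic data are not taken from any of the cited frameworks.

```python
import random


def fit(config, train):
    """Toy 'pipeline': choose a decision threshold at a quantile of the training feature."""
    xs = sorted(x for x, _ in train)
    idx = min(int(config["quantile"] * len(xs)), len(xs) - 1)
    return xs[idx]


def loss(config, threshold, valid):
    """Misclassification rate on the validation split."""
    errors = 0
    for x, y in valid:
        pred = 1 if config["direction"] * (x - threshold) > 0 else 0
        errors += pred != y
    return errors / len(valid)


def cv_objective(config, data, k):
    """Mean validation loss over k splits: the quantity inside the arg min."""
    folds = [data[i::k] for i in range(k)]
    total = 0.0
    for i in range(k):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        total += loss(config, fit(config, train), folds[i])
    return total / k


def random_search(data, k, n_iter, rng):
    """Approximate the arg min by sampling configurations at random."""
    best_config, best_loss = None, float("inf")
    for _ in range(n_iter):
        config = {"direction": rng.choice([-1, 1]), "quantile": rng.uniform(0.05, 0.95)}
        value = cv_objective(config, data, k)
        if value < best_loss:
            best_config, best_loss = config, value
    return best_config, best_loss


rng = random.Random(42)
# Synthetic data: class 1 has a higher feature value on average.
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(60)]
data += [(rng.gauss(2.0, 1.0), 1) for _ in range(60)]
rng.shuffle(data)

best_config, best_loss = random_search(data, k=5, n_iter=50, rng=rng)
print(best_config, round(best_loss, 3))
```

A Bayesian or evolutionary optimizer would replace only `random_search`; `cv_objective` stays the same.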

Optimization is performed via random search, Bayesian optimization (SMAC-derived), or evolutionary strategies. Pipeline ensembling (e.g., Top-N, forward-selection) further stabilizes predictions (Starmans et al., 2021).
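Top-N ensembling can be sketched as averaging the predictions of the N pipelines with the lowest validation loss. The function name `top_n_ensemble` and the candidate values are made up for illustration, and unweighted averaging is an assumption; the frameworks may combine pipelines differently.

```python
def top_n_ensemble(candidates, n):
    """Average predicted probabilities of the n candidates with the lowest validation loss.

    candidates: list of (validation_loss, predicted_probabilities) tuples,
    where every probability list covers the same samples in the same order.
    """
    ranked = sorted(candidates, key=lambda c: c[0])[:n]
    n_samples = len(ranked[0][1])
    return [sum(probs[i] for _, probs in ranked) / len(ranked) for i in range(n_samples)]


# Illustrative values only: three pipelines scored on three samples.
candidates = [
    (0.20, [0.9, 0.2, 0.6]),
    (0.35, [0.1, 0.9, 0.5]),  # worst validation loss, excluded when n=2
    (0.25, [0.7, 0.4, 0.8]),
]
ensemble = top_n_ensemble(candidates, n=2)
print(ensemble)
```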

3. Integration of Radiomics-Specific Preprocessing and Feature Extraction

Radiomics-specific frameworks go beyond standard tabular AutoML by integrating domain-driven feature extraction and preprocessing steps such as:

  • PyRadiomics-backed extraction of first-order, shape, and high-dimensional texture features (GLCM, GLRLM, GLSZM, GLDM, NGTDM, wavelet, LoG, LBP, vesselness, monogenic phases) (Chang et al., 2020, Tzanis et al., 30 Apr 2025, Starmans et al., 2021).
  • Automated image resampling, normalization, and discretization tailored to ROI masks and modality-specific constraints (Tzanis et al., 30 Apr 2025).
  • Filtering and harmonization modules to address intensity variation, batch effects (ComBat), and inter-center heterogeneity, although most frameworks do not yet offer fully automated harmonization (Lozano-Montoya et al., 13 Jan 2026).
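Two of the preprocessing steps listed above can be sketched in a few lines. The first function is a fixed-bin-width gray-level discretization of the kind commonly used before texture extraction. The second is a deliberately simplified per-center standardization and is not ComBat, which additionally uses empirical Bayes estimation and can preserve covariate effects. Function names and sample values are hypothetical.

```python
import math
import statistics


def discretize_fixed_bin_width(intensities, bin_width):
    """Map ROI intensities to integer gray levels starting at 1, using a fixed bin width."""
    offset = math.floor(min(intensities) / bin_width)
    return [math.floor(v / bin_width) - offset + 1 for v in intensities]


def standardize_per_center(values, centers):
    """Z-score one feature within each center (a crude location-scale alignment, not ComBat)."""
    by_center = {}
    for v, c in zip(values, centers):
        by_center.setdefault(c, []).append(v)
    stats = {c: (statistics.mean(vs), statistics.pstdev(vs)) for c, vs in by_center.items()}
    out = []
    for v, c in zip(values, centers):
        mean, sd = stats[c]
        out.append((v - mean) / sd if sd > 0 else 0.0)
    return out


levels = discretize_fixed_bin_width([0, 24, 25, 49, 50], bin_width=25)
print(levels)

# The same feature measured at two centers with a systematic offset.
harmonized = standardize_per_center([1, 2, 3, 11, 12, 13], ["A", "A", "A", "B", "B", "B"])
print([round(v, 3) for v in harmonized])
```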

Deep-discovery radiomics frameworks, such as the StochasticNet and Evolutionary Deep Radiomic Sequencer (EDRS), use random-graph CNNs or evolutionary architecture search to learn sparse, compact feature compositions, circumventing hand-picking of feature types (Shafiee et al., 2015, Shafiee et al., 2017). For instance, StochasticNet radiomic sequencers generate receptive field masks via Bernoulli random sampling and learn over an ensemble of subnetworks sampled from the Gilbert random graph model, eliminating manual feature engineering (Shafiee et al., 2015).
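The Bernoulli sampling of receptive-field masks can be illustrated as follows. This is only a sketch of the sampling idea under assumed settings (a 5x5 kernel and a keep probability of 0.5), not the StochasticNet authors' implementation; `sample_mask`, `apply_mask`, and `keep_prob` are hypothetical names.

```python
import random


def sample_mask(height, width, keep_prob, rng):
    """Each connection in the receptive field is kept independently with probability keep_prob."""
    return [[1 if rng.random() < keep_prob else 0 for _ in range(width)] for _ in range(height)]


def apply_mask(kernel, mask):
    """Zero out the kernel weights whose connection was not sampled."""
    return [[w * m for w, m in zip(krow, mrow)] for krow, mrow in zip(kernel, mask)]


rng = random.Random(7)
kernel = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(5)]
mask = sample_mask(5, 5, keep_prob=0.5, rng=rng)
sparse_kernel = apply_mask(kernel, mask)

kept = sum(sum(row) for row in mask)
print(f"{kept} of 25 connections kept")
```

In a trained network the mask would be fixed at construction time, so only the surviving weights are learned.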

4. AutoML Search Strategies and Model Selection

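Workflow and hyperparameter optimization strategies reported for these frameworks include random search, Bayesian optimization (SMAC-derived), and evolutionary strategies (Starmans et al., 2021), as well as grid and Hyperband search (Chang et al., 2020).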

Model selection and validation are typically performed via nested or repeated cross-validation (e.g., stratified 5-fold or 10-fold CV), with internal and external test protocols (Starmans et al., 2021, Tzanis et al., 30 Apr 2025). Ensemble construction (Top-N, bagging, forward selection) consistently improves out-of-sample metrics (Starmans et al., 2021).
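The nested protocol can be sketched with stratified folds: an outer loop holds out a test fold, and an inner loop splits the remaining samples for model selection. This is a minimal standard-library illustration; the function names are hypothetical, and in practice a library routine such as scikit-learn's `StratifiedKFold` would usually be used.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, rng):
    """Assign sample indices to k folds while keeping class proportions roughly equal."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def nested_splits(labels, k_outer, k_inner, rng):
    """Yield (development indices, test indices, inner folds over the development set)."""
    for test in stratified_folds(labels, k_outer, rng):
        held_out = set(test)
        dev = [i for i in range(len(labels)) if i not in held_out]
        inner = stratified_folds([labels[i] for i in dev], k_inner, rng)
        # Inner folds index into dev; map them back to original sample indices.
        inner = [[dev[p] for p in fold] for fold in inner]
        yield dev, test, inner


labels = [0] * 30 + [1] * 20
splits = list(nested_splits(labels, k_outer=5, k_inner=5, rng=random.Random(0)))
for dev, test, inner in splits:
    print(len(dev), len(test), [len(fold) for fold in inner])
```

Hyperparameters are tuned only on the inner folds, so the outer test fold never influences model selection.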

5. Notable Frameworks and Comparative Evaluation

Several radiomics-specific AutoML frameworks have been proposed and evaluated:

  • WORC: Modular workflow optimizer employing random/Bayesian search over pipeline components, validated on twelve clinical applications with AUC and F1 competitive with or superior to baselines and clinical experts. Code and datasets are publicly available (Starmans et al., 2021).
  • Simplatab: No-code GUI system supporting stability selection, recursive feature elimination, and basic bias/vulnerability analysis. Achieved a statistically significantly higher mean AUC (81.81%) than general-purpose AutoML in an independent benchmark on ten datasets (Lozano-Montoya et al., 13 Jan 2026).
  • mAIstro: Multi-agent system integrating image segmentation (nnU-Net, TotalSegmentator), radiomics (PyRadiomics), and AutoML (PyCaret), coordinated via a natural language interface. Validated across 16 datasets; unique in agentic, NL-driven orchestration (Tzanis et al., 30 Apr 2025).
  • DARWIN: Web-based GUI allowing custom pipeline assembly and AutoML search in both classic radiomics (PyRadiomics) and deep learning stacks. Utilizes grid/random/Hyperband search, enables modular graph-based definition of experimental workflows (Chang et al., 2020).
  • AutoRadiomics, Auto-ML for Radiomics, AutoPrognosis: Older frameworks, some now obsolete, that either offer limited or no radiomics-specific modules or prove infeasible to run on contemporary high-dimensional datasets (Lozano-Montoya et al., 13 Jan 2026).

A selection of quantitative results from Simplatab (Lozano-Montoya et al., 13 Jan 2026):

| Dataset | AUC (%) ± SD | Runtime (5-fold CV) |
|---|---|---|
| Desmoid | 95.0 ± 3.8 | ~1 h |
| Liver | 96.4 ± 2.3 | ~1 h |
| Lipo | 87.7 ± 5.5 | ~1 h |
| GIST | 82.3 ± 4.4 | ~1 h |
| Prostate | 73.3 ± 5.8 | ~1 h |
| Mean (all ten benchmark datasets) | 81.81 ± 4.4 | ~1 h |

6. Challenges, Limitations, and Future Directions

Radiomics-specific AutoML frameworks remain an active research area with several outstanding challenges (Shafiee et al., 2015, Lozano-Montoya et al., 13 Jan 2026, Starmans et al., 2021, Tzanis et al., 30 Apr 2025):

  • Survival Analysis: Only a minority of frameworks (e.g., AutoPrognosis) support time-to-event modeling, and such modules are often computationally prohibitive when combined with high-dimensional radiomics.
  • Feature Reproducibility and Harmonization: Automated assessment of test–retest reliability, IBSI compliance, and harmonization (e.g., ComBat, domain adaptation) remains largely absent, which limits generalizability to multisite/multicenter data (Lozano-Montoya et al., 13 Jan 2026).
  • End-to-End Integration: Most solutions do not couple image preprocessing, segmentation, harmonization, feature extraction, and modeling in a single pipeline. Modular, federated learning-compatible designs are needed for robust multicenter studies.
  • Interpretability: While some frameworks employ SHAP or LIME for feature importance, deeper clinical explainability and meta-parameter transparency are limited (Lozano-Montoya et al., 13 Jan 2026, Chang et al., 2020, Tzanis et al., 30 Apr 2025).
  • Scalability: Deep-discovery radiomics and agentic orchestration systems (e.g., mAIstro) incur substantial compute costs for large 3D imaging or comprehensive AutoML searches (Tzanis et al., 30 Apr 2025).
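As an illustration of what an automated reproducibility filter could look like, the sketch below computes a one-way random-effects intraclass correlation coefficient, ICC(1,1), from test-retest measurements and keeps only the features above a cutoff. The 0.75 cutoff, the feature names, and the values are assumptions for illustration; none of the cited frameworks is claimed to implement this exact filter.

```python
def icc_one_way(measurements):
    """ICC(1,1) for a list of per-subject lists, each holding k repeated measurements."""
    n = len(measurements)
    k = len(measurements[0])
    grand = sum(sum(row) for row in measurements) / (n * k)
    means = [sum(row) / k for row in measurements]
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(measurements, means) for x in row
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    # A constant feature has an undefined ICC; it is treated as non-reproducible here (an assumption).
    return (ms_between - ms_within) / denominator if denominator > 0 else 0.0


def reproducible_features(test_retest, threshold=0.75):
    """Keep feature names whose test-retest ICC meets the (assumed) threshold."""
    return [name for name, rows in test_retest.items() if icc_one_way(rows) >= threshold]


# Hypothetical test-retest values: three subjects, two scans each.
test_retest = {
    "stable_feature": [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],
    "unstable_feature": [[1.0, 3.0], [3.0, 1.0], [2.0, 2.0]],
}
kept = reproducible_features(test_retest)
print(kept)
```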

Proposed future directions include embedding survival-analysis algorithms with efficient search, harmonization/reproducibility filtering, GUI-exposed meta-parameters for advanced users, and native support for federated learning (Lozano-Montoya et al., 13 Jan 2026, Shafiee et al., 2015, Tzanis et al., 30 Apr 2025).

7. Clinical Validation and Deployment Implications

Clinical validation across a spectrum of anatomical sites, imaging modalities, and endpoints has demonstrated the potential for radiomics-specific AutoML to outperform human experts and classic radiomics pipelines in several tasks (e.g., EDRS reporting 93.42% sensitivity and 88.78% accuracy for lung cancer detection, exceeding prior methods) (Shafiee et al., 2017, Starmans et al., 2021).

Frameworks such as EDRS are specifically designed for privacy-preserving, on-site deployment, enabling all computation at the institution level with compact, efficient models, and eliminating the need for PHI transfer to third-party servers (Shafiee et al., 2017). Modular open-source frameworks (WORC, mAIstro, DARWIN) promote reproducibility via audit-trails, configuration logging, and public datasets/code repositories (Starmans et al., 2021, Chang et al., 2020, Tzanis et al., 30 Apr 2025).
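Configuration logging for an audit trail can be as simple as storing each pipeline configuration together with a hash of its canonical form. The record layout and the names `config_fingerprint` and `audit_record` are hypothetical and do not reflect the logging format of WORC, mAIstro, or DARWIN; the metric value is a placeholder.

```python
import hashlib
import json


def config_fingerprint(config):
    """SHA-256 of the configuration serialized with sorted keys, so key order does not matter."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit_record(config, seed, metrics):
    """Bundle everything needed to identify and re-run an experiment."""
    return {
        "config": config,
        "config_sha256": config_fingerprint(config),
        "seed": seed,
        "metrics": metrics,
    }


config = {"feature_selection": "LASSO", "resampling": None, "classification": "SVM"}
record = audit_record(config, seed=0, metrics={"auc": 0.0})  # placeholder metric value
print(json.dumps(record, indent=2, sort_keys=True))
```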

A plausible implication is that as reproducible workflow optimization, harmonization, and explainability modules mature within these frameworks, radiomics-specific AutoML will become the standard for robust, scalable, and interpretable quantitative imaging biomarker research in multi-center translational and clinical contexts.
