- The paper introduces SCOPE-PD, an explainable ML framework that integrates subjective and objective clinical measurements for early PD diagnosis.
- It employs a Random Forest model with SHAP explainability, achieving over 98% accuracy in validation on the PPMI baseline cohort.
- The study emphasizes actionable, patient-specific insights to support precision clinical decision-making and future research.
SCOPE-PD: Explainable ML for Precision Parkinson’s Disease Diagnosis
Introduction
The paper “SCOPE-PD: Explainable AI on Subjective and Clinical Objective Measurements of Parkinson's Disease for Precision Decision-Making” (2601.22516) presents a machine learning workflow integrating both subjective (patient-reported) and objective (expert-assessed) clinical assessment data to enhance early Parkinson's disease (PD) identification. The framework emphasizes individualized, explainable predictions with direct relevance to clinical decision-making. Recognizing that most prior work has either relied on a single data modality or sacrificed interpretability to black-box models, this study tightly couples high predictive performance with XAI-enabled interpretability, using the SHAP methodology to foster clinical trust, translational transparency, and actionable insights at both the population and individual level.
Dataset and Preprocessing
The foundation is the baseline dataset from the PPMI (Parkinson’s Progression Markers Initiative), leveraging 148 features (79 subjective, 69 objective) from 1,786 subjects. Subjective measures derive from instruments such as MDS-UPDRS I–II, GDS, SCOPA-AUT, and others. Objective measures include MDS-UPDRS III, MoCA, BJLO, HVLT, and additional established neuropsychological tests. Rigorous preprocessing included elimination of data with excessive missingness, imputation-free feature set construction, alignment of scoring directionality, and normalization by min-max scaling to [0,1] per feature.
Three datasets were constructed: subjective only, objective only, and combined, enabling robust model comparison along clinically meaningful data type axes.
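The preprocessing and dataset-construction steps above can be sketched as follows. This is a minimal illustration on a synthetic table: the column names, feature counts, and data values are hypothetical stand-ins for the real PPMI variables (e.g., MDS-UPDRS item codes), not the paper's actual schema.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the PPMI baseline table; real feature
# names and value ranges differ.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "np2_tremor": rng.integers(0, 5, 100).astype(float),    # subjective item
    "np3_brady": rng.integers(0, 5, 100).astype(float),     # objective item
    "moca_total": rng.integers(10, 31, 100).astype(float),  # objective item
})
df.iloc[0, 0] = np.nan  # simulate a record with missing data

subjective_cols = ["np2_tremor"]
objective_cols = ["np3_brady", "moca_total"]

# 1) Drop records with any missing value (imputation-free, per the paper).
df = df.dropna(axis=0)

# 2) Min-max scale each feature independently to [0, 1].
scaled = (df - df.min()) / (df.max() - df.min())

# 3) Build the three analysis datasets for model comparison.
datasets = {
    "subjective": scaled[subjective_cols],
    "objective": scaled[objective_cols],
    "combined": scaled[subjective_cols + objective_cols],
}
```

Keeping the three feature subsets as views of one scaled table guarantees that the subjective-only, objective-only, and combined models see identically preprocessed records.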
Learning and Model Selection
Five supervised algorithms were benchmarked: logistic regression, SVM (RBF), KNN, Random Forest (RF), and XGBoost. Class imbalance was handled via class-wise weighting or, for XGBoost, scale_pos_weight. Model selection and hyperparameter tuning used nested 5-fold stratified cross-validation with an 80/20 train/test split, optimizing for F1 score. All records with missing data were excluded to maintain integrity and comparability.
The highest results were achieved with the Random Forest model on the combined feature set:
- Accuracy: 98.66% (±0.91%)
- F1-score: 0.9917 (±0.0055)
- ROC-AUC: 0.9992 (±0.0006)
- PR-AUC: 0.9998 (±0.0001)
When using only subjective or only objective features, the RF model still achieved nearly equivalent performance (96.55–97.85% accuracy). These metrics were generated using strict cross-validation without synthetic data augmentation, reducing the risk of optimistic bias.
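The four reported metrics map directly onto standard scikit-learn functions; the labels and scores below are toy values for one hypothetical fold, not results from the paper. Note that thresholded metrics (accuracy, F1) take hard predictions, while the AUC metrics rank the predicted probabilities, and PR-AUC is computed here as average precision.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, average_precision_score,
                             f1_score, roc_auc_score)

# Toy labels and predicted probabilities for one hypothetical fold.
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 1])
y_prob = np.array([0.1, 0.4, 0.8, 0.9, 0.7, 0.2, 0.6, 0.95])
y_pred = (y_prob >= 0.5).astype(int)  # hard labels at a 0.5 threshold

print("accuracy:", accuracy_score(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))
print("ROC-AUC:", roc_auc_score(y_true, y_prob))           # ranking quality
print("PR-AUC:", average_precision_score(y_true, y_prob))  # average precision
```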
Explainability and Feature Attribution
Interpretability was delivered through SHAP TreeExplainer for tree ensemble models. Both local (individual-level) and global (cohort-level) feature attributions were calculated, quantifying the additive effect of each feature on PD prediction probability. The SHAP paradigm yields precise, cohort- and subject-specific statements such as “self-reported tremor increases PD probability by +0.05”, allowing nuanced clinician-facing decision support.
Globally, the top discriminative features were:
- Combined dataset: self-reported tremor (NP2TRMR), observed bradykinesia (NP3BRADY), and observed facial expression (NP3FACXP) dominated feature importance.
- Subjective dataset: Nine of ten top features were from MDS-UPDRS II (activities of daily living); the most significant was self-reported tremor.
- Objective dataset: MDS-UPDRS III features were exclusively top-ranked, with emphasis on bradykinesia-related markers.
SHAP analysis strongly confirmed that established PD biomarkers, especially those relating to tremor and bradykinetic impairment, retain overwhelming predictive value even alongside complex multimodal data. The framework allows quantification of local feature contributions, thus facilitating individualized risk stratification.
Results in the Context of Prior Work
The reported accuracy (98.66% RF, combined) surpasses previously published results that employed single-modality datasets (e.g., sensor or voice only [81.7–92.6%]) and matches or exceeds prior explainable multimodal approaches (DenseNet + clinical data: 96.6–96.8% accuracy [see Dentamaro et al., Priyadharshini et al.]). The strong performance is attributed not only to multimodal integration, but to careful curation and preprocessing of high-fidelity, harmonized baseline assessments, and, crucially, robust post hoc XAI evaluation.
Implications and Future Directions
Clinical and Research Implications
- Precision Clinical Support: The methodology enables patient-specific explanations supporting risk communication, intervention targeting, and differential diagnostics at early disease stages, all within the bounds of interpretable AI.
- Feature-Level Insights: Quantitative feature attributions aid in dissecting the contributions of subjective and objective components, fostering understanding of “what drives” diagnosis in each patient, supporting further subclassification and treatment stratification.
- Validation and Generalizability: The study recognizes the need for external validation on independent multi-site cohorts, as models tuned exclusively on PPMI risk overfitting to site-specific or population-level idiosyncrasies.
Theoretical Advances and Future AI Development
- XAI-Driven Translational ML: This work pushes beyond simple model performance metrics, requiring that high-accuracy AI be accompanied by transparent, quantitative feature attribution frameworks. It demonstrates a practical pipeline for integrating tabular multimodal clinical data with XAI in precision medicine.
- Multimodal Expansion: The authors propose subsequent extensions to include genetic data (e.g., GBA, SNCA), neuroimaging (MRI/fMRI), and longitudinal/progressional modeling. This trajectory will support granular endophenotype discovery and improved prognostication.
- Benchmarking for Trustworthy AI: The approach highlights the criticality of nested validation, robust handling of imbalance, and feature alignment in medical ML, setting a methodological precedent for future disease-oriented decision support systems.
Conclusion
This paper develops a unified, explainable AI framework—SCOPE-PD—for early, accurate PD diagnosis from curated subjective and objective clinical data. By employing both high-performing ensemble models and rigorous SHAP-based interpretability analyses, the study delivers accuracies above 98% with actionable, patient-specific explanations, demonstrating that multimodal, explainable ML models can make PD risk assessment both precise and clinically interpretable. The results call for continued research on generalizability, broader phenotype integration, and real-world clinical deployment, providing a blueprint for XAI in neurodegenerative disease management (2601.22516).