Explainable AI for Parkinson’s disease prediction: A machine learning approach with interpretable models

IF 3; Zone 4 (Medicine); JCR Q2, MEDICINE, RESEARCH & EXPERIMENTAL

Current Research in Translational Medicine. Pub Date: 2025-09-05. DOI: 10.1016/j.retram.2025.103541

Adebimpe O. Esan, David B. Olawade, Afeez A. Soladoye, Bolaji A. Omodunbi, Ibrahim A. Adeyanju, Nicholas Aderinto

Current Research in Translational Medicine, vol. 73, no. 4, Article 103541. https://www.sciencedirect.com/science/article/pii/S2452318625000509

Citations: 0

Abstract

Background

Parkinson’s Disease (PD) is a chronic, progressive neurological disorder with significant clinical and economic impacts globally. Early and accurate prediction remains challenging with traditional diagnostic methods due to subjectivity, delayed diagnosis, and variability. Machine Learning (ML) approaches offer potential solutions, yet their clinical adoption is hindered by limited interpretability. This study aimed to develop an interpretable ML model for early and accurate PD prediction using comprehensive multimodal datasets and Explainable Artificial Intelligence (XAI) techniques.

Methods

The study applied five ML algorithms, namely Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Logistic Regression (LR), Random Forest (RF), and XGBoost, together with a stacked ensemble method, to a publicly available Kaggle dataset (n = 2105). The data encompassed demographics, medical history, lifestyle, clinical symptoms, and cognitive and functional assessments, with specific inclusion/exclusion criteria applied. Preprocessing involved normalization, the Synthetic Minority Oversampling Technique (SMOTE), and Sequential Backward Elimination (SBE) for feature selection. Model performance was evaluated via accuracy, precision, recall, F1-score, and Area Under the Curve (AUC). The best-performing model (RF with feature selection) was interpreted using SHAP and LIME.
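The oversampling step mentioned above can be illustrated with a minimal, standard-library-only sketch of the idea behind SMOTE: each synthetic minority sample is placed at a random point on the line segment between a real minority sample and one of its nearest minority neighbours. This is not the authors' code. The function name `smote_like_oversample`, the toy data, and the parameter values are assumptions made for illustration, and a real pipeline would more likely rely on an established implementation. The abstract does not say at which stage oversampling was applied; a common practice is to oversample only the training split so that synthetic points do not leak into evaluation.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate neighbours: every other minority sample, nearest first.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature minority class (values are made up).
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
new_points = smote_like_oversample(minority, n_new=5)
print(len(new_points), "synthetic samples generated")
```

Because every synthetic point is a convex combination of two real samples, it always lies inside the range already spanned by the minority class, which is why the technique adds density without extrapolating beyond the observed data.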

Results

Random Forest combined with backward elimination feature selection achieved the highest scores: accuracy 93%, precision 93%, recall 93%, F1-score 93%, and AUC 0.97. SHAP and LIME analyses identified UPDRS scores, cognitive impairment, functional assessment, and motor symptoms as the primary predictors, which supports clinical interpretability.
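To make the SHAP attributions above more concrete, the following sketch computes exact Shapley values for a single prediction by enumerating every feature coalition. It is only an illustration under stated assumptions: `toy_score` is a hypothetical scoring function, not the study's Random Forest, the feature names and values are invented, and "absent" features are simply replaced by a baseline value. Practical SHAP tooling uses faster model-specific or sampling-based approximations rather than brute-force enumeration, and LIME (which fits a local surrogate model) is not shown here.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance. Features outside a coalition
    take their baseline value; cost grows as 2**n, so toy sizes only."""
    n = len(instance)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_score(x):
    """Hypothetical risk score with one interaction term (assumption)."""
    updrs, cognitive, tremor = x
    return 0.5 * updrs + 0.3 * cognitive + 0.2 * updrs * tremor


instance = [2.0, 1.0, 1.0]   # made-up patient features
baseline = [0.0, 0.0, 0.0]   # made-up reference point
phi = shapley_values(toy_score, instance, baseline)
for name, contribution in zip(["updrs", "cognitive", "tremor"], phi):
    print(f"{name}: {contribution:.3f}")
```

The attributions sum to the difference between the prediction for the instance and the prediction at the baseline, and the interaction term is split evenly between the two features involved. That additive property is what allows a per-patient explanation to be read as "how much each feature moved the score".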

Conclusion

The study demonstrated the effectiveness of an interpretable RF model for accurate PD prediction. Integration of ML and XAI significantly improves clinical decision-making, diagnosis timing, and personalized patient care.

Source journal

Current Research in Translational Medicine (Biochemistry, Genetics and Molecular Biology: General Biochemistry, Genetics and Molecular Biology)

CiteScore

7.00

Self-citation rate

4.90%

Articles published

Review time

45 days

About the journal: Current Research in Translational Medicine is a peer-reviewed journal, publishing worldwide clinical and basic research in the field of hematology, immunology, infectiology, hematopoietic cell transplantation, and cellular and gene therapy. The journal considers for publication English-language editorials, original articles, reviews, and short reports including case reports. Contributions are intended to draw attention to experimental medicine and translational research. Current Research in Translational Medicine periodically publishes thematic issues and is indexed in all major international databases (2017 Impact Factor is 1.9). Core areas covered in Current Research in Translational Medicine are: Hematology, Immunology, Infectiology, Hematopoietic Cell Transplantation, Cellular and Gene Therapy.