Explainable AI for Parkinson’s disease prediction: A machine learning approach with interpretable models
Adebimpe O. Esan, David B. Olawade, Afeez A. Soladoye, Bolaji A. Omodunbi, Ibrahim A. Adeyanju, Nicholas Aderinto
Current Research in Translational Medicine, volume 73, issue 4, Article 103541. Published 2025-09-05.
DOI: 10.1016/j.retram.2025.103541
https://www.sciencedirect.com/science/article/pii/S2452318625000509
Abstract
Background
Parkinson’s Disease (PD) is a chronic, progressive neurological disorder with significant clinical and economic impacts globally. Early and accurate prediction remains challenging with traditional diagnostic methods due to subjectivity, delayed diagnosis, and variability. Machine Learning (ML) approaches offer potential solutions, yet their clinical adoption is hindered by limited interpretability. This study aimed to develop an interpretable ML model for early and accurate PD prediction using comprehensive multimodal datasets and Explainable Artificial Intelligence (XAI) techniques.
Methods
The study applied five ML algorithms, namely Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Logistic Regression (LR), Random Forest (RF), and XGBoost, together with a stacked ensemble method, to a publicly available dataset (n = 2105) from Kaggle. The data encompassed demographic, medical history, lifestyle, clinical symptom, cognitive, and functional assessment variables, with specific inclusion/exclusion criteria applied. Preprocessing involved normalization, the Synthetic Minority Oversampling Technique (SMOTE) for class balancing, and Sequential Backward Elimination (SBE) for feature selection. Model performance was evaluated via accuracy, precision, recall, F1-score, and Area Under the Curve (AUC). The best-performing model (RF with feature selection) was interpreted using the SHAP and LIME methods.
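The backward-elimination step named above can be sketched in a few lines. This is a minimal illustration of the general greedy technique, not the authors' implementation: the feature names, the weights, and the `toy_score` function are hypothetical stand-ins. In practice the score would typically be something like the cross-validated accuracy of a classifier trained on the candidate feature subset.

```python
from typing import Callable, List, Sequence


def backward_elimination(
    features: Sequence[str],
    score: Callable[[List[str]], float],
    min_features: int = 1,
) -> List[str]:
    """Greedy backward elimination.

    Start from all features and repeatedly drop the one whose removal gives
    the best score. Stop when every possible removal would lower the score
    or when only `min_features` remain.
    """
    selected = list(features)
    best = score(selected)
    while len(selected) > min_features:
        # Score each subset obtained by removing exactly one feature.
        trials = [
            (score([f for f in selected if f != drop]), drop)
            for drop in selected
        ]
        trial_score, drop = max(trials, key=lambda t: t[0])
        if trial_score < best:
            break  # every removal hurts, so keep the current subset
        selected.remove(drop)
        best = trial_score
    return selected


# Hypothetical feature names and usefulness weights, for illustration only.
TOY_WEIGHTS = {
    "updrs_total": 5,
    "cognitive_score": 3,
    "functional_score": 2,
    "noise_a": 0,
    "noise_b": 0,
}


def toy_score(subset: List[str]) -> float:
    # Reward informative features, with a small penalty per feature kept.
    return 10 * sum(TOY_WEIGHTS[f] for f in subset) - len(subset)


kept = backward_elimination(list(TOY_WEIGHTS), toy_score)
print(kept)
```

With this toy score the two uninformative features are removed and the three informative ones are kept. Swapping `toy_score` for a model-based score changes only the callable, not the search loop.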
Results
Random Forest combined with backward elimination feature selection achieved the highest predictive accuracy (93%), precision (93%), recall (93%), F1-score (93%), and AUC (0.97). SHAP and LIME analyses identified Unified Parkinson’s Disease Rating Scale (UPDRS) scores, cognitive impairment, functional assessment, and motor symptoms as the primary predictors, enhancing clinical interpretability.
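For readers unfamiliar with the reported metrics, the sketch below shows how accuracy, precision, recall, F1-score, and AUC are computed for a binary classifier. The labels and scores are made-up toy values, not data from the study, and the abstract does not state how the authors averaged metrics across classes, so this covers only the plain positive-class definitions.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for labels in {0, 1} (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: four positives, four negatives, threshold at 0.5.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

In the toy data one positive is missed and one negative is flagged, so accuracy, precision, recall, and F1 all equal 0.75, while the AUC is 15/16 because only one positive-negative pair is ranked in the wrong order. AUC depends on the ranking of scores rather than on the 0.5 threshold, which is why it can differ from the thresholded metrics.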
Conclusion
The study demonstrated the effectiveness of an interpretable RF model for accurate PD prediction. Integration of ML and XAI significantly improves clinical decision-making, diagnosis timing, and personalized patient care.
About the journal:
Current Research in Translational Medicine is a peer-reviewed journal publishing worldwide clinical and basic research in the fields of hematology, immunology, infectiology, hematopoietic cell transplantation, and cellular and gene therapy. The journal considers for publication English-language editorials, original articles, reviews, and short reports, including case reports. Contributions are intended to draw attention to experimental medicine and translational research. Current Research in Translational Medicine periodically publishes thematic issues and is indexed in all major international databases (2017 Impact Factor is 1.9).
Core areas covered in Current Research in Translational Medicine are:
Hematology,
Immunology,
Infectiology,
Hematopoietic Cell Transplantation,
Cellular and Gene Therapy.