{"authors":"Aravalli Sainath Chaithanya, Nadipudi Kiran Kumar, Gugulothu Venkatesh Prasad, Bejawada Keerthana","doi":"10.4258/hir.2025.31.3.274","journal":{"name":"Healthcare Informatics Research","volume":"31 3","pages":"274-283"},"PeriodicalIF":2.1,"publicationDate":"2025-07-01","EPubDate":"2025/7/31 0:00:00","PubModel":"Epub","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12370421/pdf/","ListUrlMain":"https://doi.org/10.4258/hir.2025.31.3.274","JCR":"Q3","JCRName":"MEDICAL INFORMATICS","platform":"Semanticscholar"}
Advancements in Parkinson's Disease Prediction Using Machine Learning: A Neurological Perspective.
Objectives: This study aims to predict the severity of Parkinson's disease (PD) by leveraging a dataset that integrates cerebrospinal fluid protein and peptide data sourced from UniProt, normalized protein expression metrics, clinical assessments, and gait data. The dataset comprised 248 PD patients monitored longitudinally, with periodic evaluations covering 227 proteins, 971 peptides, gait parameters, and Movement Disorder Society-sponsored revision of the Unified Parkinson's Disease Rating Scale (MDS-UPDRS) scores at baseline (month 0) and at 6, 12, and 24 months.
Methods: A machine learning framework was employed, consisting of random forest, TensorFlow Decision Forests, and a custom-developed phase-shift ensembling model. In addition, regression techniques (linear regression, random forest regressor, decision tree regressor, and K-nearest neighbors) were used to support the predictions. All models aimed to forecast PD severity as reflected by UPDRS scores.
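As an illustration of the simplest of the regression techniques listed above, the following is a minimal sketch of K-nearest-neighbors regression in plain Python. It is not the authors' implementation: the function name `knn_predict`, the two-feature layout (visit month, protein level), and the toy values are all hypothetical, chosen only to show how a score is predicted as the mean of the nearest training targets.

```python
import math

def knn_predict(train_X, train_y, query, k=3):
    """Predict a score as the mean target of the k nearest training rows."""
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training rows")
    # Rank training rows by Euclidean distance to the query vector.
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k

# Hypothetical toy data: [visit_month, protein_level] -> UPDRS-like score.
train_X = [[0, 1.0], [6, 1.1], [12, 1.3], [24, 1.8]]
train_y = [10.0, 12.0, 15.0, 21.0]

pred = knn_predict(train_X, train_y, [6, 1.1], k=2)
print(pred)
```

In practice the features would need scaling before distances are computed, since visit month and expression values sit on very different ranges; the abstract does not describe how this was handled.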
Results: The custom phase-shift ensembling model demonstrated superior predictive performance, achieving an average symmetric mean absolute percentage error (sMAPE) of 55 across all UPDRS sections. Notably, the random forest regressor excelled in predicting motor function severity (UPDRS-III), attaining an sMAPE of 77.32, indicating its ability to model complex disease progression dynamics effectively.
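For readers unfamiliar with the metric, sMAPE is an error measure, so lower values indicate better predictions. The abstract does not state which sMAPE variant was used; the sketch below assumes the common form, 100/n times the sum of |actual - predicted| / ((|actual| + |predicted|) / 2), which ranges from 0 to 200. The function name `smape` and the convention of counting a 0/0 pair as zero error are assumptions made for illustration.

```python
def smape(actual, predicted):
    """Symmetric MAPE in percent (0 to 200); a 0/0 pair counts as zero error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2
        # Assumed convention: both values zero means a perfect prediction.
        total += 0.0 if denom == 0 else abs(a - p) / denom
    return 100.0 * total / len(actual)

# Hypothetical scores, for illustration only.
perfect = smape([10, 20], [10, 20])
off = smape([10], [30])
print(perfect, off)
```

Under this definition a prediction of 30 against a true score of 10 yields an sMAPE of 100, which gives a sense of scale for the values reported above.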
Conclusions: Integrating biological markers, clinical scores, and gait dynamics facilitates accurate modeling of PD progression. The ensemble-based approach, particularly phase-shift ensembling, improves prediction robustness and interpretability, offering a powerful strategy for the early prediction of PD severity. This study highlights the value of multi-source data fusion and advanced machine learning techniques in supporting early diagnosis and informed treatment planning for neurodegenerative diseases.