Advancements in Parkinson's Disease Prediction Using Machine Learning: A Neurological Perspective.

IF 2.1 Q3 MEDICAL INFORMATICS

Healthcare Informatics Research Pub Date : 2025-07-01 Epub Date: 2025-07-31 DOI:10.4258/hir.2025.31.3.274

Aravalli Sainath Chaithanya, Nadipudi Kiran Kumar, Gugulothu Venkatesh Prasad, Bejawada Keerthana

Healthcare Informatics Research, volume 31, issue 3, pages 274-283. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12370421/pdf/

Citations: 0

Abstract


Objectives: This study aims to predict the severity of Parkinson's disease (PD) by leveraging a comprehensive dataset integrating cerebrospinal fluid protein and peptide data sourced from UniProt, normalized protein expression metrics, clinical assessments, and gait data. The dataset comprised 248 PD patients monitored longitudinally, with periodic evaluations including 227 proteins, 971 peptides, gait parameters, and Movement Disorder Society-sponsored revision of the Unified Parkinson's Disease Rating Scale (MDS-UPDRS) scores at baseline (month 0) and at 6, 12, and 24 months.

Methods: A multifaceted machine learning framework was employed, consisting of random forest, TensorFlow decision forests, and a custom-developed phase-shift ensembling model. Additionally, regression techniques such as linear regression, random forest regressor, decision tree regressor, and K-nearest neighbors were utilized to support the predictions. These models aimed to forecast PD severity as reflected by UPDRS scores.

Results: The custom phase-shift ensembling model demonstrated superior predictive performance, achieving an average symmetric mean absolute percentage error (sMAPE) of 55 across all UPDRS sections. Notably, the random forest regressor excelled in predicting motor function severity (UPDRS-III), attaining an sMAPE of 77.32, indicating its ability to model complex disease progression dynamics effectively.

Conclusions: Integrating biological markers, clinical scores, and gait dynamics facilitates accurate modeling of PD progression. The ensemble-based approach, particularly phase-shift ensembling, improves prediction robustness and interpretability, offering a powerful strategy for the early prediction of PD severity. This study highlights the value of multi-source data fusion and advanced machine learning techniques in supporting early diagnosis and informed treatment planning for neurodegenerative diseases.
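
For readers interpreting the sMAPE figures quoted in the Results, the following is a minimal sketch of one common definition of the metric, expressed in percent so that values fall between 0 and 200 and lower values indicate a closer fit. The abstract does not state which sMAPE variant the authors used, so the formula, the zero-handling convention, the function name `smape`, and the example scores are all assumptions made for illustration only.

```python
def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent (0 to 200).

    Assumed definition: mean of |a - p| / ((|a| + |p|) / 2), times 100.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("inputs must be non-empty")
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2.0
        # Assumed convention: a pair where both values are zero adds no error.
        if denom != 0.0:
            total += abs(a - p) / denom
    return 100.0 * total / len(actual)


# Hypothetical UPDRS-III scores for one patient at 0, 6, 12, and 24 months
# (illustrative numbers, not taken from the study).
observed = [20.0, 24.0, 30.0, 36.0]
forecast = [22.0, 24.0, 27.0, 40.0]
print(round(smape(observed, forecast), 2))
```

Because the denominator averages the magnitudes of the observed and predicted values, a prediction of zero against any non-zero observation contributes the maximum per-item error of 200 under this definition.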

Source journal

Healthcare Informatics Research (Medical Informatics)

CiteScore

4.90

Self-citation rate

6.90%

Number of publications