Interpretable Machine Learning for Predicting Multiple Sclerosis Conversion from Clinically Isolated Syndrome

Eden Caroline Daniel, Santosh Tirunagari, Karan Batth, David Windridge, Yashaswini Balla
Journal: medRxiv - Health Informatics
Publication date: 2024-07-19
DOI: 10.1101/2024.07.18.24310578 (https://doi.org/10.1101/2024.07.18.24310578)
Citations: 0

Abstract

Background: Machine learning (ML) prediction of clinically isolated syndrome (CIS) conversion to multiple sclerosis (MS) could be used as a remote, preliminary tool by clinicians to identify high-risk patients that would benefit from early treatment.
Objective: This study evaluates ML models to predict CIS to MS conversion and identifies key predictors.
Methods: Five supervised learning techniques (Naive Bayes, Logistic Regression, Decision Trees, Random Forests and Support Vector Machines) were applied to clinical data from 138 Lithuanian and 273 Mexican CIS patients. Seven different feature combinations were evaluated to determine the most effective models and predictors.
Results: Key predictors common to both datasets included sex, presence of oligoclonal bands in CSF, MRI spinal lesions, abnormal visual evoked potentials and brainstem auditory evoked potentials. The Lithuanian dataset confirmed predictors identified by previous clinical research, while the Mexican dataset partially validated them. The highest F1 score of 1.0 was achieved using Random Forests on all features for the Mexican dataset and Logistic Regression with SMOTE Upsampling on all features for the Lithuanian dataset.
Conclusion: Applying the identified high-performing ML models to the CIS patient datasets shows potential in assisting clinicians to identify high-risk patients.
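The abstract reports model performance as an F1 score, the harmonic mean of precision and recall for the positive class. The following is a minimal sketch of that metric in plain Python, not the authors' code: the function name `f1_score`, the label encoding (1 = converted to MS, 0 = did not convert) and the example label vectors are all illustrative assumptions and are not taken from the study's data.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Return F1 = 2*TP / (2*TP + FP + FN) for the class labelled `positive`."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    denom = 2 * tp + fp + fn
    # Convention assumed here: return 0.0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


# Hypothetical labels for eight patients (assumption, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
perfect = list(y_true)                  # every prediction correct
imperfect = [1, 1, 0, 0, 0, 1, 0, 1]    # one missed converter, one false alarm

print(f1_score(y_true, perfect))    # 1.0
print(f1_score(y_true, imperfect))  # 0.75
```

Under this definition, an F1 score of 1.0 means the evaluated predictions contained no false positives and no false negatives for the positive class.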