Leveraging Machine Learning and Model-Agnostic Explanations to Understand Automated Diagnosis of Cardiovascular Disease

Christopher Sun, J. Sharma, Milind Maiti
Published in: 2022 4th International Conference on Biomedical Engineering (IBIOMED), 2022-10-18
DOI: 10.1109/IBIOMED56408.2022.9988121 (https://doi.org/10.1109/IBIOMED56408.2022.9988121)
Citations: 0

Abstract

The prevalence of cardiovascular disease, together with physician misdiagnosis, creates a need for artificial intelligence models that improve diagnostic accuracy. This study trains machine learning models to diagnose cardiovascular disease on publicly available data sets containing simple patient medical information. The Multilayer Perceptron (MLP) assembled for this task performed best among them, with an F1 score of 0.8968, which motivated the creation of an automated, open-source diagnosis tool powered by the MLP. Local Interpretable Model-Agnostic Explanations (LIME) are used to express, as marginal probabilities, the impact of each feature on the model's diagnosis. K-Means clustering segments patients into ten clusters, after which each example is passed through LIME. The resulting histograms depict a complex relationship between feature, cluster, and impact on diagnosis. A series of p-values with contrasting orders of magnitude reveals nuances in how the MLP treats patients from different clusters. The LIME analysis indicates that the most important features for cardiovascular disease diagnosis are fasting blood sugar, type of chest pain, and ST segment slope. Future experiments should replicate this study's LIME methodology on data sets containing more specialized features to gain practical medical insight into the types of cardiovascular disease represented by each cluster. Finally, feature engineering pathways should be explored in light of these results to create versatile diagnosis models adaptable to other diseases.
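To make the LIME step more concrete, the following is a minimal sketch of the general idea behind a local, model-agnostic explanation: perturb one example, query the black-box model, and fit a proximity-weighted linear surrogate whose coefficients approximate each feature's local effect on the predicted probability. It is not the authors' code and does not use the `lime` package. The `black_box` function is a hypothetical stand-in for the trained MLP, and the feature names, sample count, noise scale, and kernel width are all assumptions chosen for illustration.

```python
import math
import random


def black_box(x):
    """Hypothetical stand-in for a trained classifier: returns P(disease)."""
    z = 1.5 * x[0] - 0.5 * x[1] + 0.0 * x[2]  # third feature is ignored
    return 1.0 / (1.0 + math.exp(-z))


def solve(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w


def local_explanation(predict, x, n_samples=500, scale=0.5, width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate around x.

    Returns one coefficient per feature: the approximate local change in
    the predicted probability per unit change in that feature.
    """
    rng = random.Random(seed)
    k = len(x) + 1  # intercept plus one slope per feature
    xtx = [[0.0] * k for _ in range(k)]
    xty = [0.0] * k
    for _ in range(n_samples):
        delta = [rng.gauss(0.0, scale) for _ in x]  # random perturbation
        neighbour = [xi + di for xi, di in zip(x, delta)]
        dist2 = sum(di * di for di in delta)
        weight = math.exp(-dist2 / width ** 2)  # closer samples count more
        row = [1.0] + delta  # regress on offsets from x
        y = predict(neighbour)
        for i in range(k):
            xty[i] += weight * row[i] * y
            for j in range(k):
                xtx[i][j] += weight * row[i] * row[j]
    coef = solve(xtx, xty)  # weighted least squares via normal equations
    return coef[1:]  # drop the intercept


# Explain one (hypothetical) patient with three generic features.
weights = local_explanation(black_box, [0.0, 0.0, 0.0])
for name, w in zip(["feature_a", "feature_b", "feature_c"], weights):
    print(f"{name}: {w:+.3f}")
```

In a setup like the one the abstract describes, an explanation of this kind would be computed for every patient and the per-feature weights then grouped by K-Means cluster to build the histograms. The real LIME algorithm for tabular data adds steps omitted here, such as feature discretization and feature selection.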