A Prediction Model of Diabetes Based on Ensemble Learning

Lei Qin
Published in: Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition
Publication date: 2022-09-23
DOI: 10.1145/3573942.3573949 (https://doi.org/10.1145/3573942.3573949)
Citations: 0

Abstract: Diabetes is a common disease that seriously endangers human health and occurs mostly in middle-aged and elderly people. Predicting the incidence of diabetes enables doctors to draw up a scientific treatment plan in advance, which can significantly improve the cure rate and reduce the incidence rate. To this end, this paper proposes a diabetes prediction model based on ensemble learning that integrates several classical machine learning algorithms: logistic regression, k-nearest neighbors, decision tree, Gaussian naive Bayes (GaussianNB), and support vector machine (SVM). The first four algorithms, which have low correlation with one another, serve as base learners and are then combined through an SVM meta-learner to build the ensemble model. The model is evaluated on accuracy, precision, recall, AUC, and other metrics. Experiments were carried out on the Pima Indian diabetes data set (PIDD) published by UCI. First, the XGBoost algorithm was used to select the optimal features; an ensemble learning model was then constructed to make predictions. The experimental results show that the ensemble model achieves an accuracy of 81.63%, a precision of 80%, a recall of 80%, and an AUC of 84%, which verifies its advantages on these metrics. The model can help doctors make more accurate diagnoses and predictions of patients' physical conditions and implement more scientific treatment.
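The stacking arrangement described in the abstract (four base learners feeding an SVM meta-learner) can be sketched with scikit-learn's `StackingClassifier`. This is only an illustrative sketch, not the author's code. The data is a synthetic stand-in with 768 rows and 8 features rather than the real PIDD, the XGBoost feature-selection step is omitted, and all hyperparameters, the scaling pipelines, and the 80/20 split are assumptions. The printed scores therefore say nothing about the paper's reported results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic binary-classification data standing in for PIDD.
X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           n_redundant=1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# The four base learners named in the abstract (default settings assumed;
# scaling is added for the distance- and gradient-based models).
base_learners = [
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
    ("tree", DecisionTreeClassifier(random_state=0)),
    ("nb", GaussianNB()),
]

# The SVM meta-learner is trained on out-of-fold predictions of the base
# learners, produced by 5-fold cross-validation on the training set.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=SVC(), cv=5)
stack.fit(X_train, y_train)

y_pred = stack.predict(X_test)
y_score = stack.decision_function(X_test)  # signed margin from the SVM, used for AUC

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "auc": roc_auc_score(y_test, y_score),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Training the meta-learner on out-of-fold predictions keeps it from seeing base-learner outputs on examples those learners were fitted on, which is the usual reason to prefer stacking with cross-validation over fitting both layers on the same data.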