A Prediction Model of Diabetes Based on Ensemble Learning
Author: Lei Qin
Published in: Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition, 2022-09-23
DOI: https://doi.org/10.1145/3573942.3573949
Abstract: Diabetes is a common disease that seriously endangers human health and occurs mostly in middle-aged and elderly people. Predicting the incidence of diabetes enables doctors to make a scientific treatment plan in advance, which can significantly improve the cure rate and reduce the incidence rate. To this end, this paper proposes a diabetes prediction model based on ensemble learning that integrates several classical machine learning algorithms: logistic regression, k-nearest neighbors, decision tree, Gaussian naive Bayes (GaussianNB), and support vector machine (SVM). The first four algorithms, which have low correlation with one another, serve as base learners and are then combined through an SVM meta-learner to build the ensemble model. The model is evaluated on accuracy, precision, recall, AUC, and other metrics. The experiment was carried out on the Pima Indian diabetes data set (PIDD) published by UCI. First, the XGBoost algorithm was used to select the optimal features, and then the ensemble model was constructed to make predictions. The experimental results show that the ensemble model achieves an accuracy of 81.63%, a precision of 80%, a recall of 80%, and an AUC of 84%. These results verify the model's advantages in accuracy, precision, recall, and AUC. The model can help doctors make more accurate diagnoses and predictions of patients' physical conditions and implement more scientific treatment.
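The stacking arrangement described in the abstract (four base learners combined through an SVM meta-learner) might look roughly like the sketch below. It is an illustration only, not the paper's implementation: it assumes scikit-learn's `StackingClassifier`, uses a synthetic data set with the same shape as PIDD (768 rows, 8 features) rather than the real data, omits the XGBoost feature-selection step, and leaves every hyperparameter at an assumed default. The scaling pipelines, the 80/20 split, and the 5-fold setting are likewise assumptions, so the printed scores are not comparable to the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in with PIDD's shape (768 samples, 8 features).
# This is NOT the real data set; it only makes the sketch runnable.
X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# The four base learners named in the abstract. Scaling before the
# distance- and gradient-based models is an assumption of this sketch.
base_learners = [
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
    ("tree", DecisionTreeClassifier(random_state=0)),
    ("nb", GaussianNB()),
]

# The SVM meta-learner is trained on out-of-fold predictions of the
# base learners (5-fold cross-validation is an assumed setting).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=SVC(random_state=0),
    cv=5,
)
stack.fit(X_train, y_train)

# Evaluate with the four metrics listed in the abstract.
y_pred = stack.predict(X_test)
scores = stack.decision_function(X_test)  # continuous scores for AUC
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "auc": roc_auc_score(y_test, scores),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Training the meta-learner on out-of-fold predictions keeps it from seeing base-learner outputs on samples those learners were fitted on, which is the usual motivation for stacking with cross-validation.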