Optimized Computational Diabetes Prediction with Feature Selection Algorithms
Xi Li, Michele Curiger, Rolf Dornberger, T. Hanne
Proceedings of the 2023 7th International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence, published 2023-04-23
https://doi.org/10.1145/3596947.3596948
Diabetes is a life-threatening disease that should be diagnosed and treated as early as possible. In this paper, Recursive Feature Elimination (RFE) and a Genetic Algorithm (GA) are used for Feature Selection (FS) on two diabetes datasets drawn from patients of different heritages, in combination with K-Nearest Neighbors (KNN) and Random Forest (RF) classifiers, with the aim of optimizing diabetes prediction. In our experiments, RF performs better than KNN. Accuracy also depends strongly on the dataset: the Iraqi Society Diabetes (ISD) dataset yields notably higher accuracy than the Pima Indian Diabetes (PID) dataset under the same FS and classification method. Combining KNN with RFE or GA for FS improves its performance, whereas the performance of RF deteriorates when it is combined with FS. GA is computationally less efficient than RFE and achieves lower accuracy.
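To make the GA-plus-KNN combination concrete, the following is a minimal sketch of wrapper-style feature selection, not the authors' implementation. Everything in it is an assumption made for illustration: the synthetic data stands in for the ISD and PID datasets, and the helper names (`make_data`, `knn_accuracy`, `ga_select`), the population size, the mutation rate, and the choice of truncation selection with one-point crossover are hypothetical. Each candidate feature subset is a bit mask, and its fitness is the validation accuracy of a plain KNN vote restricted to the selected features.

```python
import random


def make_data(n=120, n_features=6, seed=0):
    """Synthetic stand-in data (assumption): only features 0 and 1 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(2.0 * label, 1.0) if j < 2 else rng.gauss(0.0, 1.0)
               for j in range(n_features)]
        X.append(row)
        y.append(label)
    return X, y


def knn_accuracy(X_train, y_train, X_test, y_test, mask, k=5):
    """Accuracy of a majority-vote KNN that uses only the features kept by mask."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for row, label in zip(X_test, y_test):
        # Squared Euclidean distance to every training row, nearest first.
        dists = sorted(
            (sum((row[j] - other[j]) ** 2 for j in cols), other_label)
            for other, other_label in zip(X_train, y_train)
        )
        votes = sum(lbl for _, lbl in dists[:k])
        pred = 1 if votes * 2 > k else 0
        correct += pred == label
    return correct / len(y_test)


def ga_select(X_train, y_train, X_val, y_val, pop_size=12, generations=10,
              mutation_rate=0.1, seed=1):
    """Evolve bit masks over the features; fitness is KNN validation accuracy."""
    rng = random.Random(seed)
    n = len(X_train[0])

    def fitness(mask):
        return knn_accuracy(X_train, y_train, X_val, y_val, mask)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]      # truncation selection
        children = [ranked[0][:]]             # elitism: keep the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)       # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation, applied independently to each gene.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    return best, fitness(best)


X, y = make_data()
X_train, y_train, X_val, y_val = X[:80], y[:80], X[80:], y[80:]
best_mask, best_acc = ga_select(X_train, y_train, X_val, y_val)
baseline = knn_accuracy(X_train, y_train, X_val, y_val, [1] * len(X[0]))
print("selected feature mask:", best_mask)
print("validation accuracy with selection:", best_acc, "without:", baseline)
```

Every fitness evaluation in this sketch reruns the classifier on the whole validation split for each mask in each generation, which illustrates why a GA wrapper can be costly to run. Because the mask is chosen on the same split that scores it, the reported accuracy is optimistic; a separate held-out test split would be needed for a fair comparison.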