Sushma Jaiswal, Priyanka Gupta, Narasimha Prasad, R. Kulkarni
An Empirical Model for The Classification of Diabetes and Diabetes_Types Using Ensemble Approaches
Journal: 人工智能技术学报(英文) (Journal of Artificial Intelligence and Technology)
DOI: 10.37965/jait.2023.0220 (https://doi.org/10.37965/jait.2023.0220)
Publication date: 2023-05-20
Publication type: Journal Article
Citations: 0
Abstract
Diabetes is a hereditary disorder that interferes with human life at all ages. When an individual has diabetes, cells have difficulty absorbing glucose from the bloodstream. The two main subtypes are type 1 and type 2 diabetes. Type 1 diabetes develops when the pancreas cannot make enough insulin, whereas type 2 diabetes develops due to insulin resistance. Diabetes is a recurrent, chronic, and incurable illness. Disease detection technology is pervasive in modern healthcare systems, and detecting diabetes in its early stages is crucial for initiating timely treatment and halting disease progression. In light of the continually rising prevalence of diabetes, this paper investigates a prediction model that has the potential not only to forecast the likelihood of future diabetes onset but also to identify the specific type of diabetes a person may develop. The proposed framework is designed using two datasets: the Pima Indian Diabetes (PID) dataset, which is used to forecast diabetes, and the Diabetes Type dataset, which is used to identify the type of diabetes mellitus an individual has. This research applies machine learning classifiers and ensemble models, such as Bagging, Voting, Averaging, and Stacking, to diabetes prediction. SMOTE (Synthetic Minority Oversampling Technique) and hyperparameter tuning of the algorithms are also applied and substantially improve the results. The developed heterogeneous ensemble model offers enhanced prediction rates across different performance criteria. On the PID dataset, Random Forest with the bagging technique attains 96% accuracy. On the Diabetes Type dataset, the Voting Ensemble Model attains 98.5% accuracy. This study highlights that ensemble learning models are effective in predicting diabetes and can outperform earlier relevant studies.
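To make two of the techniques named in the abstract concrete, the following is a minimal sketch of SMOTE-style oversampling and hard (majority) voting, written with the Python standard library only. It is not the authors' implementation: the function names `smote` and `hard_vote`, the toy (glucose, BMI) rows, and the threshold rules standing in for trained classifiers are all illustrative assumptions, and the sketch omits hyperparameter tuning, bagging, and stacking.

```python
import math
import random
from collections import Counter


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours.
    Requires at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` inside the minority class, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def hard_vote(classifiers, x):
    """Return the label predicted by the largest number of classifiers."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Toy, made-up (glucose, BMI) rows; NOT taken from either dataset in the paper.
majority = [(85, 22.0), (90, 24.5), (95, 23.1), (88, 21.7), (92, 25.0), (99, 26.3)]
minority = [(150, 33.0), (165, 35.5), (158, 31.2)]

# Oversample the minority class until both classes have the same size.
balanced_minority = minority + smote(minority, n_new=len(majority) - len(minority))

# Three hypothetical threshold rules standing in for trained base classifiers.
classifiers = [
    lambda x: 1 if x[0] > 125 else 0,
    lambda x: 1 if x[1] > 30 else 0,
    lambda x: 1 if x[0] + 2 * x[1] > 190 else 0,
]

print(len(balanced_minority), hard_vote(classifiers, (160, 34.0)))
```

In practice one would more likely rely on maintained library implementations of oversampling and ensemble classifiers than on hand-written versions; the sketch only shows the underlying idea that synthetic minority samples lie on segments between existing minority samples, and that a voting ensemble returns the most common base prediction.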