Predicting Diabetes Onset: An Ensemble Supervised Learning Approach
N. Nnamoko, A. Hussain, D. England
2018 IEEE Congress on Evolutionary Computation (CEC), 2018-07-01
DOI: 10.1109/CEC.2018.8477663 (https://doi.org/10.1109/CEC.2018.8477663)
Citations: 18
Abstract
This exploratory study gauges the impact of feature selection on heterogeneous ensembles. The task is to predict diabetes onset from healthcare data obtained from the UC Irvine (UCI) database. Evidence suggests that accuracy and diversity are the two vital requirements for a good ensemble. The research therefore exploits the diversity of heterogeneous base classifiers together with the optimisation effect of feature subset selection in order to improve accuracy. Five widely used classifiers form the ensembles, and a meta-classifier aggregates their outputs. The results are compared with those of similar studies in the literature that used the same dataset. They show that the proposed method predicts diabetes onset with higher accuracy.
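The architecture the abstract describes (feature subset selection, five heterogeneous base classifiers, and a meta-classifier that aggregates their outputs) can be illustrated with a minimal scikit-learn sketch. Everything specific here is an assumption rather than the authors' setup: the abstract does not name the five classifiers, the feature selection method, or the meta-classifier, so the choices below (a decision tree, naive Bayes, k-nearest neighbours, an SVM, logistic regression, `SelectKBest` with `f_classif`, and a logistic-regression meta-classifier) are stand-ins, and the data is synthetic rather than the UCI dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_stacked_ensemble(k_features=5, random_state=0):
    """Stack five heterogeneous classifiers, each trained on a selected feature subset.

    Hypothetical helper: the classifier choices and k_features are illustrative
    assumptions, not the configuration used in the paper.
    """

    def with_selection(classifier):
        # Scale, keep the k highest-scoring features, then fit the classifier.
        return make_pipeline(
            StandardScaler(),
            SelectKBest(score_func=f_classif, k=k_features),
            classifier,
        )

    base_classifiers = [
        ("tree", with_selection(DecisionTreeClassifier(max_depth=4, random_state=random_state))),
        ("nb", with_selection(GaussianNB())),
        ("knn", with_selection(KNeighborsClassifier(n_neighbors=7))),
        ("svm", with_selection(SVC(random_state=random_state))),
        ("logreg", with_selection(LogisticRegression(max_iter=1000))),
    ]
    # The meta-classifier is trained on cross-validated outputs of the base classifiers.
    return StackingClassifier(
        estimators=base_classifiers,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )


# Synthetic binary-classification data stands in for the real dataset (assumption).
X, y = make_classification(n_samples=600, n_features=8, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = build_stacked_ensemble()
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

Wrapping the selector inside each base pipeline means feature selection is refit within every cross-validation fold, which avoids leaking held-out information into the selection step. The printed accuracy describes the synthetic data only and says nothing about the results reported in the paper.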