Predictive analysis using hybrid clustering in diabetes diagnosis
Kanika Bhatia, Rupali Syal
2017 Recent Developments in Control, Automation & Power Engineering (RDCAPE), 2017-10-01
DOI: 10.1109/RDCAPE.2017.8358313
Citations: 12
Abstract
Data mining has become crucial for predictive analysis in the health care domain. With the discovery of new models, it has become easier to analyze the vast amount of data available in the medical industry. In this research work, K∗-Means is used to remove inconsistencies found in the data, a genetic algorithm is used for optimal feature selection, and an SVM is used for classification. K∗-Means is an optimized hierarchical clustering method that aims to reduce computational cost. Applied to the Pima Indians Diabetes dataset, the proposed hybrid clustering model increases accuracy by 1.351% and both sensitivity and positive predictive value by 2.0411%. The proposed model attains better results than the existing models in the literature.
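The three reported measures (accuracy, sensitivity, and positive predictive value) all follow from the counts of a binary confusion matrix. The sketch below shows one way they might be computed, assuming labels are encoded as 1 (diabetic) and 0 (non-diabetic). The function names and the toy labels are illustrative only and are not taken from the paper or its dataset.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for binary labels, assuming 1 = positive and 0 = negative."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:  # truth == 1 and pred == 0
            fn += 1
    return tp, tn, fp, fn


def diagnostic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity (recall), and positive predictive value (precision)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    return {
        # Fraction of all samples classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Fraction of actual positives that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Fraction of predicted positives that are actually positive.
        "ppv": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Toy labels for illustration only (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = diagnostic_metrics(y_true, y_pred)
print(scores)
```

Sensitivity and positive predictive value share the same numerator (true positives) and differ only in whether false negatives or false positives enter the denominator.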