Diabetic Prediction using Feature Selection based Random Forest and Fine Tuned K-Nearest Neighbor Classifier Algorithm-A Design Thinking Approach

S. Ramya, Dr T. Vijayaraghavan, D. Kalaivani
Published in: 2023 4th International Conference on Electronics and Sustainable Communication Systems (ICESC)
Publication date: 2023-07-06
DOI: 10.1109/ICESC57686.2023.10193333 (https://doi.org/10.1109/ICESC57686.2023.10193333)

Abstract

According to World Health Organization (WHO) research, diabetes affects the majority of the population in low- and middle-income nations today. The WHO report projected that 80% of deaths between 2016 and 2030 would be due to diabetes. However, current prediction methods continue to produce erroneous findings, which substantially degrades performance. To overcome this issue, this work proposes combining a Random Forest (RF) algorithm with a Fine-tuned K-Nearest Neighbor (FKNN) classifier. The approach has three primary stages: pre-processing, feature selection, and classification. First, pre-processing cleans the dataset into the correct format so that the final results are more accurate. Next, feature selection is carried out using the RF algorithm to choose the most relevant and useful features from the dataset; working with a minimal feature set also minimizes the risk of overfitting. Finally, diabetes prediction and classification are performed with the FKNN classifier, which categorizes items in the feature space based on the training samples most similar to the objects being classified. According to the experimental results, the proposed RF+FKNN method outperforms existing algorithms in accuracy, precision, recall, and F-measure.
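The three stages described in the abstract can be illustrated with a minimal sketch in Python using scikit-learn. This is not the authors' implementation: the abstract does not name the dataset, the pre-processing steps, the selection threshold, or the KNN parameters that were tuned, so everything below is an assumption made for illustration. In particular, the synthetic data from `make_classification`, median imputation plus standard scaling as "pre-processing", the `threshold="median"` cutoff on RF feature importances, and the grid over `n_neighbors` and `weights` as the "fine tuning" step are all hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def build_rf_knn_search():
    """Return a grid search over a pre-process -> RF selection -> KNN pipeline."""
    pipeline = Pipeline([
        # Stage 1 (assumed): fill missing values and put features on one scale.
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
        # Stage 2: keep features whose RF importance is at least the median.
        ("select", SelectFromModel(
            RandomForestClassifier(n_estimators=50, random_state=0),
            threshold="median",
        )),
        # Stage 3: KNN classifier whose hyperparameters are tuned below.
        ("knn", KNeighborsClassifier()),
    ])
    # Hypothetical tuning grid; the paper's actual search space is not given.
    param_grid = {
        "knn__n_neighbors": [3, 5, 7, 9, 11],
        "knn__weights": ["uniform", "distance"],
    }
    return GridSearchCV(pipeline, param_grid, cv=5, scoring="f1")


# Synthetic stand-in for a diabetes dataset (8 features, binary label).
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, n_redundant=2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

search = build_rf_knn_search()
search.fit(X_train, y_train)
y_pred = search.predict(X_test)

# The four evaluation measures named in the abstract.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
# Boolean mask of the features the RF selector kept in the best pipeline.
selected = search.best_estimator_.named_steps["select"].get_support()

print("best KNN parameters:", search.best_params_)
print("features kept:", int(selected.sum()), "of", X.shape[1])
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Placing the selector inside the pipeline means the RF importances are recomputed on each cross-validation training fold, so the held-out fold never influences which features are kept. Scores obtained on the synthetic data say nothing about the results reported in the paper.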