Type 2 Diabetes Prediction using K-Nearest Neighbor Algorithm

S. Suriya, J. Joanish Muthu
Journal of Trends in Computer Science and Smart Technology (journal article). Published 2023-06-01. DOI: 10.36548/jtcsst.2023.2.007 (https://doi.org/10.36548/jtcsst.2023.2.007)

Abstract

Type 2 diabetes is a chronic disorder that affects millions of people worldwide. It is characterised by excessive levels of glucose in the blood caused by insulin resistance or the inability to produce insulin. Early detection and prediction of type 2 diabetes can improve patient outcomes. The present model uses the K-Nearest Neighbor (KNN) algorithm to predict type 2 diabetes. KNN is a simple but powerful machine learning algorithm used for classification and regression. It is a non-parametric approach that makes predictions based on the k nearest neighbours in a dataset. KNN is widely used in healthcare and scientific research to predict and classify diseases based on patient data. The aim of this work is to predict the risk of developing type 2 diabetes using the KNN algorithm. Data were collected from the electronic medical records of patients diagnosed with type 2 diabetes and of healthy individuals. The dataset consists of patient attributes such as age, gender, body mass index, blood pressure, cholesterol levels, and glucose levels. Information was also collected about lifestyle habits, such as physical activity, smoking status, and alcohol consumption. The data were pre-processed by removing missing values and outliers, and were normalized so that all features share the same scale. The dataset was then split into a training set containing 80% of the data and a test set containing the remaining 20%. The KNN algorithm was used to classify patients into two groups: those at high risk of developing type 2 diabetes and those at low risk. The model's performance was assessed using several metrics, including accuracy, precision, recall, and F1-score.
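
The workflow described in the abstract (normalization, an 80/20 split, KNN classification into high-risk and low-risk groups, and evaluation with accuracy, precision, recall, and F1-score) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not state the value of k, the distance metric, or the normalization method, so k = 5, Euclidean distance, and min-max scaling are assumptions here. The data are synthetic toy values generated only for illustration (three hypothetical features: age, BMI, glucose), and the missing-value and outlier removal steps are omitted because the toy data contain neither.

```python
import math
import random
from collections import Counter


def make_toy_data(n=200, seed=0):
    """Synthetic [age, bmi, glucose] rows; label 1 = high risk. Illustration only."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(45 + 10 * label, 10),    # age
                  rng.gauss(25 + 5 * label, 3),      # body mass index
                  rng.gauss(95 + 40 * label, 12)])   # glucose
        y.append(label)
    return X, y


def train_test_split(X, y, test_fraction=0.2, seed=0):
    """Shuffle row indices, then hold out test_fraction of the rows for testing."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = len(idx) - round(len(idx) * test_fraction)
    train, test = idx[:cut], idx[cut:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


def min_max_scale(rows, lo=None, hi=None):
    """Scale each column with (v - lo) / (hi - lo); lo/hi default to this data's range."""
    cols = list(zip(*rows))
    if lo is None:
        lo = [min(c) for c in cols]
    if hi is None:
        hi = [max(c) for c in cols]
    scaled = [[(v - l) / (h - l) if h > l else 0.0
               for v, l, h in zip(row, lo, hi)] for row in rows]
    return scaled, lo, hi


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k training points closest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for the positive (high-risk) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


X, y = make_toy_data()
X_train, y_train, X_test, y_test = train_test_split(X, y)
# Fit the scaling range on the training set only, then reuse it for the test set.
X_train_s, lo, hi = min_max_scale(X_train)
X_test_s, _, _ = min_max_scale(X_test, lo, hi)
y_pred = [knn_predict(X_train_s, y_train, x, k=5) for x in X_test_s]
metrics = classification_metrics(y_test, y_pred)
print(metrics)
```

Normalization matters for KNN because the distance is dominated by whichever feature has the largest numeric range; without scaling, glucose values would outweigh BMI in this toy setup. The scaling range is computed from the training set alone so that no information from the test set leaks into the model.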