Developing a Predictive Supervised Machine Learning Models for Diabetes

Divya Kaur Bhullar, Natassha Shievanie Selvaraj, Fung Teng Choong, Chen Wan Jing, K. Xiaoxi, D. Handayani, N. Hamzah, M. Lubis, T. Mantoro
Published in: 2021 IEEE 7th International Conference on Computing, Engineering and Design (ICCED), pp. 1-6
Publication date: 2021-08-05
DOI: 10.1109/ICCED53389.2021.9664833 (https://doi.org/10.1109/ICCED53389.2021.9664833)

Abstract

A growing number of diabetes cases are diagnosed late, or go unnoticed altogether until the disease has reached a later stage. One of the dominant explanations for this trend is the scarcity of prediction tools and techniques for the disease. Previous research has shown that early prediction of diabetes can lower the risk of major health implications and improve the chances of making better treatment decisions for patients. This study attempts to design a model that predicts diabetes from patients' risk factors and lifestyles. We use data from the National Institute of Diabetes and Digestive and Kidney Diseases, visualising it to understand the correlations among 9 variables. We then perform data mining with Logistic Regression, Random Forest and Decision Tree classifiers and compare their performance in terms of accuracy and F1-score. Our findings indicate that the prediction model using the Random Forest classifier achieves the highest accuracy, 79.4%, in predicting diabetes among the three classifier algorithms.
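
The abstract compares classifiers on both accuracy and F1-score. The sketch below illustrates why the two metrics can rank models differently when positive cases are the minority, as is common in screening data. It is not the authors' code: the function name `accuracy_and_f1`, the label vectors, and the two "models" are hypothetical and chosen only for illustration, and label 1 is assumed to mark the diabetic class.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, F1) for binary labels; 1 is assumed to be the positive class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # F1 = 2TP / (2TP + FP + FN); defined as 0.0 here when there are no positives at all.
    denom = 2 * tp + fp + fn
    f1 = (2 * tp / denom) if denom else 0.0
    return accuracy, f1


# Hypothetical ground truth: 3 positive cases out of 10 patients.
y_true = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0]
# Hypothetical model A misses two of the three positive cases.
pred_a = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
# Hypothetical model B finds every positive case but raises two false alarms.
pred_b = [1, 0, 1, 0, 1, 0, 0, 1, 0, 1]

print(accuracy_and_f1(y_true, pred_a))  # (0.8, 0.5)
print(accuracy_and_f1(y_true, pred_b))  # (0.8, 0.75)
```

Both hypothetical models reach the same accuracy, yet their F1-scores differ because F1 ignores true negatives and penalises missed positive cases. This is one reason to report both metrics, as the study does, and not rely on accuracy alone.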