Machine Learning Analysis of Factors Contributing to Diabetes Development

Edgar Ceh-Varela, Larry Maes, Sarbagya Ratna
Cloud computing and data science. Journal Article. Published 2024-01-03. DOI: https://doi.org/10.37256/ccds.5120243751

Abstract

Diabetes is a chronic condition that affects how the body processes blood sugar. Early diagnosis and management are essential for preventing its complications. Machine Learning (ML) techniques offer an effective means of diagnosing diabetes accurately by identifying key risk factors and developing predictive models. In this study, we assess the performance of 11 ML algorithms on four diabetes prediction datasets, using the top 2, the top 3, and all attributes. We use k-fold cross-validation so that the results are robust and generalizable. We report standard evaluation metrics: accuracy, precision, recall, F1-score, and the area under the Receiver Operating Characteristic curve (ROC_AUC). Our analysis aims to determine the optimal number of features and to assess how performance changes as features are added. Notably, some ML classifiers achieve satisfactory classification and predictive performance using only the top 2 or 3 features. Furthermore, performance varies across algorithms and datasets, which highlights the need to assess multiple models to identify the most suitable one. These findings support the creation of dependable models that can improve patient outcomes by combining effective algorithms with pertinent features.
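
The evaluation protocol summarized in the abstract (several classifiers, the top 2, top 3, and all features, k-fold cross-validation, five metrics) can be illustrated with a minimal scikit-learn sketch. This is not the authors' code. It rests on several assumptions: features are ranked by the ANOVA F-score, because the abstract does not name the ranking method; the data are synthetic, generated with `make_classification` in place of the four diabetes datasets; and two classifiers are used where the study compares eleven. The names `evaluate`, `METRICS`, and `classifiers` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Metric names as accepted by scikit-learn's `scoring` argument.
METRICS = ["accuracy", "precision", "recall", "f1", "roc_auc"]


def evaluate(X, y, classifiers, feature_counts=(2, 3, "all"), n_splits=5):
    """Return {(classifier name, k): {metric: mean score over the folds}}."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    results = {}
    for name, clf in classifiers.items():
        for k in feature_counts:
            # Feature ranking sits inside the pipeline, so it is refit on each
            # training fold and never sees the held-out fold.
            # Assumption: ANOVA F-score ranking; the paper may rank differently.
            model = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=k), clf)
            scores = cross_validate(model, X, y, cv=cv, scoring=METRICS)
            results[(name, k)] = {m: scores[f"test_{m}"].mean() for m in METRICS}
    return results


# Synthetic stand-in data (assumption): 8 attributes, binary outcome.
X, y = make_classification(n_samples=400, n_features=8, n_informative=3, random_state=0)

# Two illustrative classifiers (assumption); the study compares eleven.
classifiers = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

results = evaluate(X, y, classifiers)
for (name, k), scores in results.items():
    summary = ", ".join(f"{m}={v:.3f}" for m, v in scores.items())
    print(f"{name:<20} k={k!s:<4} {summary}")
```

Comparing the rows for `k=2`, `k=3`, and `k=all` for one classifier shows how much each metric changes as features are added. Comparing rows across classifiers is the kind of multi-model assessment the abstract recommends.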