Prognostic Biomarkers Identification for Diabetes Prediction by Utilizing Machine Learning Classifiers

Utsha Das, Azmain Yakin Srizon, Md. Ansarul Islam, Dhiman Sikder Tonmoy, Md. Al Mehedi Hasan
Published in: 2020 2nd International Conference on Sustainable Technologies for Industry 4.0 (STI), 2020-12-19
DOI: 10.1109/STI50764.2020.9350498 (https://doi.org/10.1109/STI50764.2020.9350498)
Citations: 0

Abstract
Diabetes caused 4.2 million deaths in 2019 alone, making it the seventh leading cause of death worldwide. Although diabetes can be treated, delayed treatment can be fatal. Diabetes is also costly to manage, so early detection helps patients by indicating when to seek treatment and by giving them time to prepare mentally and financially. Previous studies have proposed various approaches that achieve near-perfect accuracy, but few have focused on identifying the attributes that can predict the disease at an early stage. In this study, we focus on finding those significant features; our experimental analysis identifies 10 significant features that achieve a near-perfect recognition rate of 98.08%. Three feature selection approaches are used: the Chi-Square test, Minimum Redundancy Maximum Relevance (mRMR), and Recursive Feature Elimination based on Random Forest (RFE-RF). Seven classifiers are evaluated: Decision Tree (DT), K-Nearest Neighbors (KNN), Logistic Regression (LR), Naïve Bayes (NB), Random Forest (RF), Neural Network (NN), and Support Vector Machine (SVM).
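To make the Chi-Square selection step named in the abstract more concrete, here is a minimal sketch of how categorical features might be ranked by their Pearson chi-square statistic against the class label. It is not the authors' code: the feature names (symptom_a, symptom_b, symptom_c), the toy data, and the helper functions chi_square_score and rank_features are all hypothetical, and the abstract does not state which dataset or implementation was used.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic of one categorical feature vs. the class label."""
    n = len(labels)
    observed = Counter(zip(feature, labels))   # joint counts per (feature value, label)
    feature_totals = Counter(feature)          # marginal counts per feature value
    label_totals = Counter(labels)             # marginal counts per label
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            expected = f_count * l_count / n   # count expected under independence
            diff = observed.get((f_value, l_value), 0) - expected
            score += diff * diff / expected
    return score


def rank_features(columns, labels):
    """Return (name, score) pairs sorted from most to least label-dependent."""
    scores = {name: chi_square_score(values, labels) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 8 patients, binary label (1 = diabetic), binary symptoms.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "symptom_a": [1, 1, 1, 1, 0, 0, 0, 0],  # matches the label exactly
    "symptom_b": [1, 1, 1, 0, 1, 0, 0, 0],  # partially associated
    "symptom_c": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
}

ranking = rank_features(columns, labels)
top_k = [name for name, _ in ranking[:2]]   # keep the 2 highest-scoring features
print(ranking)
print(top_k)
```

A higher score means the feature's distribution differs more between the two classes. In a study like the one described, the top-ranked features would then be passed to each classifier; the cut-off of 2 used here is only for illustration.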