A Machine Learning Model for Cancer Disease Diagnosis using Gene Expression Data

Suhaam Adnan Abdul kareem, Zena Fouad Rasheed
Journal of Kufa for Mathematics and Computer, published 2023-08-31
DOI: https://doi.org/10.31642/jokmc/2018/100227
Citations: 0

Abstract

Cancer is one of the leading causes of death globally. Recently, microarray gene expression data has been used to aid the effective and early detection of cancer. Numerous research teams have investigated the use of machine learning techniques in biomedicine and bioinformatics to categorize cancer patients into high- or low-risk groups. Machine learning tools must be able to recognize important features in complex datasets. Here we present a machine learning approach to cancer detection and to the identification of genes critical for the diagnosis of cancer. We used Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbor (KNN), and Gradient Boosting (GB) classifiers, which provide results that are more accurate than those of current models. The accuracies of SVM, KNN, RF, and GB were 97.41%, 89.3%, 88.1%, and 85.7%, respectively. The SVM achieved the highest accuracy among the machine learning algorithms tested. By creating a machine learning-based predictive system for early detection, our findings can help to decrease the prevalence of cancer.
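As a rough illustration of one of the four classifiers named in the abstract (KNN) and of how an accuracy figure is computed, the following minimal sketch may help. It is not the authors' pipeline: the expression values, the class labels "tumor" and "normal", the choice of k = 3, and the helper names knn_predict and accuracy are all hypothetical and chosen only for the example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, sample, k=3):
    """Predict a label by majority vote among the k nearest training samples."""
    nearest = sorted(
        zip(train_X, train_y),
        key=lambda pair: euclidean(pair[0], sample),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical toy data (an assumption, not from the paper):
# each row is one sample, each column one gene's expression level.
train_X = [
    [8.1, 1.2, 5.0],
    [7.9, 1.0, 4.8],
    [8.4, 1.5, 5.2],
    [2.0, 6.8, 5.1],
    [2.3, 7.1, 4.9],
    [1.8, 6.5, 5.3],
]
train_y = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]

test_X = [
    [8.0, 1.1, 5.1],
    [2.1, 7.0, 5.0],
    [7.5, 2.0, 4.7],
    [2.5, 6.0, 5.2],
]
test_y = ["tumor", "normal", "tumor", "normal"]

predictions = [knn_predict(train_X, train_y, s, k=3) for s in test_X]
test_accuracy = accuracy(test_y, predictions)
print(f"KNN accuracy on toy data: {test_accuracy:.2%}")
```

Real microarray data has thousands of genes per sample, so a practical study would normally scale the features and select a subset of genes before fitting; those steps are omitted here.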