{"authors":"Suhaam Adnan Abdul kareem, Zena Fouad Rasheed","doi":"10.31642/jokmc/2018/100227","journal":{"name":"Journal of Kufa for Mathematics and Computer","volume":"9 1","pages":"0"},"publicationDate":"2023-08-31","publicationTypes":"Journal Article","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.31642/jokmc/2018/100227"}
A Machine Learning Model for Cancer Disease Diagnosis using Gene Expression Data
Cancer is one of the leading causes of death globally. Recently, microarray gene expression data has been used to aid the effective and early detection of cancer. Numerous research teams have investigated the use of machine learning techniques in biomedicine and bioinformatics to categorize cancer patients into high- or low-risk groups. For this purpose, machine learning tools must be able to recognize important features in complex datasets. Here we present a machine learning approach to cancer detection and to the identification of genes critical for the diagnosis of cancer. We used Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbor (KNN), and Gradient Boosting (GB) classifiers, which gave more accurate results than current models. The accuracies of SVM, KNN, RF, and GB were 97.41%, 89.3%, 88.1%, and 85.7%, respectively, so SVM achieved the highest accuracy of the four algorithms. By providing a machine learning-based predictive system for early detection, our findings can help to decrease the prevalence of cancer.
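The abstract does not state the dataset, the preprocessing, or the hyperparameters, so the following is only a minimal sketch of the kind of evaluation it describes: train a classifier on expression profiles and report hold-out accuracy. It implements just one of the four classifiers (KNN) in plain Python. The synthetic data generator, the 70/30 split, k = 5, and every function name are illustrative assumptions, not the authors' method or data.

```python
import math
import random
from collections import Counter


def make_expression_data(n_per_class=50, n_genes=20, n_informative=10,
                         shift=3.0, seed=0):
    """Synthetic stand-in for a gene expression matrix (NOT real data).

    Class 1 samples have the first `n_informative` genes shifted upward,
    mimicking differentially expressed genes.
    """
    rng = random.Random(seed)
    samples, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                for g in range(n_informative):
                    row[g] += shift
            samples.append(row)
            labels.append(label)
    return samples, labels


def train_test_split(samples, labels, test_fraction=0.3, seed=0):
    """Shuffle sample indices and hold out a fraction for testing."""
    rng = random.Random(seed)
    idx = list(range(len(samples)))
    rng.shuffle(idx)
    n_test = int(len(idx) * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return ([samples[i] for i in train_idx], [labels[i] for i in train_idx],
            [samples[i] for i in test_idx], [labels[i] for i in test_idx])


def knn_predict(train_x, train_y, query, k=5):
    """Majority vote among the k training samples closest to `query`."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


X, y = make_expression_data()
X_train, y_train, X_test, y_test = train_test_split(X, y)
y_pred = [knn_predict(X_train, y_train, q) for q in X_test]
acc = accuracy(y_test, y_pred)
print(f"KNN hold-out accuracy on synthetic data: {acc:.2%}")
```

The other three classifiers would be scored the same way, on the same split, so that their accuracies are directly comparable. The accuracy printed here reflects only the synthetic data and says nothing about the figures reported in the abstract.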