A Comparative Analysis of Machine Learning techniques on Breast Cancer diagnosis using WEKA

Afrah Rashid, Syeda Sohana Binta Farhad, Afsana Bhuyian, N. Yeasmin, Mohammad Abdul Azim, Z. Alom
Published in: 2022 25th International Conference on Computer and Information Technology (ICCIT), 2022-12-17
DOI: 10.1109/ICCIT57492.2022.10055421 (https://doi.org/10.1109/ICCIT57492.2022.10055421)
Citations: 1

Abstract

Breast cancer is one of the most common malignancies affecting women worldwide and causes many deaths every year. The risk of death from breast cancer is increasing exponentially. With the surge of research in the medical field, timely and early detection of the disease has become a pressing need. Until now, radiologists have examined cancer images and made diagnoses manually. Research has shown that a considerable number of ultrasound images are produced every day. However, the number of radiologists is limited, so they cannot always provide service on time. Moreover, they often misclassify breast lesions, resulting in a high false-positive rate. An automatic disease-detection system assists radiologists in diagnosis, provides reliable and productive support, and reduces the risk of death. In this paper, we compare six machine learning models, namely (i) Support Vector Machine (SVM), (ii) Naive Bayes (NB), (iii) Logistic Regression (LR), (iv) Decision Tree (DT), (v) Random Forest (RF), and (vi) k-Nearest Neighbors (k-NN), on two datasets: (i) the Wisconsin Breast Cancer Dataset (WBCD) and (ii) the Breast Cancer Coimbra Dataset (BCCD). This study aims to build different classification models, analyze the obtained results, and compare the models for predicting breast cancer. We use several performance metrics to select the best classification model among them. Our comparative analysis shows that the SVM models can achieve better performance metrics, and thus the model from this research is relevant for use in clinical applications.
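
The abstract does not name the performance metrics used, so the following is only a minimal sketch of how such a comparison could work, assuming the common choices of accuracy, precision, recall, and F1 computed from a binary confusion matrix. It is written in plain Python rather than WEKA and is not the authors' workflow. The names `Confusion`, `metrics`, `best_model`, `model_a`, and `model_b` are hypothetical, and the counts are invented for illustration; they are not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion-matrix counts, treating 'malignant' as the positive class."""
    tp: int  # malignant cases predicted malignant
    fp: int  # benign cases predicted malignant
    fn: int  # malignant cases predicted benign
    tn: int  # benign cases predicted benign


def metrics(c: Confusion) -> dict:
    """Return accuracy, precision, recall, and F1; undefined ratios fall back to 0.0."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def best_model(results: dict, key: str = "f1") -> str:
    """Pick the model name with the highest value of the chosen metric."""
    scores = {name: metrics(c) for name, c in results.items()}
    return max(scores, key=lambda name: scores[name][key])


# Illustrative counts only (assumption): two hypothetical classifiers on 100 test cases.
example = {
    "model_a": Confusion(tp=40, fp=10, fn=10, tn=40),
    "model_b": Confusion(tp=45, fp=5, fn=5, tn=45),
}

if __name__ == "__main__":
    for name, c in example.items():
        print(name, metrics(c))
    print("selected:", best_model(example))
```

Ranking by F1 is one design choice among several. In a diagnostic setting a missed malignancy is usually costlier than a false alarm, so passing `key="recall"` to `best_model` may be the more appropriate selection criterion.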