Detecting Breast Cancer through Blood Analysis Data using Classification Algorithms

Journal of Artificial Intelligence and Data Mining. Pub Date: 2021-07-10. DOI: 10.22044/JADM.2021.9839.2116

Oladosu Oyebisi Oladimeji, O. Oladimeji


Citations: 2

Abstract


Breast cancer is the second major cause of death and accounts for 16% of all cancer deaths worldwide. Most methods of detecting breast cancer, such as mammography, are very expensive and difficult to interpret. They also have limitations, including cumulative radiation exposure, over-diagnosis, and false positives and negatives in women with dense breasts, which create uncertainty in high-risk populations. The objective of this study is to detect breast cancer from blood analysis data using classification algorithms, as a complement to these expensive methods. High-ranking features were extracted from the dataset. The KNN, SVM and J48 algorithms were trained to classify 116 instances. Both 10-fold cross-validation and holdout procedures were used, with the random seed varied between runs. KNN achieved the highest accuracy, 89.99% for cross-validation and 85.21% for holdout. J48 followed with 84.65% and 75.65%, and SVM reached 77.58% and 68.69%, respectively. Blood glucose level was found to be a major determinant in detecting breast cancer, but because other health issues such as diabetes also affect it, it has to be combined with other attributes to make a decision. Based on the model generated in this study, women are advised to have regular check-ups, including blood analysis, to learn which blood components need to be worked on to prevent breast cancer.
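The evaluation protocol described in the abstract (10-fold cross-validation and holdout, each repeated with different random seeds) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the abstract does not state the tooling, the dataset, the value of k, or the holdout split, so the synthetic two-feature data, `k=3`, the 34% test fraction, and all function names below are assumptions made purely for illustration. Only KNN is shown; SVM and J48 would slot into the same evaluation loop.

```python
import random
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, x, k=3):
    """Return the majority label among the k training points nearest to x."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y
               for x, y in zip(test_X, test_y))
    return hits / len(test_y)


def _split(X, y, train_idx, test_idx):
    """Materialise train/test lists from index lists."""
    return ([X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in test_idx], [y[i] for i in test_idx])


def cross_val_accuracy(X, y, folds=10, seed=0, k=3):
    """Mean accuracy over `folds` folds after a seeded shuffle of the indices."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    scores = []
    for f in range(folds):
        test_idx = idx[f::folds]  # every folds-th shuffled index forms one fold
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        scores.append(accuracy(*_split(X, y, train_idx, test_idx), k=k))
    return sum(scores) / folds


def holdout_accuracy(X, y, test_fraction=0.34, seed=0, k=3):
    """Accuracy on a single seeded train/test split (fraction is an assumption)."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(len(X) * test_fraction))
    return accuracy(*_split(X, y, idx[n_test:], idx[:n_test]), k=k)


def make_toy_data(n=116, seed=42):
    """Synthetic two-class data with 116 rows; NOT the study's blood-analysis data."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 3.0
        X.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    seeds = range(1, 6)  # vary the random seed between runs
    cv = [cross_val_accuracy(X, y, seed=s) for s in seeds]
    ho = [holdout_accuracy(X, y, seed=s) for s in seeds]
    print(f"10-fold CV accuracy, mean over seeds: {sum(cv) / len(cv):.3f}")
    print(f"Holdout accuracy, mean over seeds:    {sum(ho) / len(ho):.3f}")
```

Averaging over several seeds matters on a dataset this small: with 116 instances a single holdout split tests only a few dozen rows, so one seed can shift the reported accuracy by several points. The numbers this sketch prints come from synthetic data and are not comparable to the accuracies reported in the abstract.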

