Breast Cancer Diagnosis using Simultaneous Feature Selection and Classification: A Genetic Programming Approach

Harshit Bhardwaj, Aditi Sakalle, Arpit Bhardwaj, Aruna Tiwari, M. Verma
Published in: 2018 IEEE Symposium Series on Computational Intelligence (SSCI), 2018-11-01
DOI: 10.1109/SSCI.2018.8628935 (https://doi.org/10.1109/SSCI.2018.8628935)
Citations: 7

Abstract

Breast cancer is the most prevalent cancer among women worldwide and is becoming a leading cause of death among them. Early detection and effective treatment are the only means of reducing breast cancer mortality. Because of their effective classification and high diagnostic capability, expert systems are gaining popularity in this field. However, machine learning algorithms cannot achieve the desired performance when the dataset contains redundant and irrelevant features. Therefore, this paper proposes a simultaneous feature selection and classification technique using Genetic Programming (GPsfsc) for breast cancer diagnosis. To demonstrate our results, we took the Wisconsin Breast Cancer (WBC) and Wisconsin Diagnostic Breast Cancer (WDBC) databases from the UCI Machine Learning Repository and compared the classification accuracy, sensitivity, specificity, confusion matrix, and Mann-Whitney test results of GONN with those of the classical multi-tree GP algorithm for feature selection (GPmtfs). The experimental results on the WBC and WDBC datasets show that the proposed method produces better classification accuracy with fewer features. Therefore, our proposed method is of great significance and can serve as a first-rate clinical tool for the detection of breast cancer.
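The evaluation measures named in the abstract (confusion matrix, classification accuracy, sensitivity, specificity) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the label encoding (1 = malignant, 0 = benign), and the toy label lists are assumptions made purely for illustration, and no figures from the paper are reproduced.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> Dict[str, int]:
    """Count true/false positives and negatives for a binary classifier.

    Assumption: `positive` marks the malignant class (hypothetical encoding).
    """
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def diagnostic_metrics(counts: Dict[str, int]) -> Dict[str, float]:
    """Derive accuracy, sensitivity, and specificity from confusion counts."""
    tp, fp, tn, fn = counts["tp"], counts["fp"], counts["tn"], counts["fn"]
    total = tp + fp + tn + fn
    return {
        # Fraction of all cases classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Fraction of malignant cases that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Fraction of benign cases that were correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Toy labels invented for illustration only (not data from WBC or WDBC).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

counts = confusion_counts(y_true, y_pred)
metrics = diagnostic_metrics(counts)
print(counts)
print(metrics)
```

In a diagnostic setting, sensitivity and specificity are usually reported alongside accuracy because a missed malignant case and a false alarm carry different costs, which a single accuracy figure hides.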