Backward Elimination for Feature Selection on Breast Cancer Classification Using Logistic Regression and Support Vector Machine Algorithms

Salsha Farahdiba, Dwi Kartini, Radityo Adi Nugroho, Rudy Herteno, Triando Hamonangan Saragih
Journal: IJCCS Indonesian Journal of Computing and Cybernetics Systems, volume 7 6 (as listed)
Published: 2023-10-31
DOI: 10.22146/ijccs.88926 (https://doi.org/10.22146/ijccs.88926)

Abstract

Breast cancer is a prevalent cancer that affects women worldwide. One way to help reduce breast cancer mortality is a detection system that determines whether a tumor is benign or malignant. Logistic Regression and Support Vector Machine (SVM) classifiers are often used for this task, but they often give suboptimal results on datasets with many features, so Backward Elimination feature selection is applied as an additional step to improve classification performance. Logistic Regression and SVM were compared, with feature selection applied to breast cancer data, to identify the best model. The breast cancer dataset has 30 features and two classes, Benign and Malignant. Backward Elimination reduced the feature set from 30 to 13 features, improving the performance of both classification models. The best classification was obtained with Backward Elimination feature selection and a linear-kernel SVM: accuracy rose from 96.14% to 97.02%, precision from 98.06% to 99.49%, recall from 90.48% to 92.38%, and AUC from 0.95 to 0.96.
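
The workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the elimination criterion, the validation protocol, or the exact data source, so the sketch assumes scikit-learn's bundled Wisconsin Diagnostic dataset (which also has 30 features and two classes), a cross-validated wrapper-style backward search via `SequentialFeatureSelector`, a single stratified train/test split, and malignant as the positive class. The helper `evaluate` and all hyperparameters are illustrative choices, and the scores it prints are not expected to match the paper's figures.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: scikit-learn's bundled dataset stands in for the paper's data.
X, y = load_breast_cancer(return_X_y=True)  # 30 features; 0 = malignant, 1 = benign
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Standardize using statistics from the training split only.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Backward elimination: start from all 30 features and repeatedly drop the one
# whose removal hurts cross-validated accuracy least, until 13 remain.
selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select=13,
    direction="backward",
    cv=3,
)
selector.fit(X_train_s, y_train)
mask = selector.get_support()  # boolean mask over the 30 original features


def evaluate(model, X_tr, X_te):
    """Fit on the training split and score on the held-out split."""
    model.fit(X_tr, y_train)
    pred = model.predict(X_te)
    score = model.decision_function(X_te)
    return {
        "accuracy": accuracy_score(y_test, pred),
        # Assumption: malignant (label 0) is treated as the positive class.
        "precision": precision_score(y_test, pred, pos_label=0),
        "recall": recall_score(y_test, pred, pos_label=0),
        "auc": roc_auc_score(y_test, score),
    }


models = {
    "logreg": lambda: LogisticRegression(max_iter=1000),
    "linear_svm": lambda: SVC(kernel="linear"),
}
results = {}
for name, make_model in models.items():
    results[name, "all_30"] = evaluate(make_model(), X_train_s, X_test_s)
    results[name, "selected_13"] = evaluate(
        make_model(), X_train_s[:, mask], X_test_s[:, mask])

for key, metrics in results.items():
    print(key, {k: round(v, 4) for k, v in metrics.items()})
```

Selecting features on the training split alone keeps the held-out scores free of selection leakage. A p-value-based backward elimination (dropping the least significant coefficient at each step) is another common variant and would slot into the same structure.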