Breast Cancer Classification Using the SVM Algorithm with RBF, Linear, and Sigmoid Kernels

Ginanjar Abdurrahman
Journal: JUSTIFY : Jurnal Sistem Informasi Ibrahimy
DOI: 10.35316/justify.v2i1.3370 (https://doi.org/10.35316/justify.v2i1.3370)
Publication date: 2023-07-20
Publication type: Journal Article
Citations: 1

Abstract

Original title (Indonesian): Klasifikasi Kanker Payudara Menggunakan Algoritma SVM dengan Kernel RBF, Linier, dan Sigmoid

Breast cancer ranks first in both the gender category and the death rate. Treatment often begins late in breast cancer cases, which increases the risk associated with this cancer. Early detection is therefore needed so that treatment can start in time and the death rate from breast cancer can be reduced. This article approaches early detection of breast cancer as a classification problem. The study uses the Wisconsin breast cancer dataset taken from Kaggle. The raw dataset contains missing values and its categorical data is not yet in numeric form, so preprocessing is required: missing values are imputed and categorical data is encoded as numeric data. The dataset is split into 80% training data and 20% testing data. The preprocessed data is then classified using SVM with three different kernels: linear, RBF, and sigmoid. In the reported results, the linear kernel gives the best classification performance, with an accuracy of up to 99%, followed by the RBF kernel at 92% and the sigmoid kernel at 41%.
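The steps named in the abstract (imputation, encoding, an 80/20 split, and three kernels) can be sketched in plain Python. This is a minimal illustration and not the author's code. The helper names, the choice of mean imputation, the `gamma` and `coef0` values, and the toy data are all assumptions, since the abstract does not state which imputation method, encoding scheme, or kernel parameters were used. The kernel functions follow the standard textbook definitions.

```python
import math
import random


def impute_mean(column):
    """Replace None entries with the mean of the observed values (assumed strategy)."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


def encode_labels(labels):
    """Map each distinct category to an integer, assigned in sorted order."""
    mapping = {cat: i for i, cat in enumerate(sorted(set(labels)))}
    return [mapping[c] for c in labels], mapping


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out test_fraction of them for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # k(x, y) = x . y
    return dot(x, y)


def rbf_kernel(x, y, gamma=0.5):
    # k(x, y) = exp(-gamma * ||x - y||^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, y, gamma=0.5, coef0=0.0):
    # k(x, y) = tanh(gamma * (x . y) + coef0)
    return math.tanh(gamma * dot(x, y) + coef0)


# Toy data invented for illustration; these are not values from the study.
radius = impute_mean([14.0, None, 20.0, 12.0, 18.0])
diagnosis, mapping = encode_labels(["B", "M", "M", "B", "M"])
train, test = train_test_split(list(zip(radius, diagnosis)))
```

In practice an SVM library would take the kernel as a parameter rather than requiring hand-written kernel functions. For example, scikit-learn's `SVC` accepts `kernel="linear"`, `kernel="rbf"`, or `kernel="sigmoid"`. Whether the study used that library is not stated in the abstract.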