{"title":"On breast cancer detection: an application of machine learning algorithms on the wisconsin diagnostic dataset","authors":"Abien Fred Agarap","doi":"10.1145/3184066.3184080","DOIUrl":null,"url":null,"abstract":"This paper presents a comparison of six machine learning (ML) algorithms: GRU-SVM[1], Linear Regression, Multilayer Perceptron (MLP), Nearest Neighbor (NN) search, Softmax Regression, and Support Vector Machine (SVM) on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset[2] by measuring their classification test accuracy, and their sensitivity and specificity values. The said dataset consists of features which were computed from digitized images of FNA tests on a breast mass[2]. For the implementation of the ML algorithms, the dataset was partitioned in the following fashion: 70% for training phase, and 30% for the testing phase. The hyper-parameters used for all the classifiers were manually assigned. Results show that all the presented ML algorithms performed well (all exceeded 90% test accuracy) on the classification task. The MLP algorithm stands out among the implemented algorithms with a test accuracy of ≈99.04%.","PeriodicalId":109559,"journal":{"name":"International Conference on Machine Learning and Soft Computing","volume":"114 5-6","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"186","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Machine Learning and Soft Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3184066.3184080","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 186
Abstract
This paper presents a comparison of six machine learning (ML) algorithms: GRU-SVM[1], Linear Regression, Multilayer Perceptron (MLP), Nearest Neighbor (NN) search, Softmax Regression, and Support Vector Machine (SVM) on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset[2] by measuring their classification test accuracy, and their sensitivity and specificity values. The said dataset consists of features which were computed from digitized images of FNA tests on a breast mass[2]. For the implementation of the ML algorithms, the dataset was partitioned in the following fashion: 70% for training phase, and 30% for the testing phase. The hyper-parameters used for all the classifiers were manually assigned. Results show that all the presented ML algorithms performed well (all exceeded 90% test accuracy) on the classification task. The MLP algorithm stands out among the implemented algorithms with a test accuracy of ≈99.04%.