Performance Analysis of Supervised Classifiers Using PCA Based Techniques on Breast Cancer
Authors: Zohaib Mushtaq, Akbari Yaqub, Ali Hassan, S. Su
Published in: 2019 International Conference on Engineering and Emerging Technologies (ICEET), 2019-02-01
DOI: 10.1109/CEET1.2019.8711868 (https://doi.org/10.1109/CEET1.2019.8711868)
Citations: 13
Abstract
The focus of this paper is to distinguish tumorous (malignant) from non-tumorous (benign) cases in the dataset. The Wisconsin breast cancer data (WBCD), taken from the UCI machine learning repository, is used. The most popular supervised learning classifiers are applied together with PCA-based dimensionality reduction techniques. Support Vector Machine, K Nearest Neighbor, Decision Tree, Naïve Bayes and Logistic Regression are combined with PCA variants based on linear, sigmoid, cosine, polynomial and radial basis function kernels. Numerous performance metrics are computed from the confusion matrix: accuracy, sensitivity, specificity, false positive rate, false omission rate, precision, prevalence, F1-score, negative predictive value, false negative rate, false discovery rate and markedness. Our best-performing models are then compared with other existing models. Naïve Bayes with sigmoid-kernel PCA exhibits the best accuracy, 99.20%. K Nearest Neighbor also performs very well with all kernel PCA based techniques. Accuracy ranges from 96.4% to 97.8%.
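The metrics listed in the abstract all follow from the four counts of a binary confusion matrix. The sketch below shows one way to derive them, using their standard textbook definitions and treating malignant as the positive class. The function name `confusion_metrics`, the helper `_ratio`, and the example counts are hypothetical and are not taken from the paper, whose own implementation is not described here.

```python
import math


def _ratio(numerator, denominator):
    # Hypothetical helper: return NaN instead of raising when a denominator is zero.
    return numerator / denominator if denominator else math.nan


def confusion_metrics(tp, fp, fn, tn):
    """Derive standard binary-classification metrics from confusion-matrix counts.

    tp, fp, fn, tn are the true-positive, false-positive, false-negative and
    true-negative counts, with the positive class assumed to be "malignant".
    """
    total = tp + fp + fn + tn
    sensitivity = _ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)   # true negative rate
    precision = _ratio(tp, tp + fp)     # positive predictive value
    npv = _ratio(tn, tn + fn)           # negative predictive value
    return {
        "accuracy": _ratio(tp + tn, total),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": _ratio(fp, fp + tn),
        "false_negative_rate": _ratio(fn, fn + tp),
        "false_omission_rate": _ratio(fn, fn + tn),
        "false_discovery_rate": _ratio(fp, fp + tp),
        "precision": precision,
        "negative_predictive_value": npv,
        "prevalence": _ratio(tp + fn, total),
        "f1_score": _ratio(2 * precision * sensitivity, precision + sensitivity),
        "markedness": precision + npv - 1,
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; these are not results from the paper.
    for name, value in confusion_metrics(tp=40, fp=2, fn=10, tn=48).items():
        print(f"{name}: {value:.4f}")
```

Several of these quantities are complements of one another (for example, false positive rate is 1 minus specificity, and false omission rate is 1 minus negative predictive value), so reporting all twelve adds no information beyond the four underlying counts.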