Impact of Classification Algorithms on Cervical Cancer Dataset
N. Sangavi, V. R. Kiruthika, K. Premalatha
2022 International Conference on Sustainable Computing and Data Communication Systems (ICSCDS), published 2022-04-07
DOI: 10.1109/ICSCDS53736.2022.9760715 (https://doi.org/10.1109/ICSCDS53736.2022.9760715)
Citations: 0
Abstract
Data mining is now used across a wide range of domains to extract knowledge from datasets. In medicine, cancer is among the most widespread diseases worldwide, and cervical cancer in particular affects women. To analyze its symptoms effectively and to support prevention, this work analyzes cervical cancer in women using classification algorithms: neural network, decision tree, random forest, SVM, and linear regression. Data preprocessing and feature selection are applied to the features present in the dataset. The target variable of the cervical cancer dataset is whether or not the person is affected by cervical cancer. The classification algorithms are evaluated with accuracy, specificity, sensitivity (recall), and F-measure, each calculated from the confusion matrix values: true positives, true negatives, false positives, and false negatives. Based on the performance measures calculated for each model, Random Forest is the best-suited model, with 80% accuracy.
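The performance measures named in the abstract can all be derived from the four confusion matrix counts. The following is a minimal sketch of the standard textbook definitions, not the authors' code. The names `ConfusionMatrix` and `performance_measures` are hypothetical, the counts in the example are made up for illustration and are not results from the paper, and precision is computed only because the F-measure requires it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (positive = affected by cervical cancer)."""
    tp: int  # true positives
    tn: int  # true negatives
    fp: int  # false positives
    fn: int  # false negatives


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def performance_measures(cm: ConfusionMatrix) -> dict:
    """Compute the standard measures from the confusion matrix counts."""
    total = cm.tp + cm.tn + cm.fp + cm.fn
    sensitivity = _ratio(cm.tp, cm.tp + cm.fn)  # also called recall
    specificity = _ratio(cm.tn, cm.tn + cm.fp)
    precision = _ratio(cm.tp, cm.tp + cm.fp)    # needed for the F-measure
    f_measure = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": _ratio(cm.tp + cm.tn, total),
        "specificity": specificity,
        "sensitivity": sensitivity,
        "precision": precision,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Illustrative counts only; these are NOT values reported in the paper.
    example = ConfusionMatrix(tp=8, tn=9, fp=1, fn=2)
    for name, value in performance_measures(example).items():
        print(f"{name}: {value:.3f}")
```

Sensitivity and recall are the same quantity, TP / (TP + FN), so the sketch reports it once. Returning 0.0 for an undefined ratio is a design choice of this sketch; a real evaluation might prefer to flag such cases instead.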