Unlocking the Potential of Machine Learning for Accurate Diagnosis of Breast Cancer
Authors: Rinku Soni, Saeedah Zaina, Dr. Y.L. Malathi Latha
Published in: 2023 3rd International Conference on Intelligent Technologies (CONIT)
Publication date: 2023-06-23
DOI: 10.1109/CONIT59222.2023.10205897
URL: https://doi.org/10.1109/CONIT59222.2023.10205897
Citation count: 0
Abstract
Breast cancer is a major health concern affecting women globally, and early detection is crucial for successful treatment. Machine learning is a promising strategy for improving the accuracy of breast cancer diagnosis and reducing diagnostic errors. This research aims to improve diagnostic accuracy by balancing the data and comparing machine learning classifiers with and without feature selection. We used the Wisconsin Diagnostic Breast Cancer (WDBC) dataset and balanced it with both oversampling and undersampling techniques. We evaluated eight classification models and five feature selection techniques, comparing classifier performance on undersampled and oversampled data, with and without feature selection. MLflow was used to monitor the algorithms and keep a record of their performance. Our results show that oversampling improved model performance more than undersampling did. Among the models compared, Logistic Regression achieved the highest accuracy on the oversampled data without feature selection. Feature selection yielded slightly lower accuracy than the base model, which suggests that any benefit it offered did not compensate for the information lost by removing features. The study underscores the efficacy of machine learning in breast cancer diagnosis and highlights the potential of machine learning algorithms to improve the accuracy of cancer detection.
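The balancing step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which oversampling or undersampling methods the authors used, so plain random resampling is assumed here; the helper names `random_oversample` and `random_undersample` are hypothetical. The rows are placeholders rather than real features; only the class counts (357 benign, 212 malignant) follow the WDBC dataset.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every
    class is as large as the largest one (assumed method, not the paper's)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to the smallest class
    (assumed method, not the paper's)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        pool = [r for r, y in zip(rows, labels) if y == cls]
        out_rows.extend(rng.sample(pool, target))
        out_labels.extend([cls] * target)
    return out_rows, out_labels


# Placeholder row ids standing in for WDBC feature vectors; only the
# class counts (357 benign "B", 212 malignant "M") mirror the dataset.
labels = ["B"] * 357 + ["M"] * 212
rows = list(range(len(labels)))

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)

print(Counter(over_labels))   # both classes at 357, 714 rows in total
print(Counter(under_labels))  # both classes at 212, 424 rows in total
```

Under these assumptions, undersampling discards 145 of the 357 benign rows, while oversampling keeps every original row and adds duplicates of malignant ones. That difference in retained data is one plausible reading of why oversampling performed better in the study, though the abstract does not give the reason. In practice, resampling is usually applied to the training split only, so that duplicated rows do not leak into the evaluation data.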