Breast Cancer Classification Using an Extreme Gradient Boosting Model with F-Score Feature Selection Technique
Author: Tina Elizabeth Mathew
Journal: Journal of Advances in Information Technology (JCR Q4, Computer Science, Information Systems; impact factor 0.9)
Published: 2023-01-01
DOI: 10.12720/jait.14.2.363-372 (https://doi.org/10.12720/jait.14.2.363-372)
Citations: 3
Abstract
Breast cancer is considered the most problematic of all cancers affecting women. With high incidence and mortality rates, it is ranked as the most significant health hazard for women globally. Early detection of the disease is key to patient survival. Several medical techniques, including mammography, magnetic resonance imaging, and thermography, are available to detect the disease. However, these techniques can cause the patient considerable stress and pain, and may expose them to harmful radiation. Other categories of techniques can therefore be used for early detection, and machine-learning-assisted detection and classification is one such alternative. This paper proposes a hyperparameter-optimized extreme gradient boosting (XGBoost) model combined with F-Score feature selection and uses it to classify breast tumors as malignant or benign on the Wisconsin Breast Cancer dataset. Feature importance is measured with the F-Score, which is used to select the features most relevant to the target variable, and classification is performed on the selected features. Experiments were run with different training-testing partitions; the proposed XGBoost and F-Score model performed best on the 80-20 partition, with an accuracy of 99.27%.
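The selection step described in the abstract can be illustrated with a minimal sketch. It assumes that "F-Score" means the split-count importance that XGBoost reports as "F score" (how many times a feature is used to split a node across all trees); the abstract does not confirm this reading. The toy trees, the feature names `f0` to `f3`, and the helpers `f_scores`, `select_top_k`, and `split_indices` are hypothetical and are not taken from the paper. No real data or trained model is involved.

```python
import random
from collections import Counter

# Hypothetical toy ensemble standing in for a trained boosted model.
# Internal nodes carry a "feature" key; leaves carry a "leaf" key.
TREES = [
    {"feature": "f2",
     "left": {"feature": "f0", "left": {"leaf": -0.4}, "right": {"leaf": 0.1}},
     "right": {"leaf": 0.5}},
    {"feature": "f2",
     "left": {"leaf": -0.3},
     "right": {"feature": "f3", "left": {"leaf": 0.2}, "right": {"leaf": 0.6}}},
    {"feature": "f0",
     "left": {"leaf": -0.2},
     "right": {"feature": "f2", "left": {"leaf": 0.1}, "right": {"leaf": 0.7}}},
]


def _count_splits(node, counts):
    """Walk one tree and count how often each feature is used for a split."""
    if "leaf" in node:
        return
    counts[node["feature"]] += 1
    _count_splits(node["left"], counts)
    _count_splits(node["right"], counts)


def f_scores(trees):
    """Split-count importance per feature, summed over all trees."""
    counts = Counter()
    for tree in trees:
        _count_splits(tree, counts)
    return dict(counts)


def select_top_k(scores, k):
    """Return the k highest-scoring feature names (ties broken by name)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and cut them into train and test partitions."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(round(n_samples * train_fraction))
    return indices[:cut], indices[cut:]


scores = f_scores(TREES)
selected = select_top_k(scores, k=2)
train_idx, test_idx = split_indices(10, train_fraction=0.8)
print(scores, selected, len(train_idx), len(test_idx))
```

In a real pipeline the scores would come from a fitted model rather than hand-written trees, the classifier would be retrained on the selected columns only, and `train_fraction` would be varied to compare partitions such as 80-20 against the others. Features that are never used for a split (here `f1`) simply receive no score.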