Authors: Alexa Xyrel Rey, Aljhen Wahiman, Ferriel Atasan, Gernel S. Lumacad, Shaina Claire Bustamante, Ravien Glanida
Venue: 2023 International Conference in Advances in Power, Signal, and Information Technology (APSIT)
Published: 2023-06-09
DOI: 10.1109/APSIT58554.2023.10201666
Extreme Gradient Boosting with Synthetic Minority Over Sampling Technique for an Improved Breast Cancer Prediction
Breast cancer is a major contributor to global mortality and the second-leading cause of cancer death among women worldwide. Early prediction of breast cancer, by determining whether a tumor is malignant or benign, plays a vital part in improving patient survival outcomes. In this paper, the researchers developed a machine learning (ML) classifier based on an ensemble learning algorithm, extreme gradient boosting (XGBoost), to predict whether a tumor is benign or malignant (cancerous). The researchers integrated the synthetic minority oversampling technique (SMOTE) to address the class imbalance found in the dataset. The dataset used in this study consists of clinical cases of patients from the University of Wisconsin Hospitals. Experimental results showed that the proposed approach outperformed methods reported in previous literature, with an accuracy of 98.87%, a kappa statistic of 0.9774, and an F-score of 0.9890. Further, feature importance analysis showed that, among all input features, the ‘Bare Nuclei’ variable contributed the greatest predictive power in classifying a tumor as malignant or benign. This result is consistent with previous literature, which emphasizes that bare nuclei are more typically seen in benign tumors than in malignant tumors.
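The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic row is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' code. The function name `smote_sketch`, its parameters, and the toy data are assumptions made for illustration only, and a real experiment would more likely rely on a maintained implementation such as the one in the imbalanced-learn package.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between a minority
    sample and one of its k nearest minority-class neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up data (not the Wisconsin dataset): two features per row.
majority = [[1, 1], [2, 1], [1, 2], [2, 2], [3, 1], [1, 3]]
minority = [[8, 9], [9, 8], [10, 10]]

# Generate just enough synthetic rows to match the majority class size.
new_rows = smote_sketch(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_rows
print(len(majority), len(balanced_minority))
```

Because the synthetic rows are interpolated rather than duplicated, they stay inside the region already spanned by the minority class. Oversampling of this kind is normally applied to the training split only, so that evaluation metrics such as accuracy, kappa, and F-score are computed on untouched data; the abstract does not state how the authors handled this.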