Authors: Sheng Zhou, Chujiao Hu, Shanshan Wei, Xiaofan Yan
Journal: Technology in Cancer Research & Treatment (JCR Q3, Oncology; impact factor 2.7)
Publication date: 2024-04-09 (Journal Article)
DOI: https://doi.org/10.1177/15330338241234791
Breast Cancer Prediction Based on Multiple Machine Learning Algorithms
Introduction: The incidence of breast cancer has risen steadily over the years owing to changes in lifestyle and environment. Breast cancer is now one of the leading causes of cancer-related death among women, making it a major global public health concern. An automated diagnostic system for breast cancer is therefore of great importance to the medical community.
Objectives: This study analyses the Wisconsin breast cancer dataset and develops a machine learning algorithm for accurately classifying breast cancer as benign or malignant.
Methods: This retrospective study aims to develop a high-precision classification algorithm for benign and malignant breast cancer. We first preprocessed the dataset using standard techniques such as feature scaling and handling of missing values. We then assessed the normality of the data distribution and, in light of the normality test results, chose Spearman correlation analysis to examine the relationship between the feature data and the labels. Next, we used the Wilcoxon rank sum test to investigate differences in distribution among the breast cancer feature data. Based on these statistical results, we constructed the feature subset and trained 7 machine learning algorithms, including decision tree, stochastic gradient descent, random forest, support vector machine, logistic regression, and AdaBoost.
Results: The evaluation showed that the AdaBoost-Logistic algorithm achieved an accuracy of 99.12%, outperforming the other 6 algorithms and previous techniques.
Conclusion: The constructed AdaBoost-Logistic algorithm achieves high precision on the Wisconsin breast cancer dataset, with commendable classification performance for both benign and malignant breast cancer cases.
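The statistical screening step described in the Methods (Spearman correlation against the label, followed by a Wilcoxon rank sum test between the two classes) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the feature names, the 0.05 threshold, and the ten toy values are all assumptions made for the example, and the rank sum p-value uses a plain normal approximation without tie correction.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rank_sum_z(group_a, group_b):
    """Wilcoxon rank sum z statistic (normal approximation, no tie correction)."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    w = sum(ranks[:n1])  # rank sum of the first group
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (w - mean_w) / sd_w


def two_sided_p(z):
    """Two-sided p-value of a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def screen(features, labels, alpha=0.05):
    """Keep features whose class distributions differ at level alpha (assumed)."""
    selected = []
    for name, values in features.items():
        rho = spearman(values, labels)
        benign = [v for v, y in zip(values, labels) if y == 0]
        malignant = [v for v, y in zip(values, labels) if y == 1]
        p = two_sided_p(rank_sum_z(malignant, benign))
        print(f"{name}: rho={rho:.3f}, p={p:.3f}")
        if p < alpha:
            selected.append(name)
    return selected


# Toy, made-up values for illustration only; NOT taken from the Wisconsin dataset.
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # 0 = benign, 1 = malignant
features = {
    "radius_like": [11.2, 12.0, 12.5, 13.1, 13.4, 15.0, 16.8, 17.9, 19.3, 20.6],
    "noise_like": [0.52, 0.31, 0.47, 0.29, 0.55, 0.50, 0.33, 0.45, 0.30, 0.54],
}
selected = screen(features, labels)
print("selected:", selected)
```

On the toy data, the feature whose values separate the two classes passes the screen and the one that does not is dropped. For real data, an established implementation with tie handling and exact small-sample p-values would be preferable to this hand-rolled approximation.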
About the journal:
Technology in Cancer Research & Treatment (TCRT) is a JCR-ranked, broad-spectrum, open access, peer-reviewed publication whose aim is to provide researchers and clinicians with a platform to share and discuss developments in the prevention, diagnosis, treatment, and monitoring of cancer.