Breast Cancer Data Analysis using Machine Learning Approaches
Authors: Sidhant Mallick, Rasmita Dash, Rajashree Dash, Rasmita Rautray
Published in: 2021 International Conference in Advances in Power, Signal, and Information Technology (APSIT), 2021-10-08
DOI: 10.1109/APSIT52773.2021.9641294 (https://doi.org/10.1109/APSIT52773.2021.9641294)
Cancer is one of the leading causes of death. Lung cancer is the most common cancer, and breast cancer is the second most common cancer found in women. Sophisticated techniques are therefore needed to support these patients and to handle the data generated from them. This work focuses on breast cancer prediction, classifying a tumor as malignant or benign. Several machine learning models (decision trees, logistic regression, random forest, naive Bayes, support vector machines, and artificial neural networks) were trained on preprocessed data. Preprocessing consisted of checking for missing or null data points, encoding categorical variables that hold label values rather than numeric ones, splitting the data set into training and testing sets, and applying feature scaling to bring the features into a common range. In addition, dimensionality reduction methods were applied to some data sets to improve model accuracy. The artificial neural networks were trained with different optimizers to identify the best-performing configuration.
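The preprocessing steps named in the abstract (handling missing values, label encoding, train/test splitting, and feature scaling) can be illustrated with a minimal sketch. The abstract does not describe the implementation, so everything here is an assumption made for illustration: the function names, the toy rows, the 25% test fraction, the choice to drop incomplete rows rather than impute them, and the choice of standardization as the scaling method. Only the Python standard library is used.

```python
import random
from statistics import mean, pstdev


def drop_missing(rows):
    """Discard any row that contains a None (missing) value."""
    return [r for r in rows if all(v is not None for v in r)]


def encode_labels(labels):
    """Map categorical labels to integers in sorted order (here B -> 0, M -> 1)."""
    mapping = {name: i for i, name in enumerate(sorted(set(labels)))}
    return [mapping[label] for label in labels], mapping


def train_test_split(X, y, test_fraction=0.25, seed=0):
    """Shuffle row indices with a fixed seed and hold out a fraction for testing."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(len(X) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train_idx], [X[i] for i in test_idx],
            [y[i] for i in train_idx], [y[i] for i in test_idx])


def fit_scaler(X_train):
    """Compute the per-feature mean and standard deviation on the training rows."""
    cols = list(zip(*X_train))
    means = [mean(c) for c in cols]
    stds = [pstdev(c) or 1.0 for c in cols]  # guard against constant features
    return means, stds


def apply_scaler(X, means, stds):
    """Standardize each feature using previously fitted statistics."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


# Hypothetical toy rows: two numeric features and a label (M = malignant, B = benign).
raw = [
    (14.2, 20.1, "M"),
    (11.0, 15.3, "B"),
    (None, 18.0, "B"),  # incomplete row, dropped below
    (17.5, 22.4, "M"),
    (12.1, 14.9, "B"),
    (13.3, 19.0, "M"),
    (10.4, 16.2, "B"),
    (16.8, 21.7, "M"),
    (11.7, 17.5, "B"),
]

rows = drop_missing(raw)
X = [list(r[:-1]) for r in rows]
y, label_map = encode_labels([r[-1] for r in rows])
X_train, X_test, y_train, y_test = train_test_split(X, y)
means, stds = fit_scaler(X_train)
X_train_scaled = apply_scaler(X_train, means, stds)
X_test_scaled = apply_scaler(X_test, means, stds)
```

In this sketch the scaling statistics are fitted on the training rows only and then reused for the test rows, so no information from the held-out data influences preprocessing. That is a common design choice, not something the abstract states.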