Machine Learning for Breast Cancer Classification With ANN and Decision Tree

2020 11th IEEE Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), Pub Date: 2020-11-04, DOI: 10.1109/IEMCON51383.2020.9284936

Reetodeep Hazra, Megha Banerjee, L. Badia


Citations: 14

Abstract



Breast cancer is one of the most common causes of cancer death in women. It develops when malignant lumps begin to form from breast cells, and unfortunately most diagnoses happen at later stages, resulting in low chances of survival for the patient. For early detection and prognosis, it is therefore necessary to determine whether a lump is benign or malignant. In this paper, Artificial Neural Network (ANN) and Decision Tree (DT) classifiers are used to develop a machine learning (ML) model on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset, in order to evaluate the attributes of a breast cancer at early stages and classify it as malignant or benign. In the proposed scheme, feature selection and feature extraction are applied to obtain statistical features from the dataset, and the models are compared on their performance to identify the most suitable approach for diagnosis. The dataset is partitioned into several train-test splits. The performance of the framework is evaluated in terms of accuracy, sensitivity, specificity, precision, and recall. The binary classifier achieved a maximum accuracy of 98.55%.
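The evaluation metrics named in the abstract can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' code: the function name `binary_metrics`, the `BinaryMetrics` container, and the label vectors are hypothetical and chosen only for illustration. Note that sensitivity and recall are the same quantity, TP / (TP + FN), so the sketch reports it once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    sensitivity: float  # identical to recall: TP / (TP + FN)
    specificity: float  # TN / (TN + FP)
    precision: float    # TP / (TP + FP)


def binary_metrics(y_true, y_pred, positive=1):
    """Compute the metrics from paired true and predicted labels.

    `positive` marks the malignant class (an assumption of this sketch).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Guard against empty classes instead of raising ZeroDivisionError.
        return num / den if den else 0.0

    return BinaryMetrics(
        accuracy=ratio(tp + tn, len(pairs)),
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
        precision=ratio(tp, tp + fp),
    )


# Made-up labels for illustration only (1 = malignant, 0 = benign).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
m = binary_metrics(y_true, y_pred)
print(m)
```

In a diagnostic setting, sensitivity is usually the metric to watch most closely, since a false negative corresponds to a missed malignant case; accuracy alone can hide that on an imbalanced dataset.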
