A Novel Approach to Perform Analysis and Prediction on Breast Cancer Dataset using R

S. M. Basha, D. Rajput, N. Iyengar, Ronnie D. Caytiles
Journal: International Journal of Grid and Distributed Computing. Published: 2018-02-28. DOI: 10.14257/IJGDC.2018.11.2.05 (https://doi.org/10.14257/IJGDC.2018.11.2.05)
Citations: 15

Abstract

Screening affects the cancer mortality rate by reducing the number of advanced cancers with poor diagnosis, while cancer treatment works by reducing the case-fatality rate. Predicting breast cancer survivability has been a challenging research problem for many researchers. The objective of this work is to propose a novel model that can analyze breast cancer data and make efficient predictions. The contributions of this paper are as follows. We collected three different datasets from the UCI Machine Learning Repository. We propose an approach in which feature selection algorithms are compared in detail. We trained models on the datasets using the Decision Tree, Random Forest, and Support Vector Machine (SVM) machine learning algorithms. We attempted to understand the impact of the model selection metric in predicting different classes of breast cancer. The results indicate that Random Forest is the best predictor, with 0.98 accuracy on the holdout sample; SVM is second with 0.97 accuracy; the Decision Tree reached 0.96; and the condition tree was the worst of the four, with 0.95 accuracy. Finally, we performed prediction using a neural network with three hidden layers and measured its efficiency using Root Mean Square Error (RMSE) along with its variations.
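The two evaluation measures named in the abstract, accuracy on a holdout sample and RMSE, can be illustrated with a minimal sketch. The paper works in R; this sketch uses plain Python only to show the arithmetic. The function names (`holdout_split`, `accuracy`, `rmse`), the 30% holdout fraction, and the toy labels are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def holdout_split(rows, test_fraction=0.3, seed=0):
    """Shuffle rows and split them into (train, holdout) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between numeric targets and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up labels (1 = malignant, 0 = benign); not data from the paper.
y_true = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1]

train, holdout = holdout_split(list(range(10)))
print(len(train), len(holdout))  # 7 3
print(accuracy(y_true, y_pred))  # 0.9
print(rmse([1.0, 0.0, 1.0, 0.0], [0.5, 0.5, 0.5, 0.5]))  # 0.5
```

In practice the predictions would come from a fitted classifier or neural network evaluated only on the holdout rows, so that the reported score reflects data the model did not see during training.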
Source journal
International Journal of Grid and Distributed Computing (Computer Science, Software Engineering)
Journal description: IJGDC aims to facilitate and support research related to control and automation technology and its applications. Our journal gives academic and industry professionals a chance to discuss recent progress in the area of control and automation. To bridge the gap for users who lack access to major databases that charge for every downloaded article, this online publication platform is open to all readers as part of our commitment to the global scientific community.
Journal topics:
- Architectures and Fabrics
- Autonomic and Adaptive Systems
- Cluster and Grid Integration
- Creation and Management of Virtual Enterprises and Organizations
- Dependable and Survivable Distributed Systems
- Distributed and Large-Scale Data Access and Management
- Distributed Multimedia Systems
- Distributed Trust Management
- eScience and eBusiness Applications
- Fuzzy Algorithm
- Grid Economy and Business Models
- Histogram Methodology
- Image or Speech Filtering
- Image or Speech Recognition
- Information Services
- Large-Scale Group Communication
- Metadata, Ontologies, and Provenance
- Middleware and Toolkits
- Monitoring, Management and Organization Tools
- Networking and Security
- Novel Distributed Applications
- Performance Measurement and Modeling
- Pervasive Computing
- Problem Solving Environments
- Programming Models, Tools and Environments
- QoS and Resource Management
- Real-time and Embedded Systems
- Security and Trust in Grid and Distributed Systems
- Sensor Networks
- Utility Computing on Global Grids
- Web Services and Service-Oriented Architecture
- Wireless and Mobile Ad Hoc Networks
- Workflow and Multi-agent Systems