Breast cancer prediction using supervised machine learning techniques

IF 1.1 Q3 INFORMATION SCIENCE & LIBRARY SCIENCE
P. Dadheech, Vijay H. Kalmani, S. R. Dogiwal, V. Sharma, Ankit Kumar, S. Pandey
Citations: 0

Abstract

Breast cancer is one of the most prevalent diseases in India’s urban regions and the second most common in the country’s rural areas. In India, a woman is diagnosed with breast cancer every four minutes, and a woman dies from the disease every thirteen minutes. Over half of breast cancer patients in India are diagnosed at stage 3 or 4, when survival rates are extremely low; hence there is an urgent need for a rapid detection strategy. To predict whether a patient is at risk for breast cancer, we use machine learning classification techniques, in which a model learns from previous data and then makes predictions on new data. The dataset was collected from the UCI repository and used to build models with Logistic Regression, Support Vector Machines, and Random Forests. The primary goal is to improve the accuracy, precision, and sensitivity of each algorithm used to classify the data, and to compare the algorithms in terms of competency and viability. Random Forest proved the most accurate at classifying breast cancer, with a precision of 98.60 percent in tests. The study is written in the Python programming language and carried out in the Scientific Python Development Environment (Spyder).
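The abstract names accuracy, precision, and sensitivity as its evaluation measures but does not define them. The following is a minimal sketch of how these three quantities are usually computed for a binary classifier. It assumes labels are encoded as 1 for malignant and 0 for benign; the function name `binary_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, and sensitivity for 0/1 labels.

    Assumption: 1 = malignant (positive class), 0 = benign.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn

    return {
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of predicted-malignant cases that are truly malignant.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of truly malignant cases that were detected (recall).
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels made up for illustration only; these are not results from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Accuracy and precision are different quantities, so a figure reported for one does not determine the other. In a screening setting, sensitivity matters in its own right because a false negative is a missed cancer.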
Source journal
JOURNAL OF INFORMATION & OPTIMIZATION SCIENCES
Category: INFORMATION SCIENCE & LIBRARY SCIENCE
Self-citation rate: 21.40%
Articles published: 88