Multi-omics-based Machine Learning for the Subtype Classification of Breast Cancer

IF 2.9 4区 综合性期刊 Q1 Multidisciplinary
Asmaa M. Hassan, Safaa M. Naeem, Mohamed A. A. Eldosoky, Mai S. Mabrouk
{"title":"Multi-omics-based Machine Learning for the Subtype Classification of Breast Cancer","authors":"Asmaa M. Hassan, Safaa M. Naeem, Mohamed A. A. Eldosoky, Mai S. Mabrouk","doi":"10.1007/s13369-024-09341-7","DOIUrl":null,"url":null,"abstract":"<p>Cancer is a complicated disease that produces deregulatory changes in cellular activities (such as proteins). Data from these levels must be integrated into multi-omics analyses to better understand cancer and its progression. Deep learning approaches have recently helped with multi-omics analysis of cancer data. Breast cancer is a prevalent form of cancer among women, resulting from a multitude of clinical, lifestyle, social, and economic factors. The goal of this study was to predict breast cancer using several machine learning methods. We applied the architecture for mono-omics data analysis of the Cancer Genome Atlas Breast Cancer datasets in our analytical investigation. The following classifiers were used: random forest, partial least squares, Naive Bayes, decision trees, neural networks, and Lasso regularization. They were used and evaluated using the area under the curve metric. The random forest classifier and the Lasso regularization classifier achieved the highest area under the curve values of 0.99 each. These areas under the curve values were obtained using the mono-omics data employed in this investigation. The random forest and Lasso regularization classifiers achieved the maximum prediction accuracy, showing that they are appropriate for this problem. For all mono-omics classification models used in this paper, random forest and Lasso regression offer the best results for all metrics (precision, recall, and F1 score). The integration of various risk factors in breast cancer prediction modeling can aid in early diagnosis and treatment, utilizing data collection, storage, and intelligent systems for disease management. The integration of diverse risk factors in breast cancer prediction modeling holds promise for early diagnosis and treatment. Leveraging data collection, storage, and intelligent systems can further enhance disease management strategies, ultimately contributing to improved patient outcomes.</p>","PeriodicalId":8109,"journal":{"name":"Arabian Journal for Science and Engineering","volume":"2 1","pages":""},"PeriodicalIF":2.9000,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Arabian Journal for Science and Engineering","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.1007/s13369-024-09341-7","RegionNum":4,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"Multidisciplinary","Score":null,"Total":0}
引用次数: 0

Abstract

Cancer is a complicated disease that produces deregulatory changes in cellular activities (such as proteins). Data from these levels must be integrated into multi-omics analyses to better understand cancer and its progression. Deep learning approaches have recently helped with multi-omics analysis of cancer data. Breast cancer is a prevalent form of cancer among women, resulting from a multitude of clinical, lifestyle, social, and economic factors. The goal of this study was to predict breast cancer using several machine learning methods. We applied the architecture for mono-omics data analysis of the Cancer Genome Atlas Breast Cancer datasets in our analytical investigation. The following classifiers were used: random forest, partial least squares, Naive Bayes, decision trees, neural networks, and Lasso regularization. They were used and evaluated using the area under the curve metric. The random forest classifier and the Lasso regularization classifier achieved the highest area under the curve values of 0.99 each. These areas under the curve values were obtained using the mono-omics data employed in this investigation. The random forest and Lasso regularization classifiers achieved the maximum prediction accuracy, showing that they are appropriate for this problem. For all mono-omics classification models used in this paper, random forest and Lasso regression offer the best results for all metrics (precision, recall, and F1 score). The integration of various risk factors in breast cancer prediction modeling can aid in early diagnosis and treatment, utilizing data collection, storage, and intelligent systems for disease management. The integration of diverse risk factors in breast cancer prediction modeling holds promise for early diagnosis and treatment. Leveraging data collection, storage, and intelligent systems can further enhance disease management strategies, ultimately contributing to improved patient outcomes.

Abstract Image

基于多组学的机器学习用于乳腺癌亚型分类
癌症是一种复杂的疾病,会导致细胞活动(如蛋白质)发生脱节变化。必须将这些层面的数据整合到多组学分析中,才能更好地了解癌症及其进展。最近,深度学习方法为癌症数据的多组学分析提供了帮助。乳腺癌是女性中的一种常见癌症,由多种临床、生活方式、社会和经济因素导致。本研究的目标是使用多种机器学习方法预测乳腺癌。我们在分析调查中应用了癌症基因组图谱乳腺癌数据集的单组学数据分析架构。我们使用了以下分类器:随机森林、偏最小二乘、奈夫贝叶斯、决策树、神经网络和拉索正则化。我们使用曲线下面积指标对这些分类器进行了评估。随机森林分类器和 Lasso 正则化分类器的曲线下面积值最高,均为 0.99。这些曲线下面积值是使用本研究中使用的单组学数据获得的。随机森林分类器和 Lasso 正则化分类器的预测准确率最高,表明它们适用于这一问题。在本文使用的所有单组学分类模型中,随机森林和拉索回归在所有指标(精确度、召回率和 F1 分数)上都取得了最佳结果。在乳腺癌预测模型中整合各种风险因素有助于早期诊断和治疗,利用数据收集、存储和智能系统进行疾病管理。在乳腺癌预测建模中整合各种风险因素,有望实现早期诊断和治疗。利用数据收集、存储和智能系统可以进一步加强疾病管理策略,最终有助于改善患者的治疗效果。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Arabian Journal for Science and Engineering
Arabian Journal for Science and Engineering 综合性期刊-综合性期刊
CiteScore
5.20
自引率
3.40%
发文量
0
审稿时长
4.3 months
期刊介绍: King Fahd University of Petroleum & Minerals (KFUPM) partnered with Springer to publish the Arabian Journal for Science and Engineering (AJSE). AJSE, which has been published by KFUPM since 1975, is a recognized national, regional and international journal that provides a great opportunity for the dissemination of research advances from the Kingdom of Saudi Arabia, MENA and the world.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信