Identification of molecular signatures and pathways of obese breast cancer gene expression data by a machine learning algorithm

Betul Comertpay, E. Gov
Journal of Translational Genetics and Genomics, published 2022-01-01. DOI: 10.20517/jtgg.2021.44 (https://doi.org/10.20517/jtgg.2021.44)
Citations: 4

Abstract

Aim: The obesity epidemic is currently one of the greatest threats to human health, and obesity affects survival in patients with breast cancer. However, the key biomarkers of obesity-related breast cancer risk are still not well known. Using machine learning to identify the most appropriate features in obesity-associated breast cancer patients may therefore improve the predictive accuracy and interpretability of regression models. Methods: In the present study, we identified 23 differentially expressed genes (DEGs) from the GSE24185 transcriptome dataset. Seed genes were identified from the DEGs, the co-expression network genes, and the hub genes of the protein-protein interaction network. Pathway enrichment analysis was performed for the DEGs. A ridge-penalized regression model was fitted using the P-values of the enriched pathways and the seed gene-pathway association scores to obtain the most relevant molecular signatures. The penalized models were fitted using 10-fold cross-validation. Results: Angiotensin II receptor type 1 (AGTR1), cyclin D1 (CCND1), glutamate ionotropic receptor AMPA type subunit 2 (GRIA2), interleukin-6 cytokine family signal transducer (IL6ST), matrix metallopeptidase 9 (MMP9), and protein kinase cAMP-dependent type II regulatory subunit beta (PRKAR2B) were identified as candidate molecular signatures of obese patients with breast cancer. In addition, RAF-independent MAPK1/3 activation, collagen degradation, bladder cancer, drug metabolism-cytochrome P450, and signaling by Hedgehog pathways in cancer were primarily associated with obesity-associated breast cancer. Conclusion: These genes may be used for risk analysis of disease progression in obese patients with breast cancer. The corresponding genes and pathways should be validated in experimental studies.
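The ridge regression step with 10-fold cross-validation described in the Methods can be illustrated with a minimal sketch. The abstract does not state the software used or how the inputs were encoded, so everything below is an assumption made for illustration: rows stand for enriched pathways, columns for seed genes, the predictors are gene-pathway association scores, and the response plays the role of a transformed pathway P-value. The data are synthetic, and the names gene_0 to gene_5, ridge_fit, cv_mse, and select_lambda are hypothetical.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ridge_fit(X, y, lam):
    """Return (intercept, coefs) minimizing ||y - b0 - X b||^2 + lam * ||b||^2."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Center the data so the intercept is not penalized.
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Normal equations: (Xc^T Xc + lam * I) b = Xc^T yc
    A = [[sum(Xc[i][j] * Xc[i][k] for i in range(n)) + (lam if j == k else 0.0)
          for k in range(p)] for j in range(p)]
    b = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    coefs = solve(A, b)
    intercept = y_mean - sum(c * m for c, m in zip(coefs, x_mean))
    return intercept, coefs

def predict(model, X):
    b0, coefs = model
    return [b0 + sum(c * v for c, v in zip(coefs, row)) for row in X]

def cv_mse(X, y, lam, k=10):
    """Mean squared prediction error at penalty lam, estimated with k folds."""
    n = len(X)
    total = 0.0
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        model = ridge_fit([X[i] for i in train_idx], [y[i] for i in train_idx], lam)
        preds = predict(model, [X[i] for i in test_idx])
        total += sum((p - y[i]) ** 2 for p, i in zip(preds, test_idx))
    return total / n

def select_lambda(X, y, lambdas, k=10):
    """Pick the penalty with the lowest cross-validated error."""
    return min(lambdas, key=lambda lam: cv_mse(X, y, lam, k))

# Synthetic stand-in data (NOT from the study): 40 "pathways" x 6 "seed genes".
rng = random.Random(0)
genes = [f"gene_{j}" for j in range(6)]
X = [[rng.random() for _ in genes] for _ in range(40)]
true_weights = [2.0, 0.0, -1.0, 0.0, 0.5, 0.0]
y = [sum(w * v for w, v in zip(true_weights, row)) + rng.gauss(0, 0.1) for row in X]

lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]
best_lambda = select_lambda(X, y, lambdas, k=10)
_, coefs = ridge_fit(X, y, best_lambda)
# Rank the hypothetical genes by the absolute size of their ridge coefficient.
ranking = sorted(zip(genes, coefs), key=lambda t: -abs(t[1]))
```

Ridge regression shrinks coefficients toward zero without setting any of them exactly to zero, so in a setup like this one candidate signatures would be read off the coefficient ranking rather than from a sparse selection. How the authors actually derived their final gene list from the fitted model is not stated in the abstract.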