A prognostic model for gastric cancer constructed by multiple machine learning algorithms

IF 2.2 | Zone 4 (Biology) | JCR Q3 (Cell Biology)
Xueli Yang, Xu Huang, Wang Ying, Tao Deng, Jun Zhang, Qianshan Ding
Journal of Molecular Histology 56(6). Published 2025-10-14. Publication type: Journal Article.
DOI: 10.1007/s10735-025-10629-7
https://link.springer.com/article/10.1007/s10735-025-10629-7
Citations: 0

Abstract

A prognostic model for gastric cancer constructed by multiple machine learning algorithms

Gastric cancer (GC) is a highly heterogeneous disease that requires highly accurate prognostic models. Machine learning is a powerful tool for identifying predictive biomarkers and developing prognostic models. Here, we aim to integrate bioinformatics and machine learning algorithms to construct a risk model that predicts the prognosis of GC patients. Transcriptome data and clinical information of GC patients were obtained from The Cancer Genome Atlas (TCGA) database. Microarray data (GSE84437 and GSE26253) were obtained from the Gene Expression Omnibus (GEO) database. Univariate Cox regression analysis was used to screen prognostic genes. Risk genes closely related to prognosis were then screened by machine learning algorithms, and a risk score was calculated. Kaplan-Meier survival curves, time-dependent receiver operating characteristic (ROC) curves, and univariate and multivariate Cox regression analyses were used to verify the validity of the risk model. The protein expression of hub genes in GC tissues was evaluated by immunohistochemical staining. Seven hub genes (CGB5, FEM1A, MATN3, ZNF101, MARCKS, BRI3BP and APOD) were identified and correlated with GC prognosis. A high-precision risk model based on random survival forest (RSF) and generalized boosted regression modelling (GBM) was constructed using these seven hub genes. The risk model has good predictive ability for the prognosis of GC patients, and the risk score could be used as an independent prognostic factor for GC. In addition, the protein expression levels of CGB5, MATN3, MARCKS and APOD in GC tissues were significantly higher than those in normal tissues and correlated with the pathological characteristics of GC patients. The risk model composed of seven hub genes can accurately evaluate the prognosis of GC patients, which may contribute to precise and personalized treatment of GC patients.
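
As a rough illustration of the Kaplan-Meier validation step mentioned in the abstract, the sketch below splits patients into high-risk and low-risk groups and estimates a survival curve for each group. It is not the authors' pipeline: the median cutoff, the function names (`kaplan_meier`, `split_by_median`), and the toy cohort are all assumptions made for illustration. The abstract does not state how patients were grouped, and the RSF/GBM risk score itself is not reproduced here.

```python
from statistics import median


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where a death was observed.

    times  -- follow-up time per patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            # Product-limit update: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def split_by_median(risk_scores):
    """Label each patient 'high' or 'low' relative to the median risk score.

    The median cutoff is an assumption, not a detail taken from the abstract.
    """
    cutoff = median(risk_scores)
    return ["high" if s > cutoff else "low" for s in risk_scores]


# Hypothetical toy cohort (not data from the study).
risk_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
times = [5, 8, 12, 20, 25, 30, 40, 45]  # months of follow-up
events = [1, 1, 1, 0, 1, 0, 0, 0]       # 1 = death, 0 = censored

groups = split_by_median(risk_scores)
km_by_group = {}
for label in ("high", "low"):
    idx = [i for i, g in enumerate(groups) if g == label]
    km_by_group[label] = kaplan_meier(
        [times[i] for i in idx], [events[i] for i in idx]
    )

for label, curve in km_by_group.items():
    print(label, curve)
```

In practice the two curves would be compared with a log-rank test, and a maintained survival-analysis library would normally replace the hand-written estimator. The pure-Python version is used here only to keep the example self-contained.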

Source journal
Journal of Molecular Histology (Biology: Cell Biology)
CiteScore: 5.90
Self-citation rate: 0.00%
Articles published: 68
Review time: 1 month
About the journal: The Journal of Molecular Histology publishes results of original research on the localization and expression of molecules in animal cells, tissues and organs. Coverage includes studies describing novel cellular or ultrastructural distributions of molecules which provide insight into biochemical or physiological function, development, histologic structure and disease processes. Major research themes of particular interest include: Cell-Cell and Cell-Matrix Interactions; Connective Tissues; Development and Disease; Neuroscience. Please note that the Journal of Molecular Histology does not consider manuscripts dealing with the application of immunological or other probes on non-standard laboratory animal models unless the results are clearly of significant and general biological importance. The Journal of Molecular Histology publishes full-length original research papers, review articles, short communications and letters to the editors. All manuscripts are typically reviewed by two independent referees. The Journal of Molecular Histology is a continuation of The Histochemical Journal.