Construction of Prognostic Prediction Models for Colorectal Cancer Based on Ferroptosis-Related Genes: A Multi-Dataset and Multi-Model Analysis.

IF 2.3 Q3 ENGINEERING, BIOMEDICAL
Biomedical Engineering and Computational Biology Pub Date : 2024-11-02 eCollection Date: 2024-01-01 DOI:10.1177/11795972241293516
Tao Gan, Xiaomeng Wei, Yuanhao Xing, Zhili Hu
Volume 15, article 11795972241293516. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11531666/pdf/
Citations: 0

Abstract

Background: Colorectal cancer (CRC) remains a significant health burden globally, necessitating a deeper understanding of its molecular landscape and prognostic markers. This study characterized ferroptosis-related genes (FRGs) to construct models for predicting overall survival (OS) across various CRC datasets.

Methods: In the TCGA-COAD dataset, differentially expressed genes (DEGs) between tumor and normal tissues were identified using the DESeq2 package. Prognostic genes associated with OS, disease-specific survival, and progression-free interval were identified using the survival package. FRGs were downloaded from the FerrDb website and categorized as unclassified, marker, or driver genes. Finally, multiple models (CoxBoost, Elastic Net, Gradient Boosting Machine, LASSO Regression, Partial Least Squares Regression for Cox Regression, Ridge Regression, Random Survival Forest [RSF], stepwise Cox Regression, Supervised Principal Components analysis, and Support Vector Machines) were used to predict OS across multiple datasets (TCGA-COAD, GSE103479, GSE106584, GSE17536, GSE17537, GSE29621, GSE39084, GSE39582, and GSE72970), using the genes at the intersection of the DEGs, the genes associated with OS, disease-specific survival, and progression-free interval, and the FRG categories.
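
The gene-selection step described in the Methods amounts to a set intersection. The following is a minimal Python sketch of that idea only, not the authors' code (the abstract names R packages, and no implementation is given). The function name `intersect_gene_sets`, the dictionary keys, and the `GENE_*` identifiers are hypothetical placeholders, and treating the FRG categories as a single combined list is an assumption.

```python
from functools import reduce


def intersect_gene_sets(gene_sets):
    """Return the sorted genes that appear in every named gene set."""
    if not gene_sets:
        return []
    common = reduce(set.intersection, (set(genes) for genes in gene_sets.values()))
    return sorted(common)


# Toy, made-up gene lists (placeholders, NOT the study's data).
toy_sets = {
    "deg": {"GENE_A", "GENE_B", "GENE_C", "GENE_D"},  # differentially expressed
    "os": {"GENE_A", "GENE_B", "GENE_D"},             # associated with OS
    "dss": {"GENE_A", "GENE_B", "GENE_E"},            # disease-specific survival
    "pfi": {"GENE_A", "GENE_B", "GENE_C"},            # progression-free interval
    "frg": {"GENE_A", "GENE_B", "GENE_F"},            # ferroptosis-related (assumed combined)
}

shared = intersect_gene_sets(toy_sets)
print(shared)
```

Only genes present in all five toy lists survive, which mirrors how a small panel can emerge from much larger candidate lists.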

Results: Six intersection genes (ASNS, TIMP1, H19, CDKN2A, HOTAIR, and ASMTL-AS1) were identified; all were upregulated in tumor tissues and associated with poor survival outcomes. In the TCGA-COAD dataset, the RSF model achieved the highest concordance index. Kaplan-Meier analysis revealed significantly lower OS probabilities in the high-risk groups identified by the RSF model. The RSF model exhibited high accuracy, with AUC values of 0.978, 0.985, and 0.965 for 1-, 3-, and 5-year survival predictions, respectively. Calibration curves demonstrated excellent agreement between predicted and observed survival probabilities. Decision curve analysis confirmed the clinical utility of the RSF model. Additionally, the model's performance was validated in the GSE29621 dataset.
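
The concordance index used to rank the models can be illustrated with a short pure-Python sketch of Harrell's C-index. This is an assumed, simplified formulation (pairs with tied survival times are skipped), not the implementation used in the study, and the survival times, event flags, and risk scores below are made-up toy values.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i has an observed event and
    a strictly shorter time than subject j. The pair is concordant when
    subject i also has the higher risk score; tied scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up): follow-up time, event flag (1 = death), risk score.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.4, 0.1]

c_index = concordance_index(times, events, risk)
print(c_index)
```

A value of 1.0 means every comparable pair is ordered correctly, 0.5 corresponds to random ranking, and values below 0.5 indicate systematically inverted risk scores.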

Conclusions: The study underscores the prognostic relevance of the six intersection genes in CRC, providing insights into potential therapeutic targets and biomarkers for patient stratification. The RSF model demonstrates robust predictive performance, suggesting its utility in clinical risk assessment and personalized treatment strategies.
