Prognostic prediction of gastric cancer based on H&E findings and machine learning pathomics

IF 2.3, Zone 3 (Biology), JCR Q3 (Biochemical Research Methods)
Guoda Han, Xu Liu, Tian Gao, Lei Zhang, Xiaoling Zhang, Xiaonan Wei, Yecheng Lin, Bohong Yin
Molecular and Cellular Probes, volume 78, Article 101983. Journal Article. Published 2024-09-30.
DOI: 10.1016/j.mcp.2024.101983
https://www.sciencedirect.com/science/article/pii/S0890850824000355
Citations: 0

Abstract

Aim

In this research, we aimed to develop a model for accurate prognostic prediction in gastric cancer based on H&E findings combined with machine learning pathomics.

Methods

Transcriptome data, pathological images, and clinical data from 443 cases were retrieved from The Cancer Genome Atlas (TCGA) for survival analysis. The images were segmented using the Otsu algorithm, and features were extracted using the PyRadiomics package. Subsequently, the cases were randomly divided into a training cohort of 165 cases and a validation cohort of 69 cases. Features selected via minimum redundancy maximum relevance (mRMR) and recursive feature elimination (RFE) screening were used to train a model with the Gradient Boosting Machine (GBM) algorithm. The model's performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), calibration curves, and decision curves. Additionally, the correlation between the pathomics score (PS) and immune genes was examined.
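The segmentation step can be illustrated with a minimal sketch of Otsu thresholding, which picks the grey level that maximizes the between-class variance of the intensity histogram. This is not the authors' code: the function name `otsu_threshold`, the toy data in `demo_pixels`, and the convention that darker pixels (at or below the threshold) are tissue on a bright slide background are all assumptions made for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximizes between-class variance.

    Pixels with value <= t form class 0, the rest form class 1.
    `pixels` is a flat iterable of integers in [0, levels).
    """
    pixels = list(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in class 0
    sum0 = 0  # intensity sum of class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # class 0 still empty
        if w1 == 0:
            break     # class 1 empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total**2
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Hypothetical toy "image": dark tissue-like pixels and bright background pixels
demo_pixels = [20] * 50 + [30] * 50 + [200] * 60 + [220] * 40
t = otsu_threshold(demo_pixels)
tissue_mask = [p <= t for p in demo_pixels]  # assumption: tissue is darker than background
```

In practice the threshold would be computed on a greyscale or single-channel version of each slide tile, and the resulting mask would define the region from which features are extracted.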

Results

In the multivariate analysis, higher infiltration of activated CD4 memory T cells was significantly associated with improved overall survival (HR = 0.505, 95 % CI = 0.342–0.745, P < 0.001). The pathomic model achieved AUC values of 0.844 and 0.750 in the two study cohorts. Decision curve analysis (DCA) indicated that the model has clinical utility. In a subsequent multivariate analysis, a higher PS also emerged as a significant protective factor for overall survival (HR = 0.506, 95 % CI = 0.329–0.777, P = 0.002).
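The reported hazard ratios, confidence intervals, and P values can be cross-checked against each other. The sketch below assumes the intervals are Wald-type intervals on the log hazard ratio scale (as is typical for Cox regression output, although the abstract does not name the model); the helper `wald_check` is a hypothetical name introduced here.

```python
import math


def wald_check(hr, ci_low, ci_high, z_crit=1.959964):
    """Back-calculate the standard error, z statistic, and two-sided p-value
    from a hazard ratio and its 95% CI, assuming a Wald interval on log(HR)."""
    log_hr = math.log(hr)
    # The CI spans 2 * z_crit standard errors on the log scale
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values as reported in the Results
cd4_se, cd4_z, cd4_p = wald_check(0.505, 0.342, 0.745)
ps_se, ps_z, ps_p = wald_check(0.506, 0.329, 0.777)
```

Under this assumption, both back-calculated p-values agree with the reported "P < 0.001" and "P = 0.002" after rounding, so the figures are internally consistent.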

Conclusion

The pathomic model based on H&E slides for predicting the infiltration degree of activated CD4 memory T cells, along with integrated bioinformatics analysis elucidating potential molecular mechanisms, offers novel prognostic indicators for the precise stratification and individualized prognosis of gastric cancer patients.
Source journal
Molecular and Cellular Probes
Category: Biology, Biochemical Research Methods
CiteScore: 6.80
Self-citation rate: 0.00%
Articles published: 52
Review time: 16 days
Journal description: MCP (Advancing biology through -omics and bioinformatic technologies) aims to capture outcomes from the current revolution in molecular technologies and sciences. The journal has broadened its scope and embraces high-quality research papers, reviews, and opinions in areas including, but not limited to, molecular biology, cell biology, biochemistry, immunology, physiology, epidemiology, ecology, virology, microbiology, parasitology, genetics, evolutionary biology, genomics (including metagenomics), bioinformatics, proteomics, metabolomics, glycomics, and lipidomics. Submissions with a technology-driven focus on understanding normal biological or disease processes, as well as conceptual advances and paradigm shifts, are particularly encouraged. The Editors welcome fundamental or applied research; pre-submission enquiries about advanced draft manuscripts are also welcome. Top-quality research and manuscripts will be fast-tracked.