Prognostic prediction of gastric cancer based on H&E findings and machine learning pathomics
Guoda Han, Xu Liu, Tian Gao, Lei Zhang, Xiaoling Zhang, Xiaonan Wei, Yecheng Lin, Bohong Yin
DOI: 10.1016/j.mcp.2024.101983
Published: 2024-09-30
https://www.sciencedirect.com/science/article/pii/S0890850824000355
Abstract
Aim
We aimed to develop a model that accurately predicts the prognosis of gastric cancer from H&E findings combined with machine-learning pathomics.
Methods
Transcriptome data, pathological images, and clinical data from 443 cases were retrieved from The Cancer Genome Atlas (TCGA) for survival analysis. The images were segmented using the Otsu algorithm, and features were extracted using the PyRadiomics package. The cases were then randomly divided into a training cohort of 165 cases and a validation cohort of 69 cases. Features selected by minimum redundancy-maximum relevance (mRMR) and recursive feature elimination (RFE) screening were used to train a Gradient Boosting Machine (GBM) model. Model performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), calibration curves, and decision curves. The correlation between the pathomics score (PS) and immune genes was also examined.
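As an illustration of the segmentation step named above, the following is a minimal pure-Python sketch of Otsu thresholding: it picks the gray level that maximizes the between-class variance of the pixel histogram. It is not the authors' implementation. The function name `otsu_threshold`, the toy pixel values, and the choice to treat the darker class as tissue (H&E-stained tissue is usually darker than the slide background) are all assumptions made for the example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class and pixels > t the other.
    `pixels` is a flat sequence of integers in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # summed intensity of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # no pixels in the lower class yet
        if w1 == 0:
            break     # upper class is empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Toy grayscale "tile" (made-up values): dark stained tissue on a bright background.
tile = [30, 35, 40, 45, 50, 220, 225, 230, 235, 240]
t = otsu_threshold(tile)
tissue_mask = [p <= t for p in tile]  # assumption: tissue is the darker class
print(t, sum(tissue_mask))
```

In practice a library routine (for example from an image-processing package) would normally replace this loop; the sketch only shows what the threshold optimizes.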
Results
In the multivariate analysis, higher infiltration of activated CD4 memory T cells was significantly associated with improved overall survival (HR = 0.505, 95% CI = 0.342–0.745, P < 0.001). The pathomic model achieved AUC values of 0.844 and 0.750 in the two study cohorts. Decision curve analysis (DCA) supported the model's clinical utility. In a subsequent multivariate analysis, a higher PS also emerged as a significant protective factor for overall survival (HR = 0.506, 95% CI = 0.329–0.777, P = 0.002).
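The reported hazard ratios, confidence intervals, and P values can be cross-checked with a short calculation. The sketch below assumes the intervals are Wald intervals, symmetric on the log scale, so that the standard error of log(HR) is the log-interval width divided by 2 × 1.96. The helper name `wald_from_hr_ci` is hypothetical, and the abstract does not state how the authors computed their P values.

```python
import math


def wald_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Recover the log-HR standard error, Wald z, and two-sided p-value.

    Assumes a 95% Wald interval: log(HR) +/- z_crit * SE.
    """
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values as reported in the Results paragraph.
for label, hr, lo, hi in [
    ("activated CD4 memory T cells", 0.505, 0.342, 0.745),
    ("pathomics score (PS)", 0.506, 0.329, 0.777),
]:
    se, z, p = wald_from_hr_ci(hr, lo, hi)
    print(f"{label}: SE={se:.3f}, z={z:.2f}, p={p:.4f}")
```

Under this assumption the recovered p-values fall below 0.001 for the T-cell infiltration estimate and at roughly 0.002 for the PS, in line with the reported figures.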
Conclusion
The H&E-based pathomic model for predicting the degree of activated CD4 memory T cell infiltration, together with integrated bioinformatics analysis of potential molecular mechanisms, offers novel prognostic indicators for stratifying gastric cancer patients and individualizing their prognosis.