Cross-attention enables deep learning on limited omics-imaging-clinical data of 130 lung cancer patients.

Journal: Cell Reports Methods (IF 4.3, JCR Q1, Biochemical Research Methods)
Pub Date: 2024-07-15. Epub Date: 2024-07-08. DOI: 10.1016/j.crmeth.2024.100817
Suraj Verma, Giuseppe Magazzù, Noushin Eftekhari, Thai Lou, Alex Gilhespy, Annalisa Occhipinti, Claudio Angione

Abstract

Deep-learning tools that extract prognostic factors derived from multi-omics data have recently contributed to individualized predictions of survival outcomes. However, the limited size of integrated omics-imaging-clinical datasets poses challenges. Here, we propose two biologically interpretable and robust deep-learning architectures for survival prediction of non-small cell lung cancer (NSCLC) patients, learning simultaneously from computed tomography (CT) scan images, gene expression data, and clinical information. The proposed models integrate patient-specific clinical, transcriptomic, and imaging data and incorporate Kyoto Encyclopedia of Genes and Genomes (KEGG) and Reactome pathway information, adding biological knowledge within the learning process to extract prognostic gene biomarkers and molecular pathways. While both models accurately stratify patients into high- and low-risk groups when trained on a dataset of only 130 patients, introducing a cross-attention mechanism in a sparse autoencoder significantly improves performance, highlighting tumor regions and NSCLC-related genes as potential biomarkers and thus offering a significant methodological advancement when learning from small imaging-omics-clinical samples.
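As a rough illustration of the cross-attention idea mentioned in the abstract, the sketch below implements generic scaled dot-product cross-attention in plain Python. It is not the authors' architecture: the function names, the toy two-dimensional embeddings, and the choice to treat gene embeddings as queries and CT patch embeddings as keys and values are all assumptions made for this example. The abstract does not specify these details.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (illustrative only).

    queries: n_q x d matrix (assumed here: one row per gene embedding)
    keys:    n_k x d matrix (assumed here: one row per CT patch embedding)
    values:  n_k x d_v matrix
    Returns (attended, weights): attended is n_q x d_v, weights is n_q x n_k.
    """
    d = len(queries[0])
    scores = matmul(queries, transpose(keys))
    # Scale by sqrt(d), then normalise each query's scores over all keys.
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    attended = matmul(weights, values)
    return attended, weights


# Hypothetical toy inputs: 2 gene tokens attending over 3 image-patch tokens.
gene_tokens = [[1.0, 0.0], [0.0, 1.0]]
patch_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

attended, weights = cross_attention(gene_tokens, patch_tokens, patch_tokens)
for i, row in enumerate(weights):
    print(f"gene token {i} attention over patches:", [round(w, 3) for w in row])
```

Each row of `weights` sums to 1 and indicates how strongly one query token attends to each key token. In a multimodal setting, weights of this kind are one plausible way to highlight which image regions relate to which genes, although how the paper derives its highlighted tumor regions is not stated in the abstract.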

Source journal: Cell Reports Methods
Subject areas: Chemistry (General), Biochemistry, Genetics and Molecular Biology (General), Immunology and Microbiology (General)
CiteScore: 3.80
Self-citation rate: 0.00%
Articles published: 0
Review time: 111 days