Large language models enable tumor-type classification and localization of cancers of unknown primary from genomic data.

IF 10.6 · Zone 1 (Medicine) · JCR Q1 (Cell Biology)
Cell Reports Medicine · Pub Date: 2025-09-16 · Epub Date: 2025-09-04 · DOI: 10.1016/j.xcrm.2025.102332
Jilei Liu, Meng Yang, Yajing Bi, Junqing Zhang, Yichen Yang, Yang Li, Hongru Shen, Kexin Chen, Xiangchun Li
Citations: 0

Abstract
Large language models enable tumor-type classification and localization of cancers of unknown primary from genomic data.

Tumor-type classification is critical for effective cancer treatment, yet current methods based on genomic alterations lack flexibility and have limited performance. Here, we introduce OncoChat, an artificial intelligence (AI) model designed to classify 69 tumor types by integrating diverse genomic alterations. Developed on genomic data from 158,836 tumors sequenced with targeted cancer gene panels, OncoChat demonstrates superior performance, achieving a micro-averaged precision-recall area under the curve (PRAUC) of 0.810 (95% confidence interval [CI], 0.803-0.816), accuracy of 0.774, and an F1 score of 0.756, outperforming baseline methods. In a cancer of unknown primary (CUP) dataset of 26 cases whose types were subsequently confirmed, OncoChat correctly identified 22 cases. In two larger CUP datasets (n = 719 and 158), tumor types predicted by OncoChat were associated with survival outcomes and mutation profiles consistent with those of known tumor types. OncoChat offers promising potential for clinical decision support, particularly in managing patients with CUP.
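The abstract reports a micro-averaged PRAUC, an accuracy, and a count of correctly identified CUP cases. As a rough illustration of what "micro-averaged" means here, the sketch below pools every (sample, class) pair into a single binary problem and computes average precision, which is one common estimator of the area under the precision-recall curve. This is not the authors' evaluation code: the function names, the three-class toy labels, and the toy scores are all hypothetical, and the paper may use a different PRAUC estimator. Only the 22-of-26 figure comes from the abstract.

```python
from typing import Sequence


def micro_average_precision(
    y_true: Sequence[int],
    scores: Sequence[Sequence[float]],
    n_classes: int,
) -> float:
    """Micro-averaged average precision (an assumed stand-in for PRAUC).

    Every (sample, class) pair becomes one binary example: positive if the
    class is the sample's true label, negative otherwise. Precision is then
    averaged over the ranks at which positives occur.
    """
    pooled = []
    for label, row in zip(y_true, scores):
        for cls in range(n_classes):
            pooled.append((row[cls], 1 if cls == label else 0))

    # Rank all pooled pairs from highest to lowest score.
    pooled.sort(key=lambda pair: pair[0], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, (_, is_positive) in enumerate(pooled, start=1):
        if is_positive:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def top1_accuracy(y_true: Sequence[int], scores: Sequence[Sequence[float]]) -> float:
    """Fraction of samples whose highest-scoring class is the true label."""
    correct = 0
    for label, row in zip(y_true, scores):
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(y_true)


# Hypothetical toy data: 4 tumors, 3 tumor types (the paper uses 69 types).
toy_labels = [0, 1, 2, 1]
toy_scores = [
    [0.70, 0.20, 0.10],
    [0.15, 0.80, 0.05],
    [0.30, 0.45, 0.25],  # misclassified: true type 2, top score is type 1
    [0.35, 0.61, 0.04],
]

toy_prauc = micro_average_precision(toy_labels, toy_scores, n_classes=3)
toy_accuracy = top1_accuracy(toy_labels, toy_scores)

# From the abstract: 22 of 26 confirmed CUP cases were correctly identified.
cup_hit_rate = 22 / 26

print(f"toy micro-averaged AP: {toy_prauc:.3f}")
print(f"toy top-1 accuracy:    {toy_accuracy:.3f}")
print(f"CUP hit rate (22/26):  {cup_hit_rate:.3f}")
```

Micro-averaging weights every tumor equally, so common tumor types dominate the score. A macro average, which weights every tumor type equally, would be a different design choice and can give a noticeably lower number when rare types are hard to classify.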

Source journal
Cell Reports Medicine
Subject area: Biochemistry, Genetics and Molecular Biology (all)
CiteScore: 15.00
Self-citation rate: 1.40%
Articles published: 231
Review time: 40 days
Journal description: Cell Reports Medicine is an esteemed open-access journal by Cell Press that publishes groundbreaking research in translational and clinical biomedical sciences, influencing human health and medicine. Our journal ensures wide visibility and accessibility, reaching scientists and clinicians across various medical disciplines. We publish original research that spans from intriguing human biology concepts to all aspects of clinical work. We encourage submissions that introduce innovative ideas, forging new paths in clinical research and practice. We also welcome studies that provide vital information, enhancing our understanding of current standards of care in diagnosis, treatment, and prognosis. This encompasses translational studies, clinical trials (including long-term follow-ups), genomics, biomarker discovery, and technological advancements that contribute to diagnostics, treatment, and healthcare. Additionally, studies based on vertebrate model organisms are within the scope of the journal, as long as they directly relate to human health and disease.