Large language models enable tumor-type classification and localization of cancers of unknown primary from genomic data
Jilei Liu, Meng Yang, Yajing Bi, Junqing Zhang, Yichen Yang, Yang Li, Hongru Shen, Kexin Chen, Xiangchun Li
Cell Reports Medicine, 102332 (Journal Article)
DOI: https://doi.org/10.1016/j.xcrm.2025.102332
Published: 2025-09-16 (Epub 2025-09-04)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490231/pdf/
Impact factor: 10.6; JCR: Q1 (Cell Biology)
Citations: 0
Abstract
Tumor-type classification is critical for effective cancer treatment, yet current methods based on genomic alterations lack flexibility and have limited performance. Here, we introduce OncoChat, an artificial intelligence (AI) model designed to classify 69 tumor types by integrating diverse genomic alterations. Developed on genomic data from 158,836 tumors sequenced with targeted cancer gene panels, OncoChat demonstrates superior performance, achieving a micro-averaged precision-recall area under the curve (PRAUC) of 0.810 (95% confidence interval [CI], 0.803-0.816), accuracy of 0.774, and an F1 score of 0.756, outperforming baseline methods. In a cancer of unknown primary (CUP) dataset of 26 cases whose types were subsequently confirmed, OncoChat correctly identified 22 cases. In two larger CUP datasets (n = 719 and 158), tumor types predicted by OncoChat were associated with survival outcomes and mutation profiles consistent with those of known tumor types. OncoChat offers promising potential for clinical decision support, particularly in managing patients with CUP.
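The abstract reports a micro-averaged PRAUC and an accuracy but does not say how they were computed. The following is a minimal sketch of one common approach, assuming PRAUC is estimated by average precision over all pooled (sample, class) pairs. The function names and the toy labels and scores are hypothetical illustrations, not the authors' code or data.

```python
from typing import List, Sequence, Tuple


def micro_average_precision(labels: Sequence[int],
                            scores: Sequence[Sequence[float]]) -> float:
    """Pool every (sample, class) pair, rank by score, and average the
    precision observed at each true-positive hit (average precision)."""
    pooled: List[Tuple[float, int]] = []
    for label, row in zip(labels, scores):
        for cls, score in enumerate(row):
            pooled.append((score, 1 if cls == label else 0))
    # Highest scores first; tied scores keep their input order
    # (this sketch has no special tie handling).
    pooled.sort(key=lambda pair: pair[0], reverse=True)

    total_positives = sum(is_pos for _, is_pos in pooled)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_pos) in enumerate(pooled, start=1):
        if is_pos:
            true_positives += 1
            precision_sum += true_positives / rank  # precision at this hit
    return precision_sum / total_positives


def accuracy(labels: Sequence[int],
             scores: Sequence[Sequence[float]]) -> float:
    """Fraction of samples whose top-scoring class equals the true label."""
    hits = 0
    for label, row in zip(labels, scores):
        predicted = max(range(len(row)), key=lambda c: row[c])
        hits += int(predicted == label)
    return hits / len(labels)


# Hypothetical toy example: 4 tumors, 3 tumor types (not data from the paper).
toy_labels = [0, 1, 2, 1]
toy_scores = [
    [0.70, 0.20, 0.10],
    [0.15, 0.80, 0.05],
    [0.30, 0.25, 0.45],
    [0.50, 0.38, 0.12],
]

# Arithmetic from the abstract: 22 of 26 confirmed CUP cases were correct.
cup_hit_rate = 22 / 26

if __name__ == "__main__":
    print(round(micro_average_precision(toy_labels, toy_scores), 4))
    print(accuracy(toy_labels, toy_scores))
    print(round(cup_hit_rate, 3))
```

On the toy inputs, the pooled ranking places the four true labels at ranks 1, 2, 4, and 5, so the micro-averaged average precision is about 0.89 and the accuracy is 0.75. The 22 of 26 correctly identified CUP cases correspond to a hit rate of roughly 0.846. The abstract does not state which averaging was used for its F1 score, so that metric is left out of the sketch.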
Cell Reports Medicine
Biochemistry, Genetics and Molecular Biology - Biochemistry, Genetics and Molecular Biology (all)
CiteScore
15.00
Self-citation rate
1.40%
Articles published
231
Review time
40 days
Journal description:
Cell Reports Medicine is an esteemed open-access journal by Cell Press that publishes groundbreaking research in translational and clinical biomedical sciences, influencing human health and medicine.
Our journal ensures wide visibility and accessibility, reaching scientists and clinicians across various medical disciplines. We publish original research that spans from intriguing human biology concepts to all aspects of clinical work. We encourage submissions that introduce innovative ideas, forging new paths in clinical research and practice. We also welcome studies that provide vital information, enhancing our understanding of current standards of care in diagnosis, treatment, and prognosis. This encompasses translational studies, clinical trials (including long-term follow-ups), genomics, biomarker discovery, and technological advancements that contribute to diagnostics, treatment, and healthcare. Additionally, studies based on vertebrate model organisms are within the scope of the journal, as long as they directly relate to human health and disease.