Survey and improvement strategies for gene prioritization with large language models.

Impact factor: 2.8 · JCR Q2 (Mathematical & Computational Biology)

Bioinformatics Advances · Pub Date: 2025-06-24 · eCollection Date: 2025-01-01 · DOI: 10.1093/bioadv/vbaf148

Matthew B Neeley, Guantong Qi, Guanchu Wang, Ruixiang Tang, Dongxue Mao, Chaozhong Liu, Sasidhar Pasupuleti, Bo Yuan, Fan Xia, Pengfei Liu, Zhandong Liu, Xia Hu

Bioinformatics Advances, volume 5, issue 1, article vbaf148. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12263109/pdf/

Citations: 0

Abstract

Motivation: Rare diseases remain difficult to diagnose due to limited patient data and genetic diversity, with many cases remaining undiagnosed despite advances in variant prioritization tools. While large language models have shown promise in medical applications, their optimal application for trustworthy and accurate gene prioritization downstream of modern prioritization tools has not been systematically evaluated.

Results: We benchmarked several language models for gene prioritization, using multi-agent and Human Phenotype Ontology classification approaches to categorize patient cases by phenotype-based solvability level. To address the limitations of language models in ranking large gene sets, we implemented a divide-and-conquer strategy with mini-batching and token limiting for improved efficiency. GPT-4 outperformed the other language models across all patient datasets, ranking causal genes more accurately. The multi-agent and Human Phenotype Ontology classification approaches effectively distinguished confidently solved cases from challenging ones. However, we observed two notable language model limitations: bias toward well-studied genes and sensitivity to input order. Our divide-and-conquer strategy improved accuracy, overcoming positional bias and the bias toward genes that appear frequently in the literature. Compared with baseline evaluation, this framework improved the overall process of identifying disease-causal genes, better enabling targeted diagnostic and therapeutic interventions and streamlining the diagnosis of rare genetic disorders.

Availability and implementation: Software and additional material are available at: https://github.com/LiuzLab/GPT-Diagnosis.
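
The divide-and-conquer idea described in the Results can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository above for that); it only shows the general pattern of splitting a large candidate list into mini-batches, ranking each batch separately, and re-ranking the survivors. The names `divide_and_conquer_rank`, `rank_fn`, `batch_size`, and `keep`, the placeholder gene labels, and the toy scorer standing in for a language model call are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

# A ranking function takes a small batch of gene names and returns them
# ordered from most to least likely causal. In practice this would wrap a
# language model call; here it is a hypothetical stand-in.
RankFn = Callable[[Sequence[str]], List[str]]


def chunk(items: Sequence[str], size: int) -> List[List[str]]:
    """Split items into consecutive mini-batches of at most `size` elements."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def divide_and_conquer_rank(
    genes: Sequence[str],
    rank_fn: RankFn,
    batch_size: int = 4,
    keep: int = 2,
) -> List[str]:
    """Rank a large gene list by repeatedly ranking small batches.

    Each round ranks every mini-batch independently and keeps only the
    top `keep` genes per batch. Rounds repeat until the remaining
    candidates fit in one batch, which is then ranked once more.
    """
    if not 1 <= keep < batch_size:
        # keep < batch_size guarantees the candidate list shrinks each round.
        raise ValueError("require 1 <= keep < batch_size")
    candidates = list(genes)
    while len(candidates) > batch_size:
        survivors: List[str] = []
        for batch in chunk(candidates, batch_size):
            survivors.extend(rank_fn(batch)[:keep])
        candidates = survivors
    return rank_fn(candidates)


# Toy data: placeholder gene labels with arbitrary, distinct scores.
GENES = [f"GENE{i:02d}" for i in range(12)]
TOY_SCORES = {gene: (i * 7) % 12 for i, gene in enumerate(GENES)}


def toy_rank(batch: Sequence[str]) -> List[str]:
    """Hypothetical stand-in for a model call: sort by a fixed toy score."""
    return sorted(batch, key=lambda g: TOY_SCORES[g], reverse=True)


final_ranking = divide_and_conquer_rank(GENES, toy_rank, batch_size=4, keep=2)
print(final_ranking)
```

With a deterministic scorer like `toy_rank`, the highest-scoring gene always survives every round because it is the top of whichever batch it lands in. A language model offers no such guarantee, which is one reason small batches may help: each call sees a short list, so input position has less room to influence the result.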


Source journal

Bioinformatics advances

CiteScore

1.60

Self-citation rate

0.00%

Articles published