Accelerating primer design for amplicon sequencing using large language model-powered agents

IF 26.8 | Zone 1, Medicine | JCR Q1: ENGINEERING, BIOMEDICAL

Nature Biomedical Engineering | Pub Date: 2025-07-30 | DOI: 10.1038/s41551-025-01455-z

Yi Wang, Yuejie Hou, Lin Yang, Shisen Li, Weiting Tang, Hui Tang, Qiushun He, Siyuan Lin, Yanyan Zhang, Xingyu Li, Shiwen Chen, Yusheng Huang, Lingsong Kong, Huijun Zhang, Duncan Yu, Feng Mu, Huanming Yang, Jian Wang, Nattiya Hirankarn, Meng Yang

Citations: 0

Abstract

The pre-trained knowledge compressed in large language models is addressing diverse scientific challenges and catalysing the progression of autonomous laboratory systems, synergized with liquid handling robots. Here we introduce PrimeGen, an orchestrated multi-agent system powered by large language models, designed to streamline labour-intensive primer design tasks for targeted next-generation sequencing. PrimeGen uses GPT-4o as a central controller to engage with experimentalists for task planning and decomposition, coordinating various specialized agents to execute distinct subtasks. These include an interactive search agent for retrieving gene targets from databases, a primer agent for designing primer sequences across multiple scenarios, a protocol agent for generating executable robot scripts through retrieval-augmented generation and prompt engineering, and an experiment agent equipped with a vision language model for detecting and reporting anomalies. We experimentally demonstrate the effectiveness of PrimeGen across a variety of applications. PrimeGen can accommodate up to 955 amplicons, ensuring high amplification uniformity and minimizing dimer formation. Our development underscores the potential of collaborative agents, coordinated by generalist foundation models, as intelligent tools for advancing biomedical research.
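The controller-and-agents arrangement described in the abstract can be pictured with a minimal Python sketch. This is an illustration of the general dispatch pattern only, not PrimeGen's code: the class name `Controller`, the agent function names, the fixed plan, and all placeholder outputs are assumptions made here, and the step that would call a language model is replaced by a hard-coded list.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Each agent takes the shared context and returns new entries to merge into it.
AgentFn = Callable[[dict], dict]


def search_agent(ctx: dict) -> dict:
    # Hypothetical stand-in for database retrieval: returns placeholder target names.
    return {"targets": [f"target_{i}" for i in range(ctx.get("n_targets", 2))]}


def primer_agent(ctx: dict) -> dict:
    # Hypothetical stand-in: no real primer design happens here.
    return {"primers": {t: ("FWD_PLACEHOLDER", "REV_PLACEHOLDER") for t in ctx["targets"]}}


def protocol_agent(ctx: dict) -> dict:
    # Hypothetical stand-in for robot-script generation: one text step per target.
    return {"script": [f"transfer primers for {t}" for t in ctx["primers"]]}


def experiment_agent(ctx: dict) -> dict:
    # Hypothetical stand-in for anomaly detection: reports nothing.
    return {"anomalies": []}


AGENTS: Dict[str, AgentFn] = {
    "search": search_agent,
    "primer": primer_agent,
    "protocol": protocol_agent,
    "experiment": experiment_agent,
}


@dataclass
class Controller:
    agents: Dict[str, AgentFn]
    log: List[str] = field(default_factory=list)

    def plan(self, request: str) -> List[str]:
        # Assumption: a real controller would ask a language model to decompose
        # the request; this sketch always returns the same fixed order.
        return ["search", "primer", "protocol", "experiment"]

    def run(self, request: str, ctx: Optional[dict] = None) -> dict:
        ctx = dict(ctx or {})
        for step in self.plan(request):
            ctx.update(self.agents[step](ctx))  # merge the agent's output into the context
            self.log.append(step)               # record the order in which agents ran
        return ctx


if __name__ == "__main__":
    controller = Controller(agents=AGENTS)
    result = controller.run("design primers", {"n_targets": 3})
    print(controller.log)
    print(sorted(result))
```

Passing a shared context dictionary from one agent to the next keeps each agent independent of the others, so any one of the placeholder functions could be swapped for a real implementation without touching the controller.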

Source journal

Nature Biomedical Engineering (Medicine: Medicine (miscellaneous))

CiteScore: 45.30
Self-citation rate: 1.10%
Articles published: 138

About the journal: Nature Biomedical Engineering is an online-only monthly journal launched in January 2017. It aims to publish original research, reviews, and commentary on applied biomedicine and health technology. Its audience includes life scientists who develop experimental or computational systems and methods to improve our understanding of human physiology; biomedical researchers and engineers who design or optimize therapies, assays, devices, or procedures for diagnosing or treating disease; and clinicians who use research outputs to evaluate patient health or administer therapy in clinical settings and healthcare contexts.