Yang Liu, Rongbo Shen, Lu Zhou, Qingyu Xiao, Jiao Yuan, Yixue Li
Briefings in Bioinformatics, volume 26, issue 4. Published 2025-07-02. Journal Article.
DOI: 10.1093/bib/bbaf312 (https://doi.org/10.1093/bib/bbaf312)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12245162/pdf/
A data-intelligence-intensive bioinformatics copilot system for large-scale omics research and scientific insights.
Advancements in high-throughput sequencing technologies and artificial intelligence (AI) offer unprecedented opportunities for groundbreaking discoveries in bioinformatics research. However, the challenges of exponential growth of omics data and the rapid development of AI technologies require automated big biological data analysis capability and interdisciplinary knowledge-driven scientific insight. Here, we propose a data-intelligence-intensive bioinformatics copilot (Bio-Copilot) system that synergizes AI capabilities with human researchers to facilitate hypothesis-free exploratory research and inspire novel scientific insights in large-scale omics studies. Bio-Copilot forms high-quality intensive intelligence through close collaboration between multiple agents, driven by large language models (LLMs), and human researchers. To augment the capabilities of Bio-Copilot, this study devises an agent group management strategy, an effective human-agent interaction mechanism, a shared interdisciplinary knowledge database, and continuous learning strategies for the agents. We comprehensively compare Bio-Copilot against GPT-4o and several leading AI agents across diverse bioinformatics tasks, using a broad range of evaluation metrics. Bio-Copilot achieves overall state-of-the-art performance across all tasks, while showcasing exceptional task completeness. Furthermore, on application to constructing a large-scale human lung cell atlas, Bio-Copilot not only reproduces the intricate data integration process detailed in a seminal study but also introduces a recursive, multilevel annotation strategy to capture the continuous nature of cellular states and uncovers the characteristics of rare cell types, highlighting its potential to unravel hidden complexities in biological systems. Beyond the technical achievements, this study also underscores the profound implications of integrating AI capabilities with expert knowledge in accelerating impactful biological discoveries and exploring uncharted territories.
About the journal:
Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data.
The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.