Rag2Mol: structure-based drug design based on retrieval augmented generation.

IF 6.8 | Zone 2, Biology | JCR Q1, Biochemical Research Methods
Peidong Zhang, Xingang Peng, Rong Han, Ting Chen, Jianzhu Ma
Journal: Briefings in Bioinformatics, 26(3). Published: 2025-05-01. Publication type: Journal Article.
DOI: 10.1093/bib/bbaf265 (https://doi.org/10.1093/bib/bbaf265)
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12159289/pdf/
Citation count: 0

Abstract

Artificial intelligence (AI) has brought tremendous progress to drug discovery, yet identifying hit and lead compounds with optimal physicochemical and pharmacological properties remains a significant challenge. Structure-based drug design (SBDD) has emerged as a promising paradigm, but inherent data biases and neglect of synthetic accessibility leave SBDD models disconnected from practical drug discovery. In this work, we explore two methodologies, Rag2Mol-G and Rag2Mol-R, both based on retrieval-augmented generation, for designing small molecules that fit a 3D pocket. The two methods either search the database for purchasable small molecules similar to the generated ones, or create new molecules that fit a 3D pocket from those in the database. Experimental results demonstrate that Rag2Mol methods consistently produce drug candidates with superior binding affinities and drug-likeness. We find that Rag2Mol-R provides broader coverage of the chemical landscape and more precise targeting capability than advanced virtual screening models. Notably, both workflows identified promising inhibitors for the challenging target protein tyrosine phosphatase PTPN2, which was once considered undruggable and still lacks inhibitors that have completed full clinical trials. Our highly extensible framework can integrate diverse SBDD methods, marking a significant advance in AI-driven SBDD.
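The retrieval step mentioned in the abstract (finding purchasable molecules similar to a generated one) can be illustrated with a toy sketch. The abstract does not say which similarity measure or molecular representation Rag2Mol uses, so everything below is an assumption for illustration only: molecules are represented as sets of "on" fingerprint bits, similarity is the Tanimoto coefficient, and the names `tanimoto`, `retrieve_similar`, `purchasable`, and `generated` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def retrieve_similar(
    query: FrozenSet[int],
    database: Dict[str, FrozenSet[int]],
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the top_k database entries ranked by similarity to the query."""
    scored = [(name, tanimoto(query, bits)) for name, bits in database.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy data: fingerprint bits are made up, not real molecules.
purchasable = {
    "mol_A": frozenset({1, 2, 3, 4}),
    "mol_B": frozenset({1, 2, 7}),
    "mol_C": frozenset({8, 9}),
}
generated = frozenset({1, 2, 3, 5})

hits = retrieve_similar(generated, purchasable)
for name, score in hits:
    print(f"{name}: {score:.2f}")
```

In a real setting the fingerprints would come from a cheminformatics toolkit and the database would be a catalogue of purchasable compounds; the ranking logic itself would stay the same under these assumptions.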

Source journal: Briefings in Bioinformatics
Category: Biology, Biochemical Research Methods
CiteScore: 13.20
Self-citation rate: 13.70%
Articles published: 549
Review time: 6 months
About the journal: Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data. The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.