DrugReAlign: a multisource prompt framework for drug repurposing based on large language models

IF 4.4 | Zone 1, Biology | JCR Q1, Biology

BMC Biology | Pub Date: 2024-10-08 | DOI: 10.1186/s12915-024-02028-3

Jinhang Wei, Linlin Zhuo, Xiangzheng Fu, XiangXiang Zeng, Li Wang, Quan Zou, Dongsheng Cao

BMC Biology 22(1):226. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11463036/pdf/

Citations: 0

Abstract

Drug repurposing is a promising approach in the field of drug discovery owing to its efficiency and cost-effectiveness. Most current drug repurposing models rely on specific datasets for training, which limits their predictive accuracy and scope. The number of both market-approved and experimental drugs is vast, forming an extensive molecular space. Due to limitations in parameter size and data volume, traditional drug-target interaction (DTI) prediction models struggle to generalize well within such a broad space. In contrast, large language models (LLMs), with their vast parameter sizes and extensive training data, demonstrate certain advantages in drug repurposing tasks. In our research, we introduce a novel drug repurposing framework, DrugReAlign, based on LLMs and multi-source prompt techniques, designed to fully exploit the potential of existing drugs efficiently. Leveraging LLMs, the DrugReAlign framework acquires general knowledge about targets and drugs from extensive human knowledge bases, overcoming the data availability limitations of traditional approaches. Furthermore, we collected target summaries and target-drug space interaction data from databases as multi-source prompts, substantially improving LLM performance in drug repurposing. We validated the efficiency and reliability of the proposed framework through molecular docking and DTI datasets. Significantly, our findings suggest a direct correlation between the accuracy of LLMs' target analysis and the quality of prediction outcomes. These findings signify that the proposed framework holds the promise of inaugurating a new paradigm in drug repurposing.
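To make the multi-source prompt idea in the abstract more concrete, the following minimal Python sketch shows one way a target summary and a list of known target-drug interactions might be merged into a single LLM prompt. It is not the authors' implementation: the `Interaction` record, the `build_prompt` function, the prompt wording, the `top_k` parameter, and every placeholder value are assumptions made purely for illustration, and no LLM is actually called.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    """One known target-drug interaction record (hypothetical schema)."""
    drug: str
    note: str


def build_prompt(target_name: str, target_summary: str,
                 interactions: list[Interaction], top_k: int = 5) -> str:
    """Merge two prompt sources into one text prompt for an LLM."""
    lines = [
        f"Target: {target_name}",
        "",
        "Source 1 - target summary:",
        target_summary.strip(),
        "",
        "Source 2 - known target-drug interactions:",
    ]
    if interactions:
        lines += [f"- {i.drug}: {i.note}" for i in interactions]
    else:
        # Make the absence of interaction data explicit in the prompt.
        lines.append("- (none on record)")
    lines += [
        "",
        f"Task: analyse the target, then propose up to {top_k} existing drugs "
        "that could be repurposed for it, with a short rationale for each.",
    ]
    return "\n".join(lines)


# Placeholder inputs only; these are not real targets, drugs, or results.
prompt = build_prompt(
    "TARGET_X",
    "Placeholder summary of the target's function.",
    [Interaction("drug_A", "placeholder binding note")],
)
print(prompt)
```

Keeping each source under its own labelled heading is a design choice of this sketch: it lets one source be dropped or swapped to see how much it contributes, in the spirit of the abstract's claim that adding such sources improves LLM performance.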

Source journal

BMC Biology (Biology)

CiteScore: 7.80

Self-citation rate: 1.90%

Articles published: 260

Review time: 3 months

About the journal: BMC Biology is a broad-scope journal covering all areas of biology. Its content includes research articles, new methods, and tools. BMC Biology also publishes reviews, Q&A, and commentaries.