Adaptive multi-view learning method for enhanced drug repurposing using chemical-induced transcriptional profiles, knowledge graphs, and large language models.

Impact factor: 8.9

Journal of Pharmaceutical Analysis. Pub Date: 2025-06-01. Epub Date: 2025-03-21. DOI: 10.1016/j.jpha.2025.101275

Yudong Yan, Yinqi Yang, Zhuohao Tong, Yu Wang, Fan Yang, Zupeng Pan, Chuan Liu, Mingze Bai, Yongfang Xie, Yuefei Li, Kunxian Shu, Yinghong Li

Volume 15, issue 6, article 101275. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12268076/pdf/

Citations: 0

Abstract

Drug repurposing offers a promising alternative to traditional drug development: by identifying new therapeutic uses for existing drugs, it significantly reduces costs and timelines. However, current approaches often rely on limited data sources and simplistic hypotheses, which restrict their ability to capture the multi-faceted nature of biological systems. This study introduces adaptive multi-view learning (AMVL), a novel methodology that integrates chemical-induced transcriptional profiles (CTPs), knowledge graph (KG) embeddings, and large language model (LLM) representations to enhance drug repurposing predictions. AMVL incorporates an innovative similarity matrix expansion strategy and leverages multi-view learning (MVL), matrix factorization, and ensemble optimization techniques to integrate heterogeneous multi-source data. Comprehensive evaluations on benchmark datasets (Fdataset, Cdataset, and Ydataset) and the large-scale iDrug dataset demonstrate that AMVL outperforms state-of-the-art (SOTA) methods, achieving superior accuracy in predicting drug-disease associations across multiple metrics. Literature-based validation further confirmed the model's predictive capabilities, with seven of the top ten predictions corroborated by post-2011 evidence. To promote transparency and reproducibility, all data and code used in this study were open-sourced, providing resources for processing CTPs and for KG- and LLM-based similarity calculations, along with the complete AMVL algorithm and benchmarking procedures. By unifying diverse data modalities, AMVL offers a robust and scalable solution for accelerating drug discovery, fostering advancements in translational medicine, and integrating multi-omics data. We aim to inspire further innovations in multi-source data integration and support the development of more precise and efficient strategies for advancing drug discovery and translational medicine.
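The abstract does not spell out the AMVL algorithm, so the following is only a minimal sketch of the general idea of multi-view similarity fusion, not the authors' method. It assumes each view (CTP, KG, LLM) supplies one feature vector per drug, builds a cosine similarity matrix per view, averages the views with fixed weights, and scores drug-disease pairs by a simple neighbor-weighted vote. That vote is a stand-in for the matrix factorization and ensemble optimization the paper names. All function names, array shapes, and the random data are hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(features):
    """Row-wise cosine similarity for an (n_items, n_features) array."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)  # guard against zero rows
    return unit @ unit.T


def fuse_views(similarities, weights=None):
    """Weighted average of same-shaped similarity matrices, one per view."""
    stacked = np.stack(similarities)
    if weights is None:
        weights = np.ones(len(similarities))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    return np.tensordot(weights, stacked, axes=1)


def score_by_neighbors(fused_similarity, associations):
    """Score drug-disease pairs from similar drugs' known associations."""
    sim = fused_similarity.copy()
    np.fill_diagonal(sim, 0.0)  # a drug should not vote for itself
    row_sums = np.clip(np.abs(sim).sum(axis=1, keepdims=True), 1e-12, None)
    return (sim / row_sums) @ associations


# Hypothetical toy data: 5 drugs, 3 diseases, three feature views per drug.
rng = np.random.default_rng(0)
n_drugs, n_diseases = 5, 3
ctp_features = rng.normal(size=(n_drugs, 8))  # stand-in for CTP signatures
kg_features = rng.normal(size=(n_drugs, 4))   # stand-in for KG embeddings
llm_features = rng.normal(size=(n_drugs, 6))  # stand-in for LLM embeddings
known = (rng.random((n_drugs, n_diseases)) < 0.3).astype(float)

views = [cosine_similarity_matrix(x)
         for x in (ctp_features, kg_features, llm_features)]
fused = fuse_views(views)
scores = score_by_neighbors(fused, known)
```

Here `scores[i, j]` is a rough ranking signal for drug `i` against disease `j`. A learned, per-view weighting in place of the fixed average would be closer to the "adaptive" idea, but how AMVL learns such weights is not described in the abstract.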

