A graph-based knowledge distillation framework for drug repurposing via multi-task learning
Authors: Zahra Alaeddini, Parham Moradi, Bahram Sadeghi Bigham
Journal: Engineering Applications of Artificial Intelligence, Volume 162, Article 112752 (Impact Factor 8.0; JCR Q1, Automation & Control Systems)
Published: 2025-10-17
DOI: 10.1016/j.engappai.2025.112752
URL: https://www.sciencedirect.com/science/article/pii/S0952197625027836
Citation count: 0
Abstract
Biomedical Knowledge Graphs (BKGs) capture intricate interactions between biological entities and play a crucial role in drug repurposing. However, current BKG completion methods often face challenges in scalability, predictive performance, and computational efficiency. To address these limitations, we propose a novel Graph-based Knowledge Distillation approach for Drug Repurposing via a Multi-Task Learning framework (GKDRMTL). By leveraging a teacher–student knowledge distillation strategy, our model not only enhances predictive accuracy but also substantially reduces computational demands. Compared to state-of-the-art baselines, the student model consistently demonstrates substantial efficiency gains: approximately 30–93% faster training per epoch, 75–99% lower memory usage, and 46–88% faster inference, while maintaining competitive accuracy. Evaluated on an extended HetioNet, a heterogeneous biomedical knowledge graph, GKDRMTL reached state-of-the-art results across multiple link prediction tasks, including drug–disease associations, drug–drug similarity, disease–disease similarity, and disease–gene associations. The teacher achieves near-perfect performance, with an Area Under the Receiver Operating Characteristic Curve (ROC-AUC) of 0.9889, an Area Under the Precision-Recall Curve (AUPR) of 0.9875, and an Accuracy of 0.9876. Despite its simplified architecture, the student approximates teacher performance with a ROC-AUC of 0.9739, an AUPR of 0.9704, and an Accuracy of 0.9673. These findings underscore the importance of integrating knowledge distillation with multi-task learning for efficient and high-performance biomedical link prediction. The code of the proposed method and the data are available at https://github.com/Zahra-Alaeddini/GKDRMTL.
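To make the teacher–student idea concrete, here is a minimal sketch of a generic distillation objective for binary link prediction. It is not taken from the GKDRMTL code: the function names, the weighting `alpha`, and the `temperature` parameter are illustrative assumptions, and the actual loss used in the paper may differ. The student is trained against both the ground-truth edge labels (hard targets) and the teacher's temperature-softened probabilities (soft targets).

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw link score (logit) to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def bce(p: float, target: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy between a predicted probability and a target in [0, 1]."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def distillation_loss(student_logits, teacher_logits, labels,
                      alpha=0.5, temperature=2.0):
    """Hypothetical combined loss: alpha * hard-label BCE + (1 - alpha) * soft-target BCE.

    alpha and temperature are assumed hyperparameters, not values from the paper.
    """
    n = len(labels)
    if n == 0 or len(student_logits) != n or len(teacher_logits) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    # Hard term: student predictions vs. true edge labels (1 = link exists).
    hard = sum(bce(sigmoid(s), y) for s, y in zip(student_logits, labels)) / n
    # Soft term: student vs. teacher probabilities, both softened by the temperature.
    soft = sum(
        bce(sigmoid(s / temperature), sigmoid(t / temperature))
        for s, t in zip(student_logits, teacher_logits)
    ) / n
    return alpha * hard + (1.0 - alpha) * soft
```

In a multi-task setting such as the one described above, one plausible design is to compute a loss of this form per task (for example drug–disease and disease–gene links) and sum the results, but that combination is likewise an assumption here.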
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.