Heterogeneous graph collaborative representation learning for drug-related microbe prediction with attentive fusion and reciprocal distillation

IF 7.6 | Zone 1, Computer Science | Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Knowledge-Based Systems Pub Date : 2025-09-25 DOI:10.1016/j.knosys.2025.114548

Yanbu Guo , Quanming Guo , Shengli Song , Yihan Wang , Jinde Cao

Knowledge-Based Systems, Volume 330, Article 114548. Full text: https://www.sciencedirect.com/science/article/pii/S0950705125015874

Citations: 0

Abstract

Microbes have significant therapeutic potential for treating diseases, underscoring the need for computational methods that screen for microbes associated with disease-related drugs. However, existing computational methods often consider either node embeddings or structural features between microbes and drugs, and they suffer from the severe class imbalance inherent in sparse association data. In this work, we propose a heterogeneous graph collaborative representation learning model that combines the merits of attentive fusion and reciprocal distillation for drug-related microbe prediction. First, we construct heterogeneous biological information graphs and meta-path-induced graphs of microbes and drugs. Then, a topological structure feature encoder extracts complex topological and semantic interaction patterns from the heterogeneous biological graphs of microbes and drugs, while an efficient transformer concurrently extracts discriminative semantic and structural information based on the graph position information of nodes. Next, a reciprocal distillation scheme mitigates the adverse effects of data imbalance and enforces distribution consistency between topological and semantic information extraction. Moreover, a dual collaborative feature fusion scheme combines graph topological features with dual meta-path-based semantic features to obtain discriminative features of microbes and drugs. Through reciprocal distillation, an efficient optimization function focuses on hard-to-classify samples of drug-related microbes via these discriminative features. Extensive experiments demonstrate that the model can handle the association sparsity problem and extract richer semantic and structural information. Case studies further indicate that the model can discover reliable candidate microbes associated with a specific drug.
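The abstract does not give formulas, so the following is only a minimal sketch of three ideas it names, under stated assumptions. It assumes that a meta-path-induced graph can be read as "two microbes are linked if they share a drug", that the objective focusing on hard-to-classify samples behaves like a standard binary focal loss, and that the distribution consistency between the topological and semantic branches can be approximated by a symmetric KL divergence. All function names (`metapath_adjacency`, `focal_loss`, `reciprocal_consistency`) and the toy data are hypothetical and are not taken from the paper.

```python
import math


def metapath_adjacency(assoc):
    """Microbe-drug-microbe graph from a 0/1 microbe x drug association matrix.

    Assumption: two microbes are connected when they share at least one drug.
    """
    n = len(assoc)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if any(a and b for a, b in zip(assoc[i], assoc[j])):
                adj[i][j] = adj[j][i] = 1
    return adj


def _clip(p, eps=1e-12):
    """Keep a probability strictly inside (0, 1) so logarithms stay finite."""
    return min(max(p, eps), 1.0 - eps)


def focal_loss(p, y, gamma=2.0):
    """Standard binary focal loss for one sample (p = predicted P(y = 1)).

    The factor (1 - p_t) ** gamma shrinks the loss of easy samples, so hard
    samples dominate training. Stand-in for the paper's unspecified objective.
    """
    p_t = _clip(p if y == 1 else 1.0 - p)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def bernoulli_kl(p, q):
    """KL divergence KL(Bernoulli(p) || Bernoulli(q))."""
    p, q = _clip(p), _clip(q)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def reciprocal_consistency(p_topo, p_sem):
    """Symmetric KL: each branch serves as the other's teacher (assumed form)."""
    return bernoulli_kl(p_topo, p_sem) + bernoulli_kl(p_sem, p_topo)


# Toy data: 3 microbes (rows) x 3 drugs (columns).
assoc = [[1, 0, 1],
         [0, 0, 1],
         [0, 1, 0]]
print(metapath_adjacency(assoc))  # [[0, 1, 0], [1, 0, 0], [0, 0, 0]]

# A confidently correct positive contributes far less than a badly missed one.
print(focal_loss(0.9, 1) < focal_loss(0.1, 1))  # True

# The consistency term vanishes when both branches agree.
print(reciprocal_consistency(0.7, 0.7))  # 0.0
```

In a sketch like this, the total training loss would plausibly be a weighted sum of the focal term on the fused prediction and the consistency term between branches; the weighting is not stated in the abstract and would have to be taken from the full paper.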

Source journal

Knowledge-Based Systems (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore

14.80

Self-citation rate

12.50%

Articles published

1245

Review time

7.8 months

About the journal: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.