MGDRGCN: A novel framework for predicting metabolite–disease connections using tripartite network and relational graph convolutional network

IF 3.1 · CAS Zone 3 (Computer Science) · JCR Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Journal of Computational Science · Pub Date: 2025-02-01 · DOI: 10.1016/j.jocs.2024.102477

Pengli Lu, Ling Li

Volume 85, Article 102477. Full text: https://www.sciencedirect.com/science/article/pii/S1877750324002709

Citations: 0

Abstract

Metabolites are fundamental to the existence of biomolecules, and numerous studies have demonstrated that uncovering the connections between metabolites and diseases can enhance our understanding of disease pathogenesis. Traditional biological methods can identify potential metabolite–disease relationships, but they often require significant human and material resources. Consequently, computational methods have emerged as a more efficient alternative. However, most computational methods rely primarily on metabolite–disease associations and rarely explore the contribution of additional biological entities. To address this issue, we propose a novel computational framework based on a metabolite–gene–disease tripartite heterogeneous network and a relational graph convolutional network (R-GCN), abbreviated as MGDRGCN. Specifically, we construct three types of similarity networks from multiple data sources: metabolite and gene functional similarity, disease semantic similarity, and Gaussian interaction profile kernel similarity for metabolites and diseases. Next, we use principal component analysis to further extract features and construct a tripartite heterogeneous network with genes as the bridge. This network structure comprehensively captures and represents the complex relationships among metabolites, genes and diseases. We employ R-GCN to extract higher-order information from the tripartite heterogeneous network. Finally, we input the embeddings learned by R-GCN into a residual network classifier to predict potential metabolite–disease associations. In five-fold cross-validation experiments, MGDRGCN exhibits outstanding performance, with both AUC (0.9866) and AUPR (0.9865) significantly surpassing those of other advanced methods. Additionally, case studies further demonstrate MGDRGCN’s superior performance in predicting metabolite–disease associations. Overall, MGDRGCN provides new perspectives and methods for future biomedical research, offering promising potential for uncovering the mechanisms of complex biological systems.
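The abstract names Gaussian interaction profile (GIP) kernel similarity as one of the inputs but gives no formula. The sketch below shows the commonly used form of that kernel, K(i, j) = exp(-γ‖a_i − a_j‖²), with γ normalised by the mean squared norm of the interaction profiles. It is an illustration only: the function name `gip_kernel_similarity`, the `gamma_prime` parameter, its default of 1.0, and the toy association matrix are all assumptions, not details taken from the paper.

```python
import numpy as np


def gip_kernel_similarity(assoc: np.ndarray, gamma_prime: float = 1.0) -> np.ndarray:
    """Gaussian interaction profile kernel similarity between the rows of `assoc`.

    assoc[i, j] is 1 if row entity i (e.g. a metabolite) has a known
    association with column entity j (e.g. a disease), else 0.
    `gamma_prime` is a hypothetical bandwidth parameter (assumed, not from the paper).
    """
    profiles = np.asarray(assoc, dtype=float)
    sq_norms = np.sum(profiles ** 2, axis=1)
    mean_sq_norm = sq_norms.mean()
    if mean_sq_norm == 0:
        # No known associations at all: every profile is the zero vector,
        # so all pairs are identical and the similarity is 1 everywhere.
        return np.ones((profiles.shape[0], profiles.shape[0]))
    # Normalise the bandwidth by the average squared profile norm.
    gamma = gamma_prime / mean_sq_norm
    # Pairwise squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * sq_dists)


# Toy metabolite-by-disease association matrix (made-up data for illustration).
toy_assoc = np.array([
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
])
metabolite_sim = gip_kernel_similarity(toy_assoc)    # rows: metabolites
disease_sim = gip_kernel_similarity(toy_assoc.T)     # rows: diseases
```

Calling the function on the association matrix gives metabolite similarity, and calling it on the transpose gives disease similarity. Entities with identical association profiles receive a similarity of 1, and similarity decays as the profiles diverge.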


Source journal

Journal of Computational Science (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; COMPUTER SCIENCE, THEORY & METHODS)

CiteScore: 5.50
Self-citation rate: 3.00%
Articles published: 227
Review time: 41 days

About the journal: Computational science is a rapidly growing multi- and interdisciplinary field that uses advanced computing and data analysis to understand and solve complex problems. It has reached a level of predictive capability that now firmly complements the traditional pillars of experimentation and theory. Recent advances in experimental techniques, such as detectors, on-line sensor networks and high-resolution imaging, have opened up new windows into physical and biological processes at many levels of detail. The resulting data explosion allows for detailed data-driven modeling and simulation. This new discipline combines computational thinking, modern computational methods, devices and collateral technologies to address problems far beyond the scope of traditional numerical methods. Computational science typically unifies three distinct elements:
• Modeling, algorithms and simulations (e.g., numerical and non-numerical, discrete and continuous);
• Software developed to solve problems in science (e.g., biological, physical, and social), engineering, medicine, and the humanities;
• Computer and information science that develops and optimizes the advanced system hardware, software, networking, and data management components (e.g., problem-solving environments).