GRLGRN: graph representation-based learning to infer gene regulatory networks from single-cell RNA-seq data.

IF 2.9 | Zone 3, Biology | Q2 BIOCHEMICAL RESEARCH METHODS

BMC Bioinformatics Pub Date : 2025-04-18 DOI:10.1186/s12859-025-06116-1

Kai Wang, Yulong Li, Fei Liu, Xiaoli Luan, Xinglong Wang, Jingwen Zhou

BMC Bioinformatics, volume 26, issue 1, pages 108. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12008888/pdf/

Citations: 0

Abstract

Background: A gene regulatory network (GRN) is a graph-level representation that describes the regulatory relationships between transcription factors and target genes in cells. The reconstruction of GRNs can help investigate cellular dynamics, drug design, and metabolic systems, and the rapid development of single-cell RNA sequencing (scRNA-seq) technology provides important opportunities while posing significant challenges for reconstructing GRNs. A number of methods for inferring GRNs have been proposed in recent years based on traditional machine learning and deep learning algorithms. However, inferring the GRN from scRNA-seq data remains challenging owing to cellular heterogeneity, measurement noise, and data dropout.

Results: In this study, we propose a deep learning model called graph representational learning GRN (GRLGRN) to infer latent regulatory dependencies between genes from a prior GRN and single-cell gene expression profiles. GRLGRN uses a graph transformer network to extract implicit links from the prior GRN, and encodes gene features using both the adjacency matrix of implicit links and the gene expression profile matrix. It then applies attention mechanisms to improve feature extraction and feeds the refined gene embeddings into an output module to infer gene regulatory relationships. To evaluate GRLGRN, we compared it with prevalent models and performed ablation experiments on seven cell-line datasets with three ground-truth networks. GRLGRN achieved the best AUROC on 78.6% of the datasets and the best AUPRC on 80.9%, with average improvements of 7.3% in AUROC and 30.7% in AUPRC. Interpretability analysis and network visualization were also conducted.
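The two metrics reported above can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the toy labels and the toy scores are all hypothetical, and AUPRC is approximated here by average precision, which is one common estimator. Each candidate regulator-target pair carries a label (1 if the edge is in the ground-truth network, 0 otherwise) and a predicted score.

```python
def auroc(labels, scores):
    """Probability that a random true edge outscores a random non-edge.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one true edge and one non-edge")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auprc(labels, scores):
    """Average precision: mean precision at each rank where a true edge appears.

    Assumes scores are distinct; tied scores are ranked in input order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one true edge")
    return precision_sum / hits


# Hypothetical toy data: six candidate edges, two of which are true.
toy_labels = [1, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

print(round(auroc(toy_labels, toy_scores), 4))  # 0.875
print(round(auprc(toy_labels, toy_scores), 4))  # 0.8333
```

Because true regulatory edges are usually a small fraction of all candidate gene pairs, AUPRC tends to be more sensitive than AUROC to how well the top-ranked predictions are ordered, which is one plausible reason the two reported average improvements differ in size.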

Conclusions: The experimental results and case studies illustrate the considerable performance of GRLGRN in predicting gene interactions and provide interpretability for the prediction tasks, such as identifying hub genes in the network and uncovering implicit links.
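The abstract does not say how hub genes are identified. As one simple illustration, assuming a hub is a gene with high total degree in the predicted network, the sketch below ranks genes by the number of predicted edges they take part in. The function name `hub_genes` and the gene identifiers are hypothetical.

```python
from collections import Counter


def hub_genes(edges, top_k=3):
    """Rank genes by total degree (in-degree plus out-degree).

    `edges` is an iterable of (regulator, target) pairs from a predicted GRN.
    Ties are broken alphabetically so the result is deterministic.
    """
    degree = Counter()
    for regulator, target in edges:
        degree[regulator] += 1
        degree[target] += 1
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Hypothetical predicted edges (regulator -> target).
predicted_edges = [
    ("TF1", "G1"),
    ("TF1", "G2"),
    ("TF1", "G3"),
    ("TF2", "G1"),
    ("TF2", "G3"),
]

print(hub_genes(predicted_edges, top_k=2))  # [('TF1', 3), ('G1', 2)]
```

Degree is only one possible centrality measure. Counting out-degree alone would emphasize regulators over frequently targeted genes.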


Source journal

BMC Bioinformatics (Biology: Biochemical Research Methods)

CiteScore: 5.70

Self-citation rate: 3.30%

Articles published: 506

Review time: 4.3 months

About the journal: BMC Bioinformatics is an open access, peer-reviewed journal that considers articles on all aspects of the development, testing and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology. BMC Bioinformatics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.