GRLGRN: graph representation-based learning to infer gene regulatory networks from single-cell RNA-seq data.

IF 2.9 · Zone 3 (Biology) · JCR Q2 (Biochemical Research Methods)
Kai Wang, Yulong Li, Fei Liu, Xiaoli Luan, Xinglong Wang, Jingwen Zhou
BMC Bioinformatics 26(1):108. Published 2025-04-18. DOI: 10.1186/s12859-025-06116-1
Link: https://doi.org/10.1186/s12859-025-06116-1
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12008888/pdf/

Abstract

Background: A gene regulatory network (GRN) is a graph-level representation that describes the regulatory relationships between transcription factors and target genes in cells. The reconstruction of GRNs can help investigate cellular dynamics, drug design, and metabolic systems, and the rapid development of single-cell RNA sequencing (scRNA-seq) technology provides important opportunities while posing significant challenges for reconstructing GRNs. A number of methods for inferring GRNs have been proposed in recent years based on traditional machine learning and deep learning algorithms. However, inferring the GRN from scRNA-seq data remains challenging owing to cellular heterogeneity, measurement noise, and data dropout.

Results: In this study, we propose a deep learning model called graph representational learning GRN (GRLGRN) to infer latent regulatory dependencies between genes from a prior GRN and single-cell gene expression profiles. GRLGRN uses a graph transformer network to extract implicit links from the prior GRN, and encodes gene features using both the adjacency matrix of implicit links and the gene expression profile matrix. It then uses attention mechanisms to improve feature extraction and feeds the refined gene embeddings into an output module to infer gene regulatory relationships. To evaluate GRLGRN, we compared it with prevalent models and performed ablation experiments on seven cell-line datasets with three ground-truth networks. GRLGRN achieved the best AUROC on 78.6% of the datasets and the best AUPRC on 80.9%, with average improvements of 7.3% in AUROC and 30.7% in AUPRC. We also discuss the interpretation of the model and present network visualizations.
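To make the idea of "implicit links" concrete, the following is a minimal sketch of one graph-transformer-style step, written under stated assumptions rather than taken from the GRLGRN implementation. It assumes three candidate edge types (the prior edges, their reverse, and self-loops), softly selects among them with softmax weights, and multiplies two such selections to score two-hop paths. The function names, the choice of edge types, and the hand-picked weights are all hypothetical; in a trained model the weights would be learned.

```python
import math

def softmax(weights):
    """Numerically stable softmax over a list of floats."""
    m = max(weights)
    exps = [math.exp(w - m) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]

def soft_select(candidates, weights):
    """Convex combination of candidate adjacency matrices (soft edge-type selection)."""
    n = len(candidates[0])
    alpha = softmax(weights)
    return [[sum(a * m[i][j] for a, m in zip(alpha, candidates))
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain square matrix product; composes one-hop links into two-hop links."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def implicit_links(prior, w1, w2):
    """Score two-hop (implicit) links from a prior adjacency matrix.

    Assumed candidate edge types: forward prior edges, reversed edges, self-loops.
    """
    n = len(prior)
    reverse = [[prior[j][i] for j in range(n)] for i in range(n)]
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    candidates = [prior, reverse, identity]
    return matmul(soft_select(candidates, w1), soft_select(candidates, w2))

# Toy prior GRN: gene 0 regulates gene 1, and gene 1 regulates gene 2.
prior = [[0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0],
         [0.0, 0.0, 0.0]]

# Hand-picked weights that favour forward edges (hypothetical, not learned).
scores = implicit_links(prior, w1=[5.0, 0.0, 0.0], w2=[5.0, 0.0, 0.0])
```

With these weights, the pair (gene 0, gene 2) receives a high score even though the prior GRN has no direct edge between them, which is the sense in which the link is implicit.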

Conclusions: The experimental results and case studies illustrate the considerable performance of GRLGRN in predicting gene interactions and show that the model lends interpretability to the prediction task, for example by identifying hub genes in the network and uncovering implicit links.
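The abstract reports performance in AUROC and AUPRC. As a rough illustration of what these metrics measure for edge prediction, the sketch below computes both from binary edge labels and predicted scores in plain Python. It assumes AUPRC is estimated by average precision, which is one common estimator; the abstract does not state which estimator the authors used, and the function names here are illustrative.

```python
def auroc(labels, scores):
    """Probability that a random true edge outscores a random non-edge (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auprc(labels, scores):
    """Average precision: mean of the precision values at the rank of each true edge.

    Tied scores keep their input order in this sketch.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return precision_sum / hits

# Toy example: five candidate TF-target pairs, two of which are true edges.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
roc = auroc(labels, scores)
pr = auprc(labels, scores)
```

Because true regulatory edges are usually a small fraction of all candidate gene pairs, AUPRC tends to be more sensitive than AUROC to how the top-ranked predictions are ordered, which is one reason the two metrics are often reported together.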

Source journal
BMC Bioinformatics (Biology: Biochemical Research Methods)
CiteScore: 5.70
Self-citation rate: 3.30%
Articles published: 506
Review time: 4.3 months
About the journal: BMC Bioinformatics is an open access, peer-reviewed journal that considers articles on all aspects of the development, testing and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology. BMC Bioinformatics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.