A survey of the methodological process of modeling, inference, and evaluation of gene regulatory networks using scRNA-Seq data

Impact factor: 2.0 | Region 4 (Biology) | JCR Q2 (BIOLOGY)
José Eduardo H. da Silva, Heder S. Bernardino, Itamar L. de Oliveira, José J. Camata
Biosystems, vol. 253, Article 105464. Published 2025-05-21. DOI: 10.1016/j.biosystems.2025.105464
https://www.sciencedirect.com/science/article/pii/S0303264725000747
Citations: 0

Abstract

A survey of the methodological process of modeling, inference, and evaluation of gene regulatory networks using scRNA-Seq data
The advent of scRNA-Seq technology has provided unprecedented resolution in the analysis of gene regulatory networks (GRNs) at the single-cell level. However, new technical and methodological challenges have also emerged. Factors such as the large number of zeros reported in expression levels, the biological variation due to the stochastic nature of gene expression, the environmental niche, and cell-cycle effects make it difficult to correctly interpret the data obtained in the sequencing stage. Moreover, methods developed to infer GRNs specifically from scRNA-Seq data have proved to be of similar quality to random predictors. The lack of adequate pre-processing of gene expression data (including selection of subsets of genes of interest, smoothing, and discretization of gene expression) and the different ways of modeling networks and network motifs are factors that affect the performance of inference approaches. Finally, the lack of knowledge about the ground-truth network and the absence of standardized metrics for measuring the quality of inferred networks make comparing performance between algorithms a major problem, given the unbalanced nature of the data and the interpretation bias caused by the chosen metric. This article brings these issues to light: through comparative computational experiments, it aims to show how these factors influence both the inference process and the performance evaluation of inferred networks, and it provides suggestions for a more robust methodological process for researchers dealing with inference of GRNs.
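The point about unbalanced data and metric-induced bias can be illustrated with a small sketch. It is not taken from the article's experiments: the gene names, the ground-truth edges, and the helper functions below are all hypothetical, chosen only to show why accuracy can look good on a sparse network while precision, recall, and the random-predictor baseline (the edge density) tell a different story.

```python
from itertools import permutations


def precision_recall(predicted_edges, true_edges):
    """Precision and recall of predicted directed edges against a reference set."""
    predicted, truth = set(predicted_edges), set(true_edges)
    tp = len(predicted & truth)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    return precision, recall


def random_baseline_precision(genes, true_edges):
    """Expected precision of a uniform random edge predictor (the edge density)."""
    n_possible = len(genes) * (len(genes) - 1)  # directed edges, no self-loops
    return len(set(true_edges)) / n_possible


def accuracy(predicted_edges, true_edges, genes):
    """Fraction of all possible directed edges classified correctly."""
    predicted, truth = set(predicted_edges), set(true_edges)
    possible = list(permutations(genes, 2))
    correct = sum((edge in predicted) == (edge in truth) for edge in possible)
    return correct / len(possible)


# Hypothetical toy setup: 10 genes, so 90 possible directed edges, 5 of them true.
genes = [f"g{i}" for i in range(10)]
true_edges = [("g0", "g1"), ("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g4", "g5")]

empty_prediction = []  # a "method" that predicts no edges at all
predicted = [("g0", "g1"), ("g1", "g2"), ("g5", "g6"), ("g7", "g8")]

if __name__ == "__main__":
    print("accuracy (no edges predicted):", accuracy(empty_prediction, true_edges, genes))
    print("accuracy (4 edges predicted): ", accuracy(predicted, true_edges, genes))
    print("precision, recall:            ", precision_recall(predicted, true_edges))
    print("random baseline precision:    ", random_baseline_precision(genes, true_edges))
```

In this toy case, predicting no edges at all classifies 85 of the 90 possible edges correctly, exactly the same accuracy as the four-edge prediction, so accuracy cannot separate the two. Precision (0.5) compared against the edge density (5/90) is more informative, which is one reason a score is often reported relative to the random baseline when the reference network is sparse.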
Source journal
Biosystems (Biology)
CiteScore: 3.70
Self-citation rate: 18.80%
Articles published: 129
Review time: 34 days
Journal description: BioSystems encourages experimental, computational, and theoretical articles that link biology, evolutionary thinking, and the information processing sciences. The link areas form a circle that encompasses the fundamental nature of biological information processing, computational modeling of complex biological systems, evolutionary models of computation, the application of biological principles to the design of novel computing systems, and the use of biomolecular materials to synthesize artificial systems that capture essential principles of natural biological information processing.