Tree-based differential testing using inferential uncertainty for RNA-seq

IF 5.5 | Zone 2 (Biology) | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY

Genome Research | Pub Date: 2025-08-21 | DOI: 10.1101/gr.279981.124

Noor P Singh, Euphy Wu, Jason Fan, Michael I Love, Rob Patro


Citations: 0

Abstract

Identifying differentially expressed transcripts is a crucial yet challenging problem in transcriptomics. The abundance estimates of certain transcripts carry substantial uncertainty, which, if ignored, can inflate the number of false positives and, if included, may reduce power. Given a set of RNA-seq samples, TreeTerminus arranges transcripts in a hierarchical tree structure that encodes different layers of resolution for interpreting the abundance of transcriptional groups, with uncertainty generally decreasing as one ascends the tree from the leaves. We introduce mehenDi, which utilizes the tree structure from TreeTerminus for differential testing. The nodes output by mehenDi, called the selected nodes, are determined in a data-driven manner to maximize the signal that can be extracted from the data while controlling for the uncertainty associated with estimating the transcript abundances. The selected nodes can include transcripts and inner nodes, with no two nodes having an ancestor/descendant relationship. We evaluated our method on both simulated and experimental datasets, comparing its performance with other tree-based differential methods as well as with uncertainty-aware differential transcript/gene expression methods. Our method detects inner nodes that show a strong signal for differential expression, which would have been overlooked when analyzing the transcripts alone.
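The two structural ideas in the abstract, that uncertainty tends to shrink when transcripts are aggregated at an inner node and that selected nodes never stand in an ancestor/descendant relationship, can be illustrated with a small sketch. This is not mehenDi's selection procedure or the TreeTerminus API. The tree, the replicate values, and every function name below are hypothetical, and the relative variance across replicates is only a stand-in for whatever uncertainty measure the method actually uses.

```python
from statistics import mean, pvariance

# Hypothetical toy tree (child -> parent). Leaves t1..t3 are transcripts.
PARENT = {"t1": "n1", "t2": "n1", "n1": "root", "t3": "root"}

# Hypothetical inferential replicates of abundance for each leaf in one sample.
# t1 and t2 share ambiguous reads, so their estimates trade off against each other.
LEAF_REPLICATES = {
    "t1": [10.0, 30.0, 20.0, 40.0],
    "t2": [40.0, 20.0, 30.0, 10.0],
    "t3": [5.0, 6.0, 5.0, 6.0],
}


def ancestors(node):
    """Return all proper ancestors of a node, nearest first."""
    out = []
    while node in PARENT:
        node = PARENT[node]
        out.append(node)
    return out


def leaves_under(node):
    """Return the leaves in the subtree rooted at node (a leaf returns itself)."""
    return sorted(l for l in LEAF_REPLICATES if l == node or node in ancestors(l))


def node_replicates(node):
    """Abundance of a node per replicate: the sum over its descendant leaves."""
    columns = [LEAF_REPLICATES[l] for l in leaves_under(node)]
    return [sum(values) for values in zip(*columns)]


def relative_variance(values):
    """Variance across replicates divided by the squared mean (assumed measure)."""
    m = mean(values)
    return pvariance(values) / (m * m)


def is_antichain(nodes):
    """True if no chosen node is an ancestor of another chosen node."""
    chosen = set(nodes)
    return all(chosen.isdisjoint(ancestors(n)) for n in chosen)


if __name__ == "__main__":
    for name in ["t1", "t2", "n1"]:
        print(name, relative_variance(node_replicates(name)))
    print(is_antichain(["n1", "t3"]), is_antichain(["n1", "t1"]))
```

In this toy data the replicates of t1 and t2 vary widely while their sum at n1 is constant, so a test at n1 faces far less inferential uncertainty than a test at either transcript. A set such as {n1, t3} satisfies the no ancestor/descendant constraint, whereas {n1, t1} does not.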


Source journal

Genome Research (Biology: Biochemistry & Molecular Biology)

CiteScore: 12.40

Self-citation rate: 1.40%

Articles published: 140

Review time: 6 months

About the journal: Launched in 1995, Genome Research is an international, continuously published, peer-reviewed journal that focuses on research that provides novel insights into the genome biology of all organisms, including advances in genomic medicine. Among the topics considered by the journal are genome structure and function, comparative genomics, molecular evolution, genome-scale quantitative and population genetics, proteomics, epigenomics, and systems biology. The journal also features exciting gene discoveries and reports of cutting-edge computational biology and high-throughput methodologies. New data in these areas are published as research papers, or methods and resource reports that provide novel information on technologies or tools that will be of interest to a broad readership. Complete data sets are presented electronically on the journal's web site where appropriate. The journal also provides Reviews, Perspectives, and Insight/Outlook articles, which present commentary on the latest advances published both here and elsewhere, placing such progress in its broader biological context.