CorrAdjust unveils biologically relevant transcriptomic correlations by efficiently eliminating hidden confounders.

IF 16.6 2区 生物学 Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY
Stepan Nersisyan, Phillipe Loher, Isidore Rigoutsos
{"title":"CorrAdjust unveils biologically relevant transcriptomic correlations by efficiently eliminating hidden confounders.","authors":"Stepan Nersisyan, Phillipe Loher, Isidore Rigoutsos","doi":"10.1093/nar/gkaf444","DOIUrl":null,"url":null,"abstract":"<p><p>Correcting for confounding variables is often overlooked when computing RNA-RNA correlations, even though it can profoundly affect results. We introduce CorrAdjust, a method for identifying and correcting such hidden confounders. CorrAdjust selects a subset of principal components to residualize from expression data by maximizing the enrichment of \"reference pairs\" among highly correlated RNA-RNA pairs. Unlike traditional machine learning metrics, this novel enrichment-based metric is specifically designed to evaluate correlation data and provides valuable RNA-level interpretability. CorrAdjust outperforms current state-of-the-art methods when evaluated on 25 063 human RNA-seq datasets from The Cancer Genome Atlas, the Genotype-Tissue Expression project, and the Geuvadis collection. In particular, CorrAdjust excels at integrating small RNA and mRNA sequencing data, significantly enhancing the enrichment of experimentally validated miRNA targets among negatively correlated miRNA-mRNA pairs. CorrAdjust, with accompanying documentation and tutorials, is available at https://tju-cmc-org.github.io/CorrAdjust.</p>","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"53 10","pages":""},"PeriodicalIF":16.6000,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12125544/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nucleic Acids Research","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/nar/gkaf444","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMISTRY & MOLECULAR BIOLOGY","Score":null,"Total":0}
引用次数: 0

Abstract

Correcting for confounding variables is often overlooked when computing RNA-RNA correlations, even though it can profoundly affect results. We introduce CorrAdjust, a method for identifying and correcting such hidden confounders. CorrAdjust selects a subset of principal components to residualize from expression data by maximizing the enrichment of "reference pairs" among highly correlated RNA-RNA pairs. Unlike traditional machine learning metrics, this novel enrichment-based metric is specifically designed to evaluate correlation data and provides valuable RNA-level interpretability. CorrAdjust outperforms current state-of-the-art methods when evaluated on 25 063 human RNA-seq datasets from The Cancer Genome Atlas, the Genotype-Tissue Expression project, and the Geuvadis collection. In particular, CorrAdjust excels at integrating small RNA and mRNA sequencing data, significantly enhancing the enrichment of experimentally validated miRNA targets among negatively correlated miRNA-mRNA pairs. CorrAdjust, with accompanying documentation and tutorials, is available at https://tju-cmc-org.github.io/CorrAdjust.

CorrAdjust通过有效地消除隐藏的混杂因素揭示生物学相关的转录组相关性。
在计算RNA-RNA相关性时,校正混杂变量常常被忽略,尽管它会对结果产生深远的影响。我们介绍了CorrAdjust,一种识别和纠正这些隐藏混杂因素的方法。CorrAdjust通过在高度相关的RNA-RNA对中最大限度地富集“参考对”,从表达数据中选择一个主成分子集进行残差处理。与传统的机器学习指标不同,这种新颖的基于丰富度的指标专门用于评估相关数据,并提供有价值的rna水平的可解释性。当对来自癌症基因组图谱、基因型组织表达项目和Geuvadis收集的25063个人类RNA-seq数据集进行评估时,CorrAdjust优于目前最先进的方法。特别是,CorrAdjust擅长整合小RNA和mRNA测序数据,显著增强了负相关miRNA-mRNA对中实验验证的miRNA靶点的富集。CorrAdjust及其附带的文档和教程可在https://tju-cmc-org.github.io/CorrAdjust上获得。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Nucleic Acids Research
Nucleic Acids Research 生物-生化与分子生物学
CiteScore
27.10
自引率
4.70%
发文量
1057
审稿时长
2 months
期刊介绍: Nucleic Acids Research (NAR) is a scientific journal that publishes research on various aspects of nucleic acids and proteins involved in nucleic acid metabolism and interactions. It covers areas such as chemistry and synthetic biology, computational biology, gene regulation, chromatin and epigenetics, genome integrity, repair and replication, genomics, molecular biology, nucleic acid enzymes, RNA, and structural biology. The journal also includes a Survey and Summary section for brief reviews. Additionally, each year, the first issue is dedicated to biological databases, and an issue in July focuses on web-based software resources for the biological community. Nucleic Acids Research is indexed by several services including Abstracts on Hygiene and Communicable Diseases, Animal Breeding Abstracts, Agricultural Engineering Abstracts, Agbiotech News and Information, BIOSIS Previews, CAB Abstracts, and EMBASE.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信