Genome-wide multimediator analyses using the generalized Berk-Jones statistics with the composite test.

IF 4.4 | Zone 3, Biology | Q1, Biochemical Research Methods
En-Yu Lai, Yen-Tsung Huang
Bioinformatics 39(9), journal article, published 2023-09-02. DOI: https://doi.org/10.1093/bioinformatics/btad544. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10500087/pdf/

Abstract

Motivation: Mediation analysis evaluates the effect of a hypothesized causal mechanism that runs from an exposure, through mediators, to an outcome. In the age of high-throughput technologies, it has become routine to assess numerous potential mechanisms at the genome or proteome scale, which in turn makes it necessary to address multiple testing. In a sparse scenario where only a few genes or proteins are causally involved, conventional methods for assessing mediation effects lose statistical power because they do not attain the composite null distribution underlying the experiment. This power loss reduces the number of true mechanisms identified after multiple-testing correction. To obtain P-values that are uniformly distributed under the composite null, Huang (Genome-wide analyses of sparse mediation effects under composite null hypotheses. Ann Appl Stat 2019a;13:60-84; AoAS) proposed the composite test, which provides adjusted P-values for single-mediator analyses.
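The power loss under the composite null can be illustrated with a small simulation. The sketch below is not the authors' method: it assumes a single mediator whose exposure-mediator and mediator-outcome effects are tested with independent standard normal statistics, and it uses the max-P (joint significance) test as a stand-in for a "conventional" mediation test. The function names `two_sided_p` and `max_p_rejection_rate` are hypothetical.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def two_sided_p(z):
    """Two-sided P-value of a standard normal test statistic."""
    return 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z)))


def max_p_rejection_rate(mu_a, mu_b, n_tests=20000, level=0.05, seed=1):
    """Fraction of max-P tests rejected at `level`.

    Assumption: z_a ~ N(mu_a, 1) tests the exposure-mediator effect and
    z_b ~ N(mu_b, 1) tests the mediator-outcome effect, independently.
    The mediation null holds whenever mu_a == 0 or mu_b == 0.
    """
    rng = random.Random(seed)
    rejected = 0
    for _ in range(n_tests):
        p_a = two_sided_p(rng.gauss(mu_a, 1.0))
        p_b = two_sided_p(rng.gauss(mu_b, 1.0))
        # Max-P rule: declare mediation only if both effects are significant.
        if max(p_a, p_b) < level:
            rejected += 1
    return rejected / n_tests


# Two components of the composite null:
both_null = max_p_rejection_rate(0.0, 0.0)  # neither effect present
one_null = max_p_rejection_rate(5.0, 0.0)   # strong first effect, no second effect

print(f"type I error, both effects absent: {both_null:.4f}")
print(f"type I error, one effect absent:   {one_null:.4f}")
```

When both effects are absent, the two P-values are independent and uniform, so the max-P test rejects with probability 0.05 squared (0.0025) rather than the nominal 0.05. In a sparse genome-wide setting most mediators fall in this case, so the unadjusted test is far too conservative, which is the gap the composite test is designed to close.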

Results: We extend the method to multimediator analyses, which are commonly encountered in genomic studies and can accommodate a variety of biological questions. Using the generalized Berk-Jones statistics with the composite test, we propose a multivariate approach that favors dense and diverse mediation effects, a decorrelation approach that favors sparse and consistent effects, and a hybrid approach that captures the strengths of both. The analysis suite is implemented as the R package MACtest. Its utility is demonstrated by analyzing the lung adenocarcinoma datasets from The Cancer Genome Atlas and the Clinical Proteomic Tumor Analysis Consortium. We further investigate the genes and networks whose expression may be regulated by smoking-induced epigenetic aberrations.
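For readers unfamiliar with Berk-Jones-type statistics, the following is a minimal sketch of the classical one-sided Berk-Jones statistic for a set of independent P-values. It is an illustration only: the generalized version used in the paper additionally accounts for correlation among test statistics, and neither that generalization nor its combination with the composite test is reproduced here. The function names `_kl_bernoulli` and `berk_jones` are hypothetical and are not part of MACtest.

```python
import math


def _kl_bernoulli(x, y):
    """KL divergence from Bernoulli(y) to Bernoulli(x), with 0 * log(0) = 0."""
    total = 0.0
    if x > 0.0:
        total += x * math.log(x / y)
    if x < 1.0:
        total += (1.0 - x) * math.log((1.0 - x) / (1.0 - y))
    return total


def berk_jones(p_values):
    """Classical one-sided Berk-Jones statistic (independence assumed).

    For ordered P-values p_(1) <= ... <= p_(d), the statistic is the maximum of
    d * KL(i/d, p_(i)) over the indices i where p_(i) is smaller than i/d,
    i.e. where more small P-values are observed than the uniform null predicts.
    """
    d = len(p_values)
    stat = 0.0
    for i, p in enumerate(sorted(p_values), start=1):
        frac = i / d
        if 0.0 < p < frac:
            stat = max(stat, d * _kl_bernoulli(frac, p))
    return stat


# A set with one very small P-value scores far higher than a near-uniform set.
sparse_signal = berk_jones([1e-6, 0.4, 0.6, 0.9])
no_signal = berk_jones([0.2, 0.4, 0.6, 0.9])
print(sparse_signal > no_signal)
```

The statistic scans every possible significance threshold and keeps the one where the observed P-values deviate most from uniformity, which is why statistics of this family can adapt to both sparse and dense signals within a mediator set.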

Availability and implementation: The R package MACtest is available at https://github.com/roqe/MACtest.

Source journal: Bioinformatics