Differences in set-based tests for sparse alternatives when testing sets of outcomes compared to sets of explanatory factors in genetic association studies.

IF 1.8 3区 数学 Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY
Ryan Sun, Andy Shi, Xihong Lin
{"title":"Differences in set-based tests for sparse alternatives when testing sets of outcomes compared to sets of explanatory factors in genetic association studies.","authors":"Ryan Sun, Andy Shi, Xihong Lin","doi":"10.1093/biostatistics/kxac036","DOIUrl":null,"url":null,"abstract":"<p><p>Set-based association tests are widely popular in genetic association settings for their ability to aggregate weak signals and reduce multiple testing burdens. In particular, a class of set-based tests including the Higher Criticism, Berk-Jones, and other statistics have recently been popularized for reaching a so-called detection boundary when signals are rare and weak. Such tests have been applied in two subtly different settings: (a) associating a genetic variant set with a single phenotype and (b) associating a single genetic variant with a phenotype set. A significant issue in practice is the choice of test, especially when deciding between innovated and generalized type methods for detection boundary tests. Conflicting guidance is present in the literature. This work describes how correlation structures generate marked differences in relative operating characteristics for settings (a) and (b). The implications for study design are significant. We also develop novel power bounds that facilitate the aforementioned calculations and allow for analysis of individual testing settings. In more concrete terms, our investigation is motivated by translational expression quantitative trait loci (eQTL) studies in lung cancer. These studies involve both testing for groups of variants associated with a single gene expression (multiple explanatory factors) and testing whether a single variant is associated with a group of gene expressions (multiple outcomes). Results are supported by a collection of simulation studies and illustrated through lung cancer eQTL examples.</p>","PeriodicalId":55357,"journal":{"name":"Biostatistics","volume":null,"pages":null},"PeriodicalIF":1.8000,"publicationDate":"2023-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10724113/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Biostatistics","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1093/biostatistics/kxac036","RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"MATHEMATICAL & COMPUTATIONAL BIOLOGY","Score":null,"Total":0}
引用次数: 0

Abstract

Set-based association tests are widely popular in genetic association settings for their ability to aggregate weak signals and reduce multiple testing burdens. In particular, a class of set-based tests including the Higher Criticism, Berk-Jones, and other statistics have recently been popularized for reaching a so-called detection boundary when signals are rare and weak. Such tests have been applied in two subtly different settings: (a) associating a genetic variant set with a single phenotype and (b) associating a single genetic variant with a phenotype set. A significant issue in practice is the choice of test, especially when deciding between innovated and generalized type methods for detection boundary tests. Conflicting guidance is present in the literature. This work describes how correlation structures generate marked differences in relative operating characteristics for settings (a) and (b). The implications for study design are significant. We also develop novel power bounds that facilitate the aforementioned calculations and allow for analysis of individual testing settings. In more concrete terms, our investigation is motivated by translational expression quantitative trait loci (eQTL) studies in lung cancer. These studies involve both testing for groups of variants associated with a single gene expression (multiple explanatory factors) and testing whether a single variant is associated with a group of gene expressions (multiple outcomes). Results are supported by a collection of simulation studies and illustrated through lung cancer eQTL examples.

在遗传关联研究中,基于集合的稀疏替代品测试在测试结果集合与解释因素集合时的差异。
基于集合的关联检验因其能够聚合微弱信号并减轻多重检验负担而在遗传关联研究中广受欢迎。尤其是最近流行的一类基于集合的检验,包括高级批判、Berk-Jones 和其他统计方法,可以在信号稀少而微弱时达到所谓的检测边界。这类检验被应用于两种有细微差别的情况:(a) 将遗传变异集与单一表型联系起来;(b) 将单一遗传变异与表型集联系起来。在实践中,一个重要的问题是检验方法的选择,特别是在决定用创新型方法还是广义型方法进行检测边界检验时。文献中存在相互矛盾的指导意见。这项工作描述了相关结构如何在设置(a)和(b)中产生相对运行特征的明显差异。这对研究设计具有重要意义。我们还开发了新的功率边界,以方便上述计算,并允许对个别测试设置进行分析。更具体地说,我们的研究是受肺癌表达量性状位点(eQTL)转化研究的启发。这些研究既包括测试与单个基因表达(多个解释因素)相关的变异群,也包括测试单个变异是否与一组基因表达(多个结果)相关。结果得到了一系列模拟研究的支持,并通过肺癌 eQTL 示例进行了说明。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Biostatistics
Biostatistics 生物-数学与计算生物学
CiteScore
5.10
自引率
4.80%
发文量
45
审稿时长
6-12 weeks
期刊介绍: Among the important scientific developments of the 20th century is the explosive growth in statistical reasoning and methods for application to studies of human health. Examples include developments in likelihood methods for inference, epidemiologic statistics, clinical trials, survival analysis, and statistical genetics. Substantive problems in public health and biomedical research have fueled the development of statistical methods, which in turn have improved our ability to draw valid inferences from data. The objective of Biostatistics is to advance statistical science and its application to problems of human health and disease, with the ultimate goal of advancing the public''s health.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信