Using Alternative Definitions of Controls to Increase Statistical Power in GWAS.

Impact factor 2.6; Zone 4 (Medicine); JCR Q2 (Behavioral Sciences)
Behavior Genetics Pub Date : 2024-07-01 Epub Date: 2024-06-13 DOI:10.1007/s10519-024-10187-w
Sarah E Benstock, Katherine Weaver, John M Hettema, Brad Verhulst
Behavior Genetics, pp. 353-366. https://doi.org/10.1007/s10519-024-10187-w

Abstract

Genome-wide association studies (GWAS) are often underpowered because common single nucleotide polymorphisms (SNPs) have small effects on phenotypes and multiple-testing thresholds are stringent. The most common approach for increasing statistical power is to increase sample size. We propose an alternative strategy: redefining case-control outcomes as ordinal case-subthreshold-asymptomatic variables. While maintaining the clinical case threshold, we subdivide controls into two groups: individuals who are symptomatic but do not meet the clinical criteria for diagnosis (subthreshold) and individuals who are effectively asymptomatic. We conducted a simulation study to examine the impact of effect size, minor allele frequency, population prevalence, and the prevalence of the subthreshold group on statistical power to detect genetic associations in three scenarios: a standard case-control analysis, an ordinal analysis, and a case-asymptomatic control analysis. Our results suggest that the ordinal model consistently provides the greatest statistical power and the case-control model the least. Power in the case-asymptomatic control model resembles that of either the case-control or the ordinal model, depending on the population prevalence and the size of the subthreshold category. We then analyzed a major depression phenotype from the UK Biobank to corroborate our simulation results. Overall, the ordinal model improves statistical power in GWAS by an amount consistent with increasing the sample size by approximately 10%.
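
The three phenotype definitions compared in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation code. It assumes a liability-threshold model, and it replaces the regression models with a simple correlation-based trend test, where (n - 1) r² is compared against a chi-square distribution with 1 degree of freedom. The function names (`classify`, `trend_p_value`, `simulate_power`) and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import NormalDist


def classify(liability, case_cut, sub_cut):
    """Return 2 (case), 1 (subthreshold control) or 0 (asymptomatic control)."""
    if liability >= case_cut:
        return 2
    if liability >= sub_cut:
        return 1
    return 0


def trend_p_value(genotypes, scores):
    """P-value of a linear trend test: (n - 1) * r^2 ~ chi-square(1) under the null."""
    n = len(genotypes)
    if n < 3:
        return 1.0
    mean_g = sum(genotypes) / n
    mean_s = sum(scores) / n
    ss_g = sum((g - mean_g) ** 2 for g in genotypes)
    ss_s = sum((s - mean_s) ** 2 for s in scores)
    if ss_g == 0 or ss_s == 0:
        return 1.0  # no variation, so no evidence of association
    cross = sum((g - mean_g) * (s - mean_s) for g, s in zip(genotypes, scores))
    stat = (n - 1) * cross * cross / (ss_g * ss_s)
    # Upper tail of chi-square(1) expressed through the complementary error function.
    return math.erfc(math.sqrt(stat / 2.0))


def simulate_power(n=1000, maf=0.3, beta=0.15, prevalence=0.10,
                   sub_prevalence=0.20, reps=200, alpha=0.05, seed=1):
    """Estimate power of three analyses under an assumed liability-threshold model."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    case_cut = std_normal.inv_cdf(1.0 - prevalence)
    sub_cut = std_normal.inv_cdf(1.0 - prevalence - sub_prevalence)
    # Assumption: scale the residual so total liability variance stays at 1.
    resid_sd = math.sqrt(1.0 - beta ** 2 * 2.0 * maf * (1.0 - maf))
    hits = {"case_control": 0, "ordinal": 0, "case_asymptomatic": 0}
    for _ in range(reps):
        # Additive genotype: number of minor alleles (0, 1 or 2).
        geno = [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]
        status = [
            classify(beta * (g - 2.0 * maf) + rng.gauss(0.0, resid_sd),
                     case_cut, sub_cut)
            for g in geno
        ]
        binary = [1 if s == 2 else 0 for s in status]
        # Case-asymptomatic analysis: drop the subthreshold group entirely.
        kept = [(g, b) for g, s, b in zip(geno, status, binary) if s != 1]
        p_values = {
            "case_control": trend_p_value(geno, binary),
            "ordinal": trend_p_value(geno, status),
            "case_asymptomatic": trend_p_value([g for g, _ in kept],
                                               [b for _, b in kept]),
        }
        for name, p in p_values.items():
            hits[name] += p < alpha
    return {name: count / reps for name, count in hits.items()}


if __name__ == "__main__":
    for name, power in simulate_power().items():
        print(f"{name}: {power:.2f}")
```

Running the script prints one estimated power value per analysis. The nominal `alpha=0.05` keeps the example fast; a genome-wide threshold such as 5e-8 would need far larger `n` and `reps`. Exact values depend on the seed and on the assumed parameters, so the output should be read as a qualitative illustration rather than a reproduction of the paper's results.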


Source journal: Behavior Genetics (Biology: Behavioral Sciences)
CiteScore: 4.90
Self-citation rate: 7.70%
Articles published: 30
Review time: 6-12 weeks
Journal description: Behavior Genetics, the leading journal concerned with the genetic analysis of complex traits, is published in cooperation with the Behavior Genetics Association. This timely journal disseminates the most current original research on the inheritance and evolution of behavioral characteristics in man and other species. Contributions from eminent international researchers focus on both the application of various genetic perspectives to the study of behavioral characteristics and the influence of behavioral differences on the genetic structure of populations.