Mathematical bounds on r² and the effect size in case-control genome-wide association studies

Impact factor 1.2; Zone 4, Biology; JCR Q4, Ecology
Sanjana M. Paye, Michael D. Edge
Journal: Theoretical Population Biology, volume 164, pages 1-11
Published: 2025-05-15 (journal article)
DOI: 10.1016/j.tpb.2025.04.003
URL: https://www.sciencedirect.com/science/article/pii/S0040580925000280

Abstract

Case-control genome-wide association studies (GWAS) are often used to find associations between genetic variants and diseases. When designing a case-control GWAS, researchers must decide how many cases and how many controls to include. Connections between variants and diseases are made using association statistics, including χ². Previous work in population genetics has shown that linkage disequilibrium (LD) statistics, including r², are bounded by the allele frequencies in the population being studied. Since varying the case fraction changes sample allele frequencies, we use the known bounds on r² to explore how the fraction of cases included in a study can affect statistical power to detect associations. We analyze a simple mathematical model and use simulations to study a quantity proportional to the χ² noncentrality parameter, which is closely related to r², under various conditions. Varying the case fraction changes the χ² noncentrality parameter, and by extension the statistical power, with effects depending on the dominance, penetrance, and frequency of the risk allele. Our framework explains previously observed results, such as asymmetries in power to detect risk vs. protective alleles, and the fact that a balanced sample of cases and controls does not always give the best power to detect associations, particularly for highly penetrant minor risk alleles that are either dominant or recessive. We show by simulation that our results can be used as a rough guide to statistical power for association tests other than χ² tests of independence.
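As a rough illustration of the link between r² and the χ² statistic described in the abstract, the sketch below treats only the simplest allelic 2×2 case (allele by case status). It is not the authors' model, which also accounts for dominance and penetrance; the function names and the example allele frequencies (0.3 in cases, 0.1 in controls) are assumptions chosen for illustration. It relies on two standard facts: the Pearson χ² statistic of a 2×2 table equals the number of observations times r², and r² between two binary variables is bounded by their marginal frequencies.

```python
def pooled_allele_freq(case_fraction, p_case, p_control):
    """Allele frequency in the pooled sample for a given case fraction."""
    return case_fraction * p_case + (1 - case_fraction) * p_control


def sample_r2(case_fraction, p_case, p_control):
    """r^2 between allele identity and case status in the pooled sample.

    Assumes an allelic 2x2 table (hypothetical simplification): each allele
    is one observation, classified by allele type and by case/control status.
    """
    p_sample = pooled_allele_freq(case_fraction, p_case, p_control)
    # Covariance (the "D" of LD) between carrying the allele and being a case.
    d = case_fraction * (1 - case_fraction) * (p_case - p_control)
    denom = case_fraction * (1 - case_fraction) * p_sample * (1 - p_sample)
    return d ** 2 / denom


def r2_upper_bound(p_a, p_b):
    """Largest r^2 attainable by two binary variables with these marginals."""
    d_pos = min(p_a * (1 - p_b), (1 - p_a) * p_b)   # largest positive D
    d_neg = min(p_a * p_b, (1 - p_a) * (1 - p_b))   # largest |negative D|
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return max(d_pos, d_neg) ** 2 / denom


def chi2_from_r2(n_alleles, r2):
    """Pearson chi-square of a 2x2 table: number of observations times r^2."""
    return n_alleles * r2


def best_case_fraction(p_case, p_control):
    """Case fraction on a 1% grid that maximizes r^2 (illustrative scan)."""
    grid = [i / 100 for i in range(1, 100)]
    return max(grid, key=lambda f: sample_r2(f, p_case, p_control))


if __name__ == "__main__":
    p_case, p_control = 0.3, 0.1  # assumed example frequencies
    for f in (0.2, 0.4, 0.5, 0.8):
        r2 = sample_r2(f, p_case, p_control)
        bound = r2_upper_bound(pooled_allele_freq(f, p_case, p_control), f)
        print(f, round(r2, 4), round(bound, 4), round(chi2_from_r2(2000, r2), 1))
    print("r^2-maximizing case fraction:", best_case_fraction(p_case, p_control))
```

Because the pooled allele frequency moves with the case fraction, both r² and its upper bound change as cases are added or removed, so the r²-maximizing case fraction in this toy setting need not be 0.5.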
Source journal
Theoretical Population Biology (category: Biology, Evolutionary Biology)
CiteScore: 2.50
Self-citation rate: 14.30%
Articles published: 43
Review time: 6-12 weeks
About the journal: An interdisciplinary journal, Theoretical Population Biology presents articles on theoretical aspects of the biology of populations, particularly in the areas of demography, ecology, epidemiology, evolution, and genetics. Emphasis is on the development of mathematical theory and models that enhance the understanding of biological phenomena. Articles highlight the motivation and significance of the work for advancing progress in biology, relying on a substantial mathematical effort to obtain biological insight. The journal also presents empirical results and computational and statistical methods directly impinging on theoretical problems in population biology.