Julie-Alexia Dias, Tony Chen, Hua Xing, Xiaoyu Wang, Alex A Rodriguez, Ravi K Madduri, Peter Kraft, Haoyu Zhang
American Journal of Human Genetics, pp. 2493-2508. Published 2025-10-02 (Epub 2025-09-02). DOI: 10.1016/j.ajhg.2025.08.006 (https://doi.org/10.1016/j.ajhg.2025.08.006)
Citations: 0
Abstract
Evaluating multi-ancestry genome-wide association methods: Statistical power, population structure, and practical implications.
The increasing availability of diverse biobanks has enabled multi-ancestry genome-wide association studies (GWASs) to enhance the discovery of genetic variants across traits and diseases. However, the choice of an optimal method remains debated, due to challenges in statistical power differences across ancestral groups and approaches to account for population structure. Two primary strategies exist: (1) pooled analysis, which combines individuals from all genetic backgrounds into a single dataset while adjusting for population stratification using principal components, increasing the sample size and statistical power but requiring careful control of population stratification; and (2) meta-analysis, which performs ancestry-group-specific GWASs and subsequently combines summary statistics, potentially capturing fine-scale population structure but facing limitations in handling admixed individuals. Using large-scale simulations with varying sample sizes and ancestry compositions, we compare these methods alongside real data analyses of eight continuous and five binary traits from the UK Biobank (N ≈ 324,000) and the All of Us Research Program (N ≈ 207,000). Our results demonstrate that pooled analysis generally exhibits better statistical power while effectively adjusting for population stratification. We further present a theoretical framework linking power differences to allele-frequency variations across populations. These findings, validated across both biobanks, highlight pooled analysis as a powerful and scalable strategy for multi-ancestry GWASs, improving genetic discovery while maintaining rigorous population structure control.
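The two strategies named in the abstract can be sketched for a single variant as follows. This is a minimal illustration, not the authors' pipeline or simulation design: the sample sizes, allele frequencies, effect size, and the helper names `ols_snp` and `ivw_meta` are all assumptions made for the example, a 0/1 group indicator stands in for the principal components, and the meta-analysis step uses a standard inverse-variance-weighted fixed-effect combination, which the abstract does not specify.

```python
import numpy as np


def ols_snp(y, g, covars=None):
    """Fit y ~ intercept + g (+ covars) by OLS; return (beta, se) for g."""
    n = len(y)
    cols = [np.ones(n), g]
    if covars is not None:
        cols.append(covars)  # covars is expected to have shape (n, k)
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (n - X.shape[1])  # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[1], np.sqrt(cov[1, 1])


def ivw_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect combination of group estimates."""
    betas, ses = np.asarray(betas), np.asarray(ses)
    w = 1.0 / ses**2
    return np.sum(w * betas) / np.sum(w), np.sqrt(1.0 / np.sum(w))


# Hypothetical toy data: two groups with different allele frequencies,
# a shared per-allele effect, and a group-level shift in the trait mean.
rng = np.random.default_rng(0)
n_per_group, effect = 5000, 0.05
allele_freqs = [0.10, 0.40]

g_parts, y_parts, label_parts = [], [], []
for k, freq in enumerate(allele_freqs):
    g = rng.binomial(2, freq, n_per_group).astype(float)
    y = effect * g + 0.3 * k + rng.normal(size=n_per_group)
    g_parts.append(g)
    y_parts.append(y)
    label_parts.append(np.full(n_per_group, float(k)))

g_all = np.concatenate(g_parts)
y_all = np.concatenate(y_parts)
group = np.concatenate(label_parts)

# (1) Pooled analysis: one regression on everyone, adjusting for structure.
# The group indicator is a stand-in for principal components here.
beta_pooled, se_pooled = ols_snp(y_all, g_all, covars=group.reshape(-1, 1))

# (2) Meta-analysis: one regression per group, then combine summary statistics.
per_group = [ols_snp(y, g) for y, g in zip(y_parts, g_parts)]
beta_meta, se_meta = ivw_meta([b for b, _ in per_group], [s for _, s in per_group])

print(f"pooled: beta={beta_pooled:.4f}, se={se_pooled:.4f}")
print(f"meta:   beta={beta_meta:.4f}, se={se_meta:.4f}")
```

With two discrete groups, a shared effect, and equal residual variance, the two estimates should land very close to each other, so this sketch shows only the mechanics of each approach. It does not model admixed individuals or fine-scale structure, which are the settings the abstract identifies as distinguishing the methods.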
About the journal:
The American Journal of Human Genetics (AJHG) is a monthly journal published by Cell Press, chosen by The American Society of Human Genetics (ASHG) as its premier publication starting from January 2008. AJHG represents Cell Press's first society-owned journal, and both ASHG and Cell Press anticipate significant synergies between AJHG content and that of other Cell Press titles.