Comparison of methods for building polygenic scores for diverse populations.

IF 3.3 Q2 GENETICS & HEREDITY

HGG Advances. Pub Date: 2025-01-09. Epub Date: 2024-09-25. DOI: 10.1016/j.xhgg.2024.100355

Sophia Gunn, Xin Wang, Daniel C Posner, Kelly Cho, Jennifer E Huffman, Michael Gaziano, Peter W Wilson, Yan V Sun, Gina Peloso, Kathryn L Lunetta


Abstract

Polygenic scores (PGSs) are a promising tool for estimating individual-level genetic risk of disease based on the results of genome-wide association studies (GWASs). However, their promise has yet to be fully realized because most currently available PGSs were built with genetic data from predominantly European-ancestry populations, and PGS performance declines when scores are applied to target populations that differ from the populations in which they were derived. Thus, there is a great need to improve PGS performance in currently under-studied populations. In this work, we leverage data from two large and diverse cohorts, the Million Veteran Program (MVP) and All of Us (AoU), which gives us a unique opportunity to compare methods for building PGSs for multi-ancestry populations across multiple traits. We build PGSs for five continuous traits and five binary traits using both multi-ancestry and single-ancestry approaches, applying popular Bayesian PGS methods to both MVP META GWAS results and population-specific GWAS results from the respective African, European, and Hispanic MVP populations. We evaluate these scores in three AoU populations genetically similar to the respective African, Admixed American, and European 1000 Genomes Project superpopulations. Using correlation-based tests, we formally compare PGS performance across the three AoU populations. We conclude that approaches that combine GWAS data from multiple populations produce PGSs that perform better than approaches that use smaller single-population GWAS results matched to the target population, and specifically that multi-ancestry scores built with PRS-CSx outperform the other approaches in the three AoU populations.
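The abstract does not spell out the correlation-based tests, so the following is only an illustrative sketch of the general idea, not the authors' procedure. It assumes a PGS is a weighted sum of allele dosages and substitutes a simple paired bootstrap for the formal test, comparing how strongly two candidate scores correlate with a continuous trait in the same individuals. All data are simulated, and every name (`polygenic_score`, `bootstrap_corr_difference`, `weights_a`, `weights_b`) is hypothetical.

```python
import numpy as np


def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (individuals x variants) by per-variant weights."""
    return np.asarray(dosages, dtype=float) @ np.asarray(weights, dtype=float)


def bootstrap_corr_difference(trait, score_a, score_b, n_boot=500, seed=0):
    """Paired bootstrap of cor(trait, score_a) - cor(trait, score_b).

    Assumption: this stands in for a formal correlation-based test; it is
    not the procedure used in the paper.
    """
    rng = np.random.default_rng(seed)
    trait = np.asarray(trait, dtype=float)
    score_a = np.asarray(score_a, dtype=float)
    score_b = np.asarray(score_b, dtype=float)
    n = trait.shape[0]

    def corr_diff(idx):
        r_a = np.corrcoef(trait[idx], score_a[idx])[0, 1]
        r_b = np.corrcoef(trait[idx], score_b[idx])[0, 1]
        return r_a - r_b

    observed = corr_diff(np.arange(n))
    # Resample individuals with replacement, keeping trait and scores paired.
    diffs = np.array([corr_diff(rng.integers(0, n, size=n)) for _ in range(n_boot)])
    lower, upper = np.percentile(diffs, [2.5, 97.5])
    return observed, (lower, upper)


# Simulated example (no real cohort data).
rng = np.random.default_rng(42)
n_individuals, n_variants = 2000, 200
dosages = rng.binomial(2, 0.3, size=(n_individuals, n_variants))
true_effects = rng.normal(0.0, 0.1, size=n_variants)
trait = dosages @ true_effects + rng.normal(0.0, 1.0, size=n_individuals)

# Two hypothetical weight sets: A estimated precisely, B estimated noisily.
weights_a = true_effects + rng.normal(0.0, 0.02, size=n_variants)
weights_b = true_effects + rng.normal(0.0, 0.20, size=n_variants)

score_a = polygenic_score(dosages, weights_a)
score_b = polygenic_score(dosages, weights_b)

observed, ci = bootstrap_corr_difference(trait, score_a, score_b)
print(f"difference in correlation (A - B): {observed:.3f}")
print(f"95% bootstrap interval: ({ci[0]:.3f}, {ci[1]:.3f})")
```

A bootstrap interval that excludes zero suggests that one score tracks the trait more closely than the other in that sample. For the binary traits, a discrimination measure would likely replace the Pearson correlation used here.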


Source journal