Leveraging Global Genetics Resources to Enhance Polygenic Prediction Across Ancestrally Diverse Populations
Oliver Pain
HGG Advances, 100482. Published 2025-07-18.
DOI: 10.1016/j.xhgg.2025.100482 (https://doi.org/10.1016/j.xhgg.2025.100482)
Citations: 0
Abstract
Genome-wide association studies (GWAS) from multiple ancestral populations are increasingly available, offering opportunities to improve the accuracy and equity of polygenic scores (PGS). Several methods now aim to leverage multiple GWAS sources, but predictive performance and computational efficiency remain unclear, particularly when individual-level tuning data are unavailable. This study evaluates a comprehensive set of PGS methods across African (AFR), East Asian (EAS), and European (EUR) ancestries for 10 complex traits, using summary statistics from the Ugandan Genome Resource, Biobank Japan, UK Biobank, and the Million Veteran Program. Single-source PGS were derived using methods including DBSLMM, lassosum, LDpred2, MegaPRS, pT+clump, PRS-CS, QuickPRS, and SBayesRC. Multi-source approaches included PRS-CSx, TL-PRS, X-Wing, and combinations of independently optimised single-source scores. All methods were restricted to HapMap3 variants and used linkage disequilibrium reference panels matching the GWAS super population. A key contribution is a novel application of the LEOPARD method to estimate optimal linear combinations of population-specific PGS using only summary statistics. Analyses were implemented using the open-source GenoPred pipeline. In AFR and EAS populations, PGS combining ancestry-aligned and European GWAS outperformed single-source models. Linear combinations of independently optimised scores consistently outperformed current jointly optimised multi-source methods, while being substantially more computationally efficient. The LEOPARD extension offered a practical solution for tuning these combinations when only summary statistics were available, achieving performance comparable to tuning with individual-level data. These findings highlight a flexible and generalisable framework for multi-source PGS construction. The GenoPred pipeline supports more equitable, accurate, and accessible polygenic prediction.
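To make the idea of a linear combination of population-specific scores concrete, the following is a minimal sketch of the simpler tuning case, in which individual-level data are available. It is not the LEOPARD procedure (which works from summary statistics alone) and it is not the GenoPred interface; every function name here (`standardise`, `fit_two_score_weights`, `combine`, `correlation`, `simulate`) is hypothetical, and the data are simulated purely for illustration.

```python
import random
import statistics


def standardise(values):
    """Scale a list of scores to mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def fit_two_score_weights(pgs_a, pgs_b, phenotype):
    """Least-squares weights for phenotype ~ pgs_a + pgs_b.

    All inputs are standardised first, so no intercept is needed and the
    2x2 normal equations can be solved in closed form.
    """
    a, b, y = standardise(pgs_a), standardise(pgs_b), standardise(phenotype)
    saa = sum(x * x for x in a)
    sbb = sum(x * x for x in b)
    sab = sum(x * z for x, z in zip(a, b))
    say = sum(x * z for x, z in zip(a, y))
    sby = sum(x * z for x, z in zip(b, y))
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("scores are perfectly collinear")
    w_a = (say * sbb - sby * sab) / det
    w_b = (sby * saa - say * sab) / det
    return w_a, w_b


def combine(pgs_a, pgs_b, w_a, w_b):
    """Weighted sum of the two standardised scores."""
    a, b = standardise(pgs_a), standardise(pgs_b)
    return [w_a * x + w_b * z for x, z in zip(a, b)]


def correlation(x, y):
    """Pearson correlation between two equal-length lists."""
    xs, ys = standardise(x), standardise(y)
    return sum(p * q for p, q in zip(xs, ys)) / len(xs)


def simulate(n, seed=0):
    """Simulated tuning sample (an assumption, not data from the study).

    pgs_a mimics a noisier score from a smaller ancestry-matched GWAS,
    pgs_b a less noisy score from a larger GWAS in another population.
    """
    rng = random.Random(seed)
    genetic = [rng.gauss(0, 1) for _ in range(n)]
    pgs_a = [g + rng.gauss(0, 2) for g in genetic]
    pgs_b = [g + rng.gauss(0, 1) for g in genetic]
    phenotype = [g + rng.gauss(0, 1) for g in genetic]
    return pgs_a, pgs_b, phenotype


if __name__ == "__main__":
    pgs_a, pgs_b, y = simulate(2000, seed=1)
    w_a, w_b = fit_two_score_weights(pgs_a, pgs_b, y)
    combined = combine(pgs_a, pgs_b, w_a, w_b)
    print("weights:", w_a, w_b)
    print("r(pgs_a):", correlation(pgs_a, y))
    print("r(pgs_b):", correlation(pgs_b, y))
    print("r(combined):", correlation(combined, y))
```

In the tuning sample itself, the combined score cannot correlate less strongly with the phenotype than either input score, because the least-squares fit includes each single-score model as a special case. A real evaluation would estimate the weights in one sample and measure performance in an independent one.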