Jinguo Huang, Nicole Kleman, Saonli Basu, Mark D Shriver, Arslan A Zaidi
Interpreting SNP heritability in admixed populations
bioRxiv: the preprint server for biology. Published 2025-04-08. DOI: 10.1101/2023.08.04.551959 (https://doi.org/10.1101/2023.08.04.551959). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10418213/pdf/
Abstract
SNP heritability ($h^2_{snp}$) is defined as the proportion of phenotypic variance explained by genotyped SNPs and is believed to be a lower bound of heritability ($h^2$), being equal to it if all causal variants are genotyped. Despite the simple intuition behind $h^2_{snp}$, its interpretation and equivalence to $h^2$ are unclear, particularly in the presence of admixture and assortative mating. Here we use analytical theory and simulations to describe the behavior of $h^2$ and three widely used random-effect estimators of $h^2_{snp}$ in admixed populations: genome-wide restricted maximum likelihood (GREML), Haseman-Elston (HE) regression, and LD score regression (LDSC). We show that $h^2_{snp}$ estimates can be biased in admixed populations, even if all causal variants are genotyped and in the absence of confounding due to shared environment. This is largely because admixture generates directional LD, which contributes to the genetic variance, and therefore to heritability. Random-effect estimators of $h^2_{snp}$, because they assume that SNP effects are independent, do not capture this contribution, which can be positive or negative depending on the genetic architecture, leading to under- or over-estimates of $h^2_{snp}$ relative to $h^2$. For the same reason, estimates of local ancestry heritability ($\hat{h}^2_{\gamma}$) are also biased in the presence of directional LD. We describe this bias in $\hat{h}^2_{snp}$ and $\hat{h}^2_{\gamma}$ as a function of admixture history and the genetic architecture of the trait, clarifying their interpretation and implications for genome-wide association studies and polygenic prediction in admixed populations.
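The role of directional LD can be illustrated with a toy calculation. The sketch below is not the authors' derivation or code; it assumes a simplified two-source admixture model with unlinked loci, in which each individual's genotype at locus j is Binomial(2, θ·p1_j + (1 − θ)·p2_j) given their ancestry proportion θ. Under that assumption, loci covary only through the variance in θ, and the genetic variance splits into a genic term (sum of per-locus variances) and an LD term (sum of cross-locus covariances). The function name `variance_components` and all parameter values are hypothetical.

```python
def variance_components(betas, p1, p2, theta_mean, theta_var):
    """Split the genetic variance of g = sum_j beta_j * x_j into two parts.

    Toy model (an assumption, not taken from the paper): x_j given ancestry
    theta is Binomial(2, theta * p1_j + (1 - theta) * p2_j), and loci are
    independent conditional on theta.

    Returns (genic, ld) where
      genic = sum_j beta_j^2 * Var(x_j)
      ld    = sum_{j != k} beta_j * beta_k * Cov(x_j, x_k)
    """
    genic = 0.0
    sum_bd = 0.0    # sum_j beta_j * delta_j
    sum_b2d2 = 0.0  # sum_j beta_j^2 * delta_j^2
    for b, a, c in zip(betas, p1, p2):
        delta = a - c                                   # frequency difference between sources
        p_bar = theta_mean * a + (1 - theta_mean) * c   # mean allele frequency
        # Law of total variance for a Binomial(2, p(theta)) genotype
        var_x = 2 * p_bar * (1 - p_bar) + 2 * delta ** 2 * theta_var
        genic += b ** 2 * var_x
        sum_bd += b * delta
        sum_b2d2 += b ** 2 * delta ** 2
    # Cov(x_j, x_k) = 4 * delta_j * delta_k * Var(theta) for j != k
    ld = 4 * theta_var * (sum_bd ** 2 - sum_b2d2)
    return genic, ld


if __name__ == "__main__":
    m = 10
    p1 = [0.5] * m  # hypothetical allele frequencies in source population 1
    p2 = [0.3] * m  # hypothetical allele frequencies in source population 2
    aligned = [1.0] * m                                           # effects share the sign of delta
    alternating = [1.0 if j % 2 == 0 else -1.0 for j in range(m)]  # effect signs alternate
    for name, betas in [("aligned", aligned), ("alternating", alternating)]:
        genic, ld = variance_components(betas, p1, p2, theta_mean=0.5, theta_var=0.05)
        print(f"{name}: genic={genic:.3f} ld={ld:.3f} total={genic + ld:.3f}")
```

In this toy setting the genic term is identical for both architectures, while the LD term is positive when trait-increasing alleles are all more common in the same source population and negative when their signs alternate. The LD term vanishes when `theta_var` is zero, that is, when there is no variation in ancestry among individuals.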