Interpreting SNP heritability in admixed populations.

Jinguo Huang, Nicole Kleman, Saonli Basu, Mark D Shriver, Arslan A Zaidi
bioRxiv : the preprint server for biology. Publication date: 2025-04-08. DOI: 10.1101/2023.08.04.551959 (https://doi.org/10.1101/2023.08.04.551959). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10418213/pdf/

Abstract

SNP heritability ($h^2_{snp}$) is defined as the proportion of phenotypic variance explained by genotyped SNPs and is believed to be a lower bound of heritability ($h^2$), being equal to it if all causal variants are genotyped. Despite the simple intuition behind $h^2_{snp}$, its interpretation and its equivalence to $h^2$ are unclear, particularly in the presence of admixture and assortative mating. Here we use analytical theory and simulations to describe the behavior of $h^2$ and of three widely used random-effect estimators of $h^2_{snp}$ in admixed populations: genome-wide restricted maximum likelihood (GREML), Haseman-Elston (HE) regression, and LD score regression (LDSC). We show that $h^2_{snp}$ estimates can be biased in admixed populations, even if all causal variants are genotyped and in the absence of confounding due to shared environment. This is largely because admixture generates directional LD, which contributes to the genetic variance and therefore to heritability. Because random-effect estimators of $h^2_{snp}$ assume that SNP effects are independent, they do not capture this contribution, which can be positive or negative depending on the genetic architecture, leading to under- or over-estimates of $h^2_{snp}$ relative to $h^2$. For the same reason, estimates of local ancestry heritability ($\hat{h}^2_{\gamma}$) are also biased in the presence of directional LD. We describe this bias in $\hat{h}^2_{snp}$ and $\hat{h}^2_{\gamma}$ as a function of admixture history and the genetic architecture of the trait, clarifying their interpretation and implications for genome-wide association studies and polygenic prediction in admixed populations.
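The role of directional LD can be illustrated with a toy simulation. The sketch below is not the authors' model or code: the function name `variance_components`, the uniform distribution of ancestry proportions, and the source allele frequencies (0.7 and 0.3 at every locus) are all assumptions chosen for illustration. It splits the variance of the genetic value into a genic part (the sum of per-locus variances weighted by squared effects) and a remainder due to covariance between loci, which here arises only because individuals differ in ancestry.

```python
import numpy as np


def variance_components(beta, n=20000, seed=0):
    """Return (genic variance, LD contribution) for a vector of effects `beta`.

    Hypothetical toy model: two source populations, unlinked loci,
    and individual ancestry proportions drawn uniformly on [0, 1].
    """
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)
    m = len(beta)
    # Assumed allele frequencies in the two source populations.
    p_a, p_b = np.full(m, 0.7), np.full(m, 0.3)
    # Ancestry proportion from population A for each individual.
    theta = rng.uniform(0.0, 1.0, size=n)
    # Expected allele frequency for each individual at each locus.
    freq = theta[:, None] * p_a + (1.0 - theta[:, None]) * p_b
    # Genotypes (0, 1, or 2), drawn independently across loci given ancestry.
    x = rng.binomial(2, freq)
    cov = np.cov(x, rowvar=False)
    # Genic variance: what remains if loci were uncorrelated.
    genic = float(np.sum(beta**2 * np.diag(cov)))
    # Total genetic variance includes covariance between loci.
    total = float(beta @ cov @ beta)
    return genic, total - genic


# Trait-increasing alleles all more common in the same source population.
aligned = np.ones(50)
# Trait-increasing alleles alternate between the two source populations.
opposing = np.tile([1.0, -1.0], 25)

for label, beta in [("aligned", aligned), ("opposing", opposing)]:
    genic, ld = variance_components(beta)
    print(f"{label}: genic = {genic:.2f}, LD contribution = {ld:.2f}")
```

Under these assumptions the covariance between two distinct loci is proportional to the variance in ancestry times the product of their frequency differences, so the LD contribution is positive when effects are aligned with ancestry and negative when they alternate. A model that treats SNP effects as independent accounts only for the genic term.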

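Interpreting SNP heritability in admixed populations.
SNP heritability ($h^2_{snp}$) is defined as the proportion of phenotypic variance explained by genotyped SNPs and is believed to be a lower bound of heritability ($h^2$), being equal to it if all causal variants are known. Despite the simple intuition behind $h^2_{snp}$, its interpretation and its equivalence to $h^2$ are unclear, particularly in the presence of population structure and assortative mating. Population structure is well known to inflate estimates of $\hat{h}^2_{snp}$. Here we use analytical theory and simulations to show that, in admixed populations, estimates of $h^2_{snp}$ are not guaranteed to equal $h^2$, even in the absence of confounding and even if the causal variants are known. We interpret this discrepancy as arising not from bias in the estimates, but from the fact that the estimand itself, as defined under the random-effects model, may not equal $h^2$. The model assumes that SNP effects are uncorrelated, which may not hold even for unlinked loci in admixed and structured populations, leading to over- or under-estimates of $h^2_{snp}$ relative to $h^2$. For the same reason, local ancestry heritability ($h^2_{\gamma}$) may also not equal the variance explained by local ancestry in admixed populations. We describe the quantitative behavior of $h^2_{snp}$ and $h^2_{\gamma}$ as a function of admixture history and the genetic architecture of the trait, and discuss the implications for genome-wide association and polygenic prediction.
This text was machine-translated; where it differs, the English original prevails.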