Efficient identification of trait-associated loss-of-function variants in the UK Biobank cohort by exome-sequencing based genotype imputation
Authors: Wen-Yuan Yu, Shan-Shan Yan, Shu-Han Zhang, Jing-Jing Ni, Bin-Li, Yu-Fang Pei, Lei Zhang
Journal: Genetic Epidemiology, volume 47, issue 2, pages 121-134. Published 2022-12-09.
DOI: 10.1002/gepi.22511 (https://onlinelibrary.wiley.com/doi/10.1002/gepi.22511)
The large-scale, open-access whole-exome sequencing (WES) data of ~200,000 UK Biobank (UKB) participants are accelerating a new wave of genetic association studies that aim to identify rare, functional loss-of-function (LoF) variants associated with complex traits and diseases. We proposed merging the WES genotypes and the genome-wide genotyping (GWAS) genotypes of 167,000 UKB homogeneous European participants into a combined reference panel, and then imputing 241,911 UKB homogeneous European participants who had GWAS genotypes only. We then used the imputed data to replicate associations identified in the discovery WES sample. The average imputation accuracy measure r² is modest to high for LoF variants at all minor allele frequency (MAF) intervals: 0.942 at MAF interval (0.01, 0.5), 0.807 at (1.0 × 10⁻³, 0.01), 0.805 at (1.0 × 10⁻⁴, 1.0 × 10⁻³), 0.664 at (1.0 × 10⁻⁵, 1.0 × 10⁻⁴), and 0.410 at (0, 1.0 × 10⁻⁵). As applications, we studied associations of LoF variants with estimated heel bone mineral density (BMD) and four lipid traits. In addition to replicating dozens of previously reported genes, we identified three novel associations: two genes, PLIN1 and ANGPTL3, for high-density lipoprotein cholesterol, and one gene, PDE3B, for triglycerides. Our results highlight the strength of WES-based genotype imputation and provide useful imputed data within the UKB cohort.
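The per-interval accuracy summary in the abstract can be illustrated with a minimal sketch. It assumes that r² is the squared Pearson correlation between sequenced genotypes and imputed dosages (a common definition, but the abstract does not state which one the authors used) and that each MAF interval is half-open, (low, high]. The function names and the input layout are hypothetical and are not taken from the paper.

```python
from statistics import mean

# MAF interval edges as listed in the abstract. Treating each interval as
# (low, high] is an assumption; the abstract does not say which ends are closed.
MAF_EDGES = [0.0, 1e-5, 1e-4, 1e-3, 1e-2, 0.5]


def squared_correlation(truth, dosage):
    """Squared Pearson correlation between true genotypes and imputed dosages.

    This is one common imputation-accuracy measure, assumed here for illustration.
    """
    mt, md = mean(truth), mean(dosage)
    cov = sum((t - mt) * (d - md) for t, d in zip(truth, dosage))
    var_t = sum((t - mt) ** 2 for t in truth)
    var_d = sum((d - md) ** 2 for d in dosage)
    if var_t == 0 or var_d == 0:
        # Correlation is undefined without variation; 0.0 is a convention chosen here.
        return 0.0
    return cov * cov / (var_t * var_d)


def maf_bin(maf):
    """Return the (low, high] interval containing maf, or None if outside (0, 0.5]."""
    for low, high in zip(MAF_EDGES, MAF_EDGES[1:]):
        if low < maf <= high:
            return (low, high)
    return None


def mean_r2_by_bin(variants):
    """Average r2 per MAF interval for an iterable of (maf, r2) pairs."""
    grouped = {}
    for maf, r2 in variants:
        key = maf_bin(maf)
        if key is not None:
            grouped.setdefault(key, []).append(r2)
    return {key: mean(values) for key, values in grouped.items()}
```

Given per-variant `(maf, r2)` pairs, `mean_r2_by_bin` returns one average per interval, which is the shape of the summary reported in the abstract. The figures reported there come from the authors' own pipeline, not from this sketch.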
About the journal:
Genetic Epidemiology is a peer-reviewed journal for discussion of research on the genetic causes of the distribution of human traits in families and populations. Emphasis is placed on the relative contribution of genetic and environmental factors to human disease as revealed by genetic, epidemiological, and biologic investigations.
Genetic Epidemiology primarily publishes papers in statistical genetics, a research field that is primarily concerned with development of statistical, bioinformatical, and computational models for analyzing genetic data. Incorporation of underlying biology and population genetics into conceptual models is favored. The Journal seeks original articles comprising either applied research or innovative statistical, mathematical, computational, or genomic methodologies that advance studies in genetic epidemiology. Other types of reports are encouraged, such as letters to the editor, topic reviews, and perspectives from other fields of research that will likely enrich the field of genetic epidemiology.