Yi Jiang, Minghan Qu, Minghui Jiang, Xuan Jiang, Shane Fernandez, T. Porter, Simon M. Laws, Colin L. Masters, Huan Guo, S.-M. Cheng, Chao Wang
{"title":"MethylGenotyper: Accurate Estimation of SNP Genotypes and Genetic Relatedness from DNA Methylation Data","authors":"Yi Jiang, Minghan Qu, Minghui Jiang, Xuan Jiang, Shane Fernandez, T. Porter, Simon M. Laws, Colin L. Masters, Huan Guo, S.-M. Cheng, Chao Wang","doi":"10.1093/gpbjnl/qzae044","DOIUrl":null,"url":null,"abstract":"\n Epigenome-wide association studies (EWAS) are susceptible to widespread confounding caused by population structure and genetic relatedness. Nevertheless, kinship estimation is challenging in EWAS without genotyping data. We proposed MethylGenotyper, a method that for the first time enables accurate genotyping at thousands of single nucleotide polymorphisms (SNPs) directly from commercial DNA methylation microarrays. We modeled the intensities of methylation probes near SNPs with a mixture of three beta distributions corresponding to different genotypes and estimated parameters with an expectation-maximization algorithm. We conducted extensive simulations to demonstrate the performance of the method. When applying MethylGenotyper to Infinium EPIC array data of 4662 Chinese, we obtained genotypes at 4319 SNPs with a concordance rate of 98.26%, enabling the identification of 255 pairs of close relatedness. Furthermore, we showed that MethylGenotyper allows for the estimation of both population structure and cryptic relatedness among 702 Australians of diverse ancestry. We have implemented MethylGenotyper in a publicly available R package (https://github.com/Yi-Jiang/MethylGenotyper) to facilitate future large-scale EWAS.","PeriodicalId":170516,"journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":"101 3","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Genomics, Proteomics & Bioinformatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/gpbjnl/qzae044","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Epigenome-wide association studies (EWAS) are susceptible to widespread confounding caused by population structure and genetic relatedness. Nevertheless, kinship estimation is challenging in EWAS without genotyping data. We proposed MethylGenotyper, a method that for the first time enables accurate genotyping at thousands of single nucleotide polymorphisms (SNPs) directly from commercial DNA methylation microarrays. We modeled the intensities of methylation probes near SNPs with a mixture of three beta distributions corresponding to different genotypes and estimated parameters with an expectation-maximization algorithm. We conducted extensive simulations to demonstrate the performance of the method. When applying MethylGenotyper to Infinium EPIC array data of 4662 Chinese, we obtained genotypes at 4319 SNPs with a concordance rate of 98.26%, enabling the identification of 255 pairs of close relatedness. Furthermore, we showed that MethylGenotyper allows for the estimation of both population structure and cryptic relatedness among 702 Australians of diverse ancestry. We have implemented MethylGenotyper in a publicly available R package (https://github.com/Yi-Jiang/MethylGenotyper) to facilitate future large-scale EWAS.