A. Kraja, E. W. Daw, P. Lenzini, Lihua Wang, Shiow J. Lin, Christine A. Williams, Alan B. Wells, K. Lunetta, J. Murabito, P. Sebastiani, G. Tosto, S. Barral, R. Minster, A. Yashin, T. Perls, M. Province
A comparison of genetic imputation methods using Long Life Family Study genotypes and sequence data with the 1000 Genome reference panel
Int. J. Bioinform. Res. Appl., published 2020-02-02
DOI: https://doi.org/10.1504/ijbra.2020.10026541
Abstract
This study compares methods of imputing genetic markers, given a typed GWAS scaffold from the Long Life Family Study (LLFS) and the latest 1000 Genomes reference panel. We examined two programs for pre-phasing haplotypes (MACH and SHAPEIT2) and two for imputation (MINIMAC and IMPUTE2). SHAPEIT2 was advantageous for haplotype pre-phasing, while MINIMAC and IMPUTE2 produced similar imputation quality. The comparison used a 4 Mb region on chromosome 2 of LLFS; in the Supplement, we also compared methods using chromosome 19 data from the Genetic Analysis Workshop 19. IMPUTE2 had the advantage of using two reference panels: 1000 Genomes (1000G) and sequence data for a subset of subjects. SHAPEIT2 and IMPUTE2 were used to finalise the full LLFS autosome imputation. In LLFS, 44% of ~80M imputed autosomal variants showed good imputation quality (info ≥ 0.30). Low imputation quality was associated predominantly with low allele frequency in 1000 Genomes. Emerging large-scale sequence data and enhanced imputation methodologies will further improve imputation quality.
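The quality figures quoted in the abstract (the share of variants with info ≥ 0.30, and the link between low quality and low allele frequency) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the allele-frequency bin edges, and the toy (allele frequency, info) pairs are hypothetical and are not taken from the paper or from any imputation program's output format. Only the 0.30 threshold comes from the abstract.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence, Tuple

INFO_THRESHOLD = 0.30  # cut-off quoted in the abstract (info >= 0.30)

# Hypothetical toy data: (minor allele frequency, info score) per variant.
TOY_VARIANTS: List[Tuple[float, float]] = [
    (0.002, 0.10), (0.004, 0.25), (0.008, 0.45),
    (0.03, 0.60), (0.04, 0.28),
    (0.20, 0.95), (0.35, 0.98), (0.45, 0.99),
]


def fraction_passing(info_scores: Sequence[float],
                     threshold: float = INFO_THRESHOLD) -> float:
    """Share of variants whose info score is at or above the threshold."""
    if not info_scores:
        raise ValueError("no variants supplied")
    passing = sum(1 for score in info_scores if score >= threshold)
    return passing / len(info_scores)


def pass_fraction_by_maf_bin(
    variants: Sequence[Tuple[float, float]],
    bin_edges: Sequence[float] = (0.0, 0.01, 0.05, 0.5),  # assumed bins
    threshold: float = INFO_THRESHOLD,
) -> List[Optional[float]]:
    """Passing share per allele-frequency bin; None for an empty bin.

    Bins are half-open [edge_i, edge_{i+1}), except that the last bin
    also includes its upper edge.
    """
    n_bins = len(bin_edges) - 1
    totals = [0] * n_bins
    passed = [0] * n_bins
    for maf, info in variants:
        if not bin_edges[0] <= maf <= bin_edges[-1]:
            raise ValueError(f"allele frequency {maf} outside bin range")
        # Clamp so a value equal to the top edge falls in the last bin.
        idx = min(bisect_right(bin_edges, maf) - 1, n_bins - 1)
        totals[idx] += 1
        if info >= threshold:
            passed[idx] += 1
    return [p / t if t else None for p, t in zip(passed, totals)]


if __name__ == "__main__":
    infos = [info for _, info in TOY_VARIANTS]
    print("overall passing share:", fraction_passing(infos))
    print("passing share by frequency bin:", pass_fraction_by_maf_bin(TOY_VARIANTS))
```

On the toy data the passing share rises from the rarest bin to the most common one, which mirrors the qualitative pattern the abstract describes; the numbers themselves carry no meaning beyond the illustration.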