Title: Detecting Chromosomal Inversions from Dense SNPs by Combining PCA and Association Tests
Authors: R. J. Nowling, S. Emrich
Published in: Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
Publication date: 2018-08-15
DOI: https://doi.org/10.1145/3233547.3233571
Citations: 3
Abstract
Principal Component Analysis (PCA) of dense single nucleotide polymorphism (SNP) data has wide-ranging applications in population genetics, including the detection of chromosomal inversions. SNPs associated with each principal component (PC) can be identified through single-SNP association tests performed between SNP genotypes and PC coordinates; this approach has several advantages over thresholding loading factors or using sparse PCA methods. Insect vector SNP data, however, often contain a high proportion of unknown (uncalled) genotypes that cannot be reliably imputed and that prevent the direct use of association tests. Building on our previous work, we propose a novel method for adjusting the association tests to handle these unknown genotypes. We demonstrate the utility of the method through two applications: detecting chromosomal inversions and characterizing differentiation processes captured by PCA. When applied to SNP data from the 2L and 2R chromosome arms of 34 karyotyped Anopheles gambiae and Anopheles coluzzii mosquitoes, our method clearly identifies the 2La, 2Rb, 2Rc, 2Rj, and 2Ru inversions. Using our method to identify SNPs associated with 2L-PC3, we observed one of the two insecticide-resistance variants in the Rdl gene; our results suggest that this PC is capturing differentiation driven by insecticide usage.
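The general workflow described above (project samples onto PCs, then test each SNP against the PC coordinates) can be illustrated with a minimal sketch. It is not the adjusted association test proposed in the paper, whose details are not given in this abstract. As a simple stand-in, it assumes genotypes are coded as 0/1/2 allele counts with NaN for uncalled genotypes, fills unknowns with the per-SNP mean for the PCA step only, and scores each SNP by the squared Pearson correlation computed on called samples. The function names and toy data are hypothetical.

```python
import numpy as np


def pca_coordinates(genotypes, n_components=1):
    """Project samples onto the top PCs of a (samples x SNPs) genotype matrix.

    Assumption of this sketch: uncalled genotypes (NaN) are replaced by the
    per-SNP mean of the called genotypes, for the PCA step only.
    """
    col_means = np.nanmean(genotypes, axis=0)
    filled = np.where(np.isnan(genotypes), col_means, genotypes)
    centered = filled - filled.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def snp_pc_association(genotypes, pc):
    """Squared Pearson correlation between each SNP and one PC.

    Samples with an uncalled genotype at a SNP are skipped for that SNP.
    This is a placeholder for a formal association test.
    """
    r2 = np.zeros(genotypes.shape[1])
    for j in range(genotypes.shape[1]):
        called = ~np.isnan(genotypes[:, j])
        g = genotypes[called, j]
        y = pc[called]
        # Skip SNPs with too few calls or no variation among called samples.
        if called.sum() < 3 or g.std() == 0 or y.std() == 0:
            continue
        r2[j] = np.corrcoef(g, y)[0, 1] ** 2
    return r2


# Toy data: 8 samples in two hypothetical karyotype groups.
group = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
linked = np.column_stack([2 * group] * 4)       # SNPs that track the grouping
unlinked = np.column_stack([
    [0, 2, 0, 2, 0, 2, 0, 2],                   # balanced within each group
    [0, 0, 2, 2, 0, 0, 2, 2],
]).astype(float)
genotypes = np.hstack([linked, unlinked])
genotypes[0, 0] = np.nan                        # one uncalled linked genotype
genotypes[5, 4] = np.nan                        # one uncalled unlinked genotype

coords = pca_coordinates(genotypes, n_components=1)
r2 = snp_pc_association(genotypes, coords[:, 0])
ranked = np.argsort(r2)[::-1]                   # SNP indices, strongest first
print(ranked, np.round(r2, 3))
```

In this toy setup the four SNPs that track the grouping should rank above the two balanced SNPs for PC1. On real data, a contiguous run of strongly associated SNPs along a chromosome arm is the kind of signal the abstract attributes to an inversion.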