Detecting Chromosomal Inversions from Dense SNPs by Combining PCA and Association Tests

R. J. Nowling, S. Emrich
Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
Published: 2018-08-15
DOI: 10.1145/3233547.3233571 (https://doi.org/10.1145/3233547.3233571)
Citations: 3

Abstract

Principal Component Analysis (PCA) of dense single nucleotide polymorphism (SNP) data has wide-ranging applications in population genetics, including detection of chromosomal inversions. SNPs associated with each PC can be identified through single-SNP association tests performed between SNP genotypes and PC coordinates; this approach has several advantages over thresholding loading factors or sparse PCA methods. However, insect vector SNP data often have a high proportion of unknown (uncalled) genotypes that cannot be reliably imputed, which prevents the direct use of association tests. Building on our previous work, we propose a novel method for adjusting the association tests to handle these unknown genotypes. We demonstrate the utility of the method through two applications: detecting chromosomal inversions and characterizing differentiation processes captured by PCA. When applied to SNP data from the 2L and 2R chromosome arms of 34 karyotyped Anopheles gambiae and Anopheles coluzzii mosquitoes, our method clearly identifies the 2La, 2Rb, 2Rc, 2Rj, and 2Ru inversions. Using our method to identify SNPs associated with 2L-PC3, we observed one of the two insecticide-resistance variants in the Rdl gene; our results suggest that the PC is capturing differentiation driven by insecticide usage.
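The general idea of testing each SNP against PC coordinates can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say how the association tests are adjusted for unknown genotypes, so the sketch uses two simple stand-ins, labeled as assumptions. Uncalled genotypes are set to the SNP mean for the PCA step only, and each per-SNP test drops the samples with uncalled genotypes. The test itself (a permutation test on the Pearson correlation), the toy data, and all function names are hypothetical choices made for illustration.

```python
import math
import random

# Toy data, SNP-major: GENOTYPES[snp][sample] is 0, 1, 2, or None (uncalled).
# Samples 0-5 and 6-11 stand for two hypothetical karyotypes.
GENOTYPES = [
    [0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2],        # SNPs 0-3 track the karyotype
    [0, 0, None, 0, 0, 0, 2, 2, 2, 2, None, 2],
    [0, 0, 0, 0, 0, 0, 2, 2, 2, None, 2, 2],
    [0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2],
    [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2],        # SNPs 4-6 are unrelated to it
    [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
    [2, 2, 0, 0, 1, 1, 2, 2, 0, 0, 1, None],
]


def pc1_coordinates(genotypes, n_iter=100):
    """Sample coordinates on the first PC (unit length, arbitrary sign).

    Assumption: uncalled genotypes are replaced by the SNP mean, which is
    zero after centering. Power iteration runs on the sample-by-sample
    Gram matrix of the centered data.
    """
    centered = []
    for snp in genotypes:
        called = [g for g in snp if g is not None]
        mean = sum(called) / len(called)
        centered.append([0.0 if g is None else g - mean for g in snp])
    n = len(genotypes[0])
    rng = random.Random(0)
    u = [rng.gauss(0, 1) for _ in range(n)]
    for _ in range(n_iter):
        new = [0.0] * n
        for c in centered:
            dot = sum(ci * ui for ci, ui in zip(c, u))
            for i in range(n):
                new[i] += c[i] * dot
        norm = math.sqrt(sum(x * x for x in new))
        u = [x / norm for x in new]
    return u


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def association_pvalue(snp, pc, n_perm=999, seed=0):
    """Permutation p-value for |correlation| between one SNP and a PC.

    Assumption: samples with an uncalled genotype are dropped for this SNP.
    """
    pairs = [(g, c) for g, c in zip(snp, pc) if g is not None]
    geno = [g for g, _ in pairs]
    coords = [c for _, c in pairs]
    observed = abs(pearson(geno, coords))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(coords)
        # Small tolerance so ties are not lost to floating-point rounding.
        if abs(pearson(geno, coords)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


pc1 = pc1_coordinates(GENOTYPES)
for j, snp in enumerate(GENOTYPES):
    print(f"SNP {j}: p = {association_pvalue(snp, pc1):.3f}")
```

In this toy setup the SNPs that track the karyotype should receive small p-values against PC1 and the unrelated SNPs should not. Real data would also need multiple-testing correction and a more careful treatment of unknown genotypes, which is the problem the paper addresses.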