Aggregating SNPs Improves Filtering for False Positive Associations Post-Imputation.

IF 2.1 · Zone 3 (Biology) · JCR Q3 · GENETICS & HEREDITY
Katharina Stahl, Sergi Papiol, Monika Budde, Maria Heilbronner, Mojtaba Oraki Kohshour, Peter Falkai, Thomas G Schulze, Urs Heilbronner, Heike Bickeböller
Journal Article, published 2025-03-07. DOI: 10.1093/g3journal/jkaf043 (https://doi.org/10.1093/g3journal/jkaf043)
Citations: 0

Abstract

Imputation causes bias in P-values in downstream genome-wide association studies. Imputation quality measures such as IMPUTE info are used to discriminate between false and true associations. However, applying a high threshold often discards true associations, while a low threshold preserves false associations. This poses a challenge, especially for studies genotyped with SNP arrays. In practice, association signals register as spikes of low P-values for SNPs in close proximity owing to linkage disequilibrium, but post-imputation filtering is conducted on SNPs independently. We simulated 1536 small case-control studies on human chromosome 19 both to quantify the introduced bias and to evaluate post-imputation filtering. The established IMPUTE info thresholds 0.3 and 0.8 were compared on individual SNPs and aggregated spikes in the formats 'best guess genotype' and 'dosage'. Furthermore, we applied two recently published methods, Iam hiQ and MagicalRsq, to assess their effect on filtering. We found differences in false signals and imputation quality between the genotype formats, especially in the midrange between thresholds. In this midrange, 51% and 60% of associated SNPs for best guess and dosage format, respectively, are true associations. For aggregated SNPs, the majority of spikes in the midrange are true associations. We propose a new method, the Midrange Filter, which uses both thresholds and formats to classify spikes instead of SNPs. This method discards up to the same number of false signals as the upper threshold, while preserving all true associations in most simulation settings. The PsyCourse study is included as a real data application.
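
The abstract does not spell out how spikes are built or how the Midrange Filter decides, so the following is only an illustrative sketch of the general idea of filtering on aggregated spikes rather than on single SNPs. It is not the authors' method. The `Snp` record, the function names, the significance cutoff `p_cutoff`, the grouping distance `max_gap`, and the rule of classifying a spike by its best info score are all assumptions made for illustration; only the thresholds 0.3 and 0.8 come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    pos: int        # base-pair position on the chromosome
    p_value: float  # association P-value
    info: float     # imputation quality score (e.g. IMPUTE info)


def aggregate_spikes(snps, p_cutoff=5e-8, max_gap=250_000):
    """Group significant SNPs into spikes (assumed rule: neighbours within max_gap bp)."""
    hits = sorted((s for s in snps if s.p_value <= p_cutoff), key=lambda s: s.pos)
    spikes = []
    for snp in hits:
        if spikes and snp.pos - spikes[-1][-1].pos <= max_gap:
            spikes[-1].append(snp)   # close to the previous hit: same spike
        else:
            spikes.append([snp])     # too far away: start a new spike
    return spikes


def classify_spike(spike, low=0.3, high=0.8):
    """Label a spike by its best info score (assumed rule, not from the paper)."""
    best = max(s.info for s in spike)
    if best >= high:
        return "keep"
    if best < low:
        return "discard"
    return "midrange"  # between the thresholds: needs further evidence


# Hypothetical toy data, not taken from the study.
snps = [
    Snp(1_000, 1e-9, 0.5),
    Snp(1_500, 1e-8, 0.9),
    Snp(900_000, 1e-9, 0.2),
    Snp(2_000_000, 1e-9, 0.6),
    Snp(2_000_100, 0.5, 0.95),  # not significant, so it joins no spike
]
labels = [classify_spike(spike) for spike in aggregate_spikes(snps)]
print(labels)
```

In this sketch a well-imputed SNP can rescue poorly imputed neighbours in the same spike, which is the kind of effect that per-SNP filtering cannot capture. How the published method treats midrange spikes, and how it combines the best guess and dosage formats, is described in the paper itself.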

Source journal: G3: Genes|Genomes|Genetics (GENETICS & HEREDITY)
CiteScore: 5.10
Self-citation rate: 3.80%
Articles published: 305
Review time: 3-8 weeks