Assessing Genotype Imputation Methods for Low-Coverage Sequencing Data in Populations With Differing Relatedness and Inbreeding Levels.

IF 5.5; Zone 1 (Biology); JCR Q1 (BIOCHEMISTRY & MOLECULAR BIOLOGY)
Tram Vi, Katarina C Stuart, Hui Zhen Tan, Audald Lloret-Villas, Anna W Santure
Molecular Ecology Resources, e70049. Published 2025-09-29. Journal Article. DOI: 10.1111/1755-0998.70049 (https://doi.org/10.1111/1755-0998.70049)

Abstract

Low-coverage sequencing (LCS) followed by genotype imputation has become a cost-efficient approach for obtaining whole-genome SNPs. Several imputation methods for LCS data have been developed over the last decade. However, their accuracy in inferring missing genotypes and their effectiveness for downstream analyses such as population genetics have not been comprehensively compared. In the present study, we assessed the imputation performance of five tools (GLIMPSE2, GeneImp, QUILT2, STITCH and Beagle5.4) using populations simulated with SLiM4 that represent different levels of genetic relatedness and inbreeding. Imputation accuracy was calculated at the variant, haplotype and sample levels. We then evaluated how effectively imputed genotypes recover genetic structure, relatedness, inbreeding coefficients and demographic history. The imputation accuracy of the methods was further tested in a real population of 283 hihi (stitchbird) samples. Our results suggest that all tested methods are highly accurate in populations with high levels of genetic relatedness. However, in populations with low relatedness, imputation accuracy differed among tools and affected the results of some downstream analyses. The simulation and imputation pipeline presented here can help determine the most suitable imputation method for different population scenarios.
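
The abstract does not state which accuracy metrics were used, so the following is only a minimal sketch of what scoring imputation "at the variant and sample levels" could look like. It assumes two commonly used measures, exact genotype concordance and the squared Pearson correlation between true and imputed allele counts. The function names and the toy genotype matrices are hypothetical and are not taken from the paper or from any of the tools it evaluates.

```python
from statistics import mean


def pearson_r2(x, y):
    """Squared Pearson correlation; None when either vector is constant."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # correlation is undefined without variation
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def concordance(x, y):
    """Fraction of genotype calls that match exactly."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


# Hypothetical toy data: rows are variants, columns are samples,
# entries are alternate-allele counts (0, 1 or 2).
truth = [
    [0, 1, 2, 1],
    [2, 2, 1, 0],
    [0, 0, 1, 0],
]
imputed = [
    [0, 1, 2, 1],
    [2, 1, 1, 0],
    [0, 0, 0, 0],
]

# Variant level: compare each variant's genotypes across all samples.
variant_r2 = [pearson_r2(t, i) for t, i in zip(truth, imputed)]
variant_conc = [concordance(t, i) for t, i in zip(truth, imputed)]

# Sample level: transpose so each row is one sample across all variants.
sample_conc = [concordance(t, i) for t, i in zip(zip(*truth), zip(*imputed))]

print("per-variant r2:", variant_r2)
print("per-variant concordance:", variant_conc)
print("per-sample concordance:", sample_conc)
```

In this sketch the third variant has no r² because every imputed genotype is 0. This mirrors a practical issue with rare variants, where correlation-based metrics are undefined and concordance alone can look deceptively high.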

Source journal
Molecular Ecology Resources (Biology: Evolutionary Biology)
CiteScore: 15.60
Self-citation rate: 5.20%
Articles published: 170
Review time: 3 months
About the journal: Molecular Ecology Resources promotes the creation of comprehensive resources for the scientific community, encompassing computer programs, statistical and molecular advancements, and a diverse array of molecular tools. Serving as a conduit for disseminating these resources, the journal targets a broad audience of researchers in the fields of evolution, ecology, and conservation. Articles in Molecular Ecology Resources are crafted to support investigations tackling significant questions within these disciplines. In addition to original resource articles, Molecular Ecology Resources features Reviews, Opinions, and Comments relevant to the field. The journal also periodically releases Special Issues focusing on resource development within specific areas.