High Imputation Accuracy Can Be Achieved Using a Small Reference Panel in a Natural Population With Low Genetic Diversity

IF 5.5 · Zone 1 (Biology) · JCR Q1 (Biochemistry & Molecular Biology)
Hui Zhen Tan, Katarina C Stuart, Tram Vi, Annabel Whibley, Sarah Bailey, Patricia Brekke, Anna W Santure
Journal: Molecular Ecology Resources, article e70024
Published: 2025-08-12
DOI: 10.1111/1755-0998.70024 (https://doi.org/10.1111/1755-0998.70024)
Citations: 0

Abstract

Genotype imputation, the inference of missing genotypes using a reference set of population haplotypes, is a cost-effective tool for improving the quality and quantity of genetic datasets. Imputation is usually applied to large and well-characterised datasets of humans and livestock, even though it could also benefit smaller natural populations. This study aims to understand the best practices and effectiveness of imputation with a small reference panel for species with low genetic diversity, using a case study of a population of the hihi/stitchbird (Notiomystis cincta). We used a leave-one-out method to test imputation on 30 high-coverage hihi individuals where SNPs were masked before being imputed with Beagle v5.4. Imputation accuracy was measured using r², the correlation between imputed and ground truth genotype dosages. We tested combinations of five imputation parameters, the inclusion of two linkage maps, reference panels of different sizes and compositions and targets of various SNP densities and sporadic missingness. We achieved mean r² exceeding 0.95 in most tests from a small reference panel of high-fecundity individuals. Imputation accuracy was not improved by including a linkage map and decreased at very low SNP densities. Imputed SNPs were filtered using r² to assess downstream heterozygosity calculations, the site frequency spectrum (SFS) and inference of runs of homozygosity (ROHs). We found that filtering and SNP density greatly affected heterozygosity and SFS at low SNP densities but that ROH inference was relatively robust to both. We provide a template for testing and optimising imputation in other wild populations.
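
The accuracy measure described in the abstract can be illustrated with a short sketch. It assumes the measure is the squared Pearson correlation between true genotype dosages (0, 1 or 2 copies of the alternate allele) and imputed dosages at masked sites; the abstract does not give the exact formula, so that reading is an assumption. The function name `dosage_r2` and the toy values are hypothetical and are not taken from the study or from Beagle.

```python
import numpy as np


def dosage_r2(true_dosage, imputed_dosage):
    """Squared Pearson correlation between true and imputed dosages.

    Hypothetical helper for illustration only. Pairs where either
    value is NaN (for example, sites that were never masked) are dropped.
    """
    t = np.asarray(true_dosage, dtype=float)
    d = np.asarray(imputed_dosage, dtype=float)

    # Keep only positions where both values are present.
    keep = ~(np.isnan(t) | np.isnan(d))
    t, d = t[keep], d[keep]

    # The correlation is undefined with fewer than two points
    # or when either vector has no variance (e.g. a monomorphic site).
    if t.size < 2 or t.std() == 0 or d.std() == 0:
        return float("nan")

    r = np.corrcoef(t, d)[0, 1]
    return float(r * r)


# Toy example: five masked genotypes and their imputed dosages.
truth = [0, 1, 2, 1, 0]
imputed = [0.1, 0.9, 1.8, 1.2, 0.0]
accuracy = dosage_r2(truth, imputed)
print(f"r^2 = {accuracy:.3f}")
```

In a leave-one-out design like the one described, a value of this kind would be computed for each held-out individual (or each site) and then averaged. Returning NaN for zero-variance input keeps monomorphic sites from being silently scored as perfect or as failures.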

Source journal
Molecular Ecology Resources (Biology: Evolutionary Biology)
CiteScore: 15.60
Self-citation rate: 5.20%
Articles published: 170
Review time: 3 months
About the journal: Molecular Ecology Resources promotes the creation of comprehensive resources for the scientific community, encompassing computer programs, statistical and molecular advancements, and a diverse array of molecular tools. Serving as a conduit for disseminating these resources, the journal targets a broad audience of researchers in the fields of evolution, ecology, and conservation. Articles in Molecular Ecology Resources are crafted to support investigations tackling significant questions within these disciplines. In addition to original resource articles, Molecular Ecology Resources features Reviews, Opinions, and Comments relevant to the field. The journal also periodically releases Special Issues focusing on resource development within specific areas.