Recovering High-Quality Host Genomes from Gut Metagenomic Data through Genotype Imputation

Sofia Marcos, Melanie Parejo, Andone Estonba, Antton Alberdi
Journal: Advanced genetics (Hoboken, N.J.), 3(3)
Publication type: Journal Article
Published: 2022-05-06
DOI: 10.1002/ggn2.202100065
Publisher page: https://onlinelibrary.wiley.com/doi/10.1002/ggn2.202100065
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9744478/pdf/
Citation count: 4

Abstract

Metagenomic datasets of host-associated microbial communities often contain host DNA, which is usually discarded because the amount of data is too low for accurate host genetic analyses. However, genotype imputation can reconstruct host genotypes if a reference panel is available. Here, a two-step imputation strategy is tested with four types of reference panels, each built using a different strategy, on low-depth host genome data (≈2× coverage) recovered from intestinal samples of two chicken genetic lines. First, imputation accuracy is evaluated in 12 samples for which both low- and high-depth sequencing data are available; all tested panels yield high imputation accuracy (>0.90). Second, the impact of reference panel choice on population genetics statistics is assessed in 100 chickens, with all four panels yielding comparable results. In light of these observations, the feasibility and application of the imputation strategy are discussed for different species with regard to host DNA proportion, genomic diversity, and availability of a reference panel. This method makes it possible to leverage previously discarded host DNA to gain insight into the genetic structure of host populations and, in doing so, facilitates the implementation of hologenomic approaches that jointly analyze host and microbial genomic data.
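
The abstract does not state how imputation accuracy is computed. As a minimal sketch of one common choice, the Python snippet below scores genotype concordance: the fraction of sites at which a genotype imputed from low-depth data matches the genotype called from high-depth data for the same sample. The 0/1/2 alternate-allele encoding, the function name `genotype_concordance`, and the toy genotype lists are illustrative assumptions, not taken from the study.

```python
from typing import Optional, Sequence

Genotype = Optional[int]  # assumed encoding: 0, 1 or 2 alternate alleles; None = missing


def genotype_concordance(imputed: Sequence[Genotype],
                         truth: Sequence[Genotype]) -> float:
    """Return the fraction of comparable sites where imputed == truth.

    Sites where either genotype is missing (None) are skipped.
    Hypothetical helper for illustration; not the authors' code.
    """
    if len(imputed) != len(truth):
        raise ValueError("genotype vectors must have the same length")
    # Keep only sites genotyped in both the imputed and the high-depth call set.
    pairs = [(i, t) for i, t in zip(imputed, truth)
             if i is not None and t is not None]
    if not pairs:
        raise ValueError("no comparable sites")
    matches = sum(1 for i, t in pairs if i == t)
    return matches / len(pairs)


# Toy data (invented): one sample, ten sites.
imputed = [0, 1, 2, 1, 0, 2, 1, 0, None, 2]   # imputed from low-depth reads
truth = [0, 1, 2, 1, 0, 2, 2, 0, 1, 2]        # called from high-depth reads

print(f"concordance = {genotype_concordance(imputed, truth):.3f}")
```

In this toy example nine sites are comparable and eight agree. Other accuracy measures, such as the squared correlation between imputed dosages and true genotypes, are also widely used and may behave differently at rare variants.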

