Improving precision and accuracy of genetic mapping with genotyping-by-sequencing data in outcrossing species

Impact factor 5.9; Region 3 (Engineering Technology); JCR Q1 (AGRONOMY)
Nicholas R. LaBonte, Dessireé P. Zerpa-Catanho, Siyao Liu, Liang Xiao, Hongxu Dong, Lindsay V. Clark, Erik J. Sacks
Journal: Global Change Biology Bioenergy, volume 16, issue 7
Published: 2024-06-05
Publication type: Journal Article
DOI: 10.1111/gcbb.13167
Article page: https://onlinelibrary.wiley.com/doi/10.1111/gcbb.13167
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/gcbb.13167
Citations: 0

Abstract

Genotyping-by-sequencing (GBS) is a widely used strategy for obtaining large numbers of genetic markers in model and non-model organisms. In crop plants, GBS-derived marker datasets are frequently used to perform quantitative trait locus (QTL) mapping. In some plant species, however, high heterozygosity and complex genome structure mean that researchers must use care in handling GBS data to conduct QTL mapping most effectively. Such outbred crops include most of the perennial grass and tree species used for bioenergy. To identify strategies for increasing accuracy and precision of QTL mapping using GBS data in outbred crops, we conducted an empirical study of SNP-calling and genetic map-building pipeline parameters in a Miscanthus sinensis population, and a complementary simulation study to estimate the relationship between genome-wide error rate, read depth, and marker number. The bioenergy grass Miscanthus is an obligate outcrossing species with a recent (diploidized) whole-genome duplication. For the study of empirical M. sinensis data, we compared two SNP-calling methods (one non-reference-based and one reference-based), a series of depth filters (12×, 20×, 30×, and 40×) and two map-construction methods (i.e., marker ordering: linkage-only and order-corrected based on a reference genome). We found that correcting the order of markers on a linkage map by using a high-quality reference genome improved QTL precision (shorter confidence intervals). For typical GBS datasets of between 1000 and 5000 markers to build a genetic map for biparental populations, a depth filter set at 30× to 40× applied to outbred populations provided a genome-wide genotype-calling error rate of less than 1%, improved accuracy of QTL point estimates and minimized type I errors for identifying QTL. Based on these results, we recommend using a reference genome to correct the marker order of genetic maps and a robust genotype depth filter to improve QTL mapping for outbred crops.
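
The link the abstract draws between read depth and genotype-calling error can be illustrated with a deliberately naive sketch. It is not the authors' simulation: it assumes each read samples either allele of a heterozygote with probability 0.5, ignores sequencing error, allelic bias and paralogous loci, and uses a hypothetical rule that a heterozygous call needs at least `min_minor_reads` reads of the rarer allele. The function names and the threshold of 3 reads are illustrative assumptions, so the numbers show only the direction of the effect, not the error rates reported in the paper.

```python
from math import comb


def het_miscall_probability(depth: int, min_minor_reads: int = 1) -> float:
    """Chance that a true heterozygote looks homozygous at a given read depth.

    Assumes each read samples either allele with probability 0.5 and that a
    heterozygous call needs at least `min_minor_reads` reads of the rarer
    allele (both are illustrative assumptions, not values from the paper).
    """
    if depth < 0 or min_minor_reads < 1:
        raise ValueError("depth must be >= 0 and min_minor_reads >= 1")
    # Count the read splits (x reads of allele A, depth - x of allele B)
    # in which the rarer allele falls short of the required support.
    failing = sum(
        comb(depth, x)
        for x in range(depth + 1)
        if min(x, depth - x) < min_minor_reads
    )
    return failing / 2 ** depth


def apply_depth_filter(calls, min_depth):
    """Set genotype calls backed by fewer than `min_depth` reads to None (missing)."""
    return {
        sample: (genotype if depth >= min_depth else None)
        for sample, (genotype, depth) in calls.items()
    }


# Depth thresholds compared in the abstract.
for depth in (12, 20, 30, 40):
    p = het_miscall_probability(depth, min_minor_reads=3)
    print(f"{depth}x: P(heterozygote miscalled) = {p:.2e}")
```

Under these assumptions the miscall probability falls steeply as the depth filter is raised, at the cost of more calls being set to missing by `apply_depth_filter`. Real pipelines face additional error sources, which is one reason an empirical and simulation study such as the one summarized above is needed to pick a threshold.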

Source journal
Global Change Biology Bioenergy (AGRONOMY; ENERGY & FUELS)
CiteScore: 10.30
Self-citation rate: 7.10%
Articles published: 96
Review time: 1.5 months
About the journal: GCB Bioenergy is an international journal publishing original research papers, review articles and commentaries that promote understanding of the interface between biological and environmental sciences and the production of fuels directly from plants, algae and waste. The scope of the journal extends beyond biology to policy forums, socioeconomic analyses, technoeconomic analyses and systems analysis. Papers do not need a global change component to be considered for publication; it is viewed as implicit that most bioenergy will be beneficial in avoiding at least a part of the fossil fuel energy that would otherwise be used. Key areas covered by the journal:
Bioenergy feedstock and bio-oil production: energy crops and algae, their management, genomics, genetic improvements, planting, harvesting, storage, transportation, integrated logistics, production modeling, composition and its modification, and pests, diseases and weeds of feedstocks. Manuscripts concerning alternative energy based on biological mimicry are also encouraged (e.g. artificial photosynthesis).
Biological residues/co-products: from agricultural production, forestry and plantations (stover, sugar, bio-plastics, etc.), algae processing industries, and municipal sources (MSW).
Bioenergy and the environment: ecosystem services, carbon mitigation, land use change, life cycle assessment, energy and greenhouse gas balances, water use, water quality, assessment of sustainability, and biodiversity issues.
Bioenergy socioeconomics: examining the economic viability or social acceptability of crops, crop systems and their processing, including genetically modified organisms (GMOs) and health impacts of bioenergy systems.
Bioenergy policy: legislative developments affecting biofuels and bioenergy.
Bioenergy systems analysis: examining biological developments in a whole systems context.