Nicholas R. LaBonte, Dessireé P. Zerpa-Catanho, Siyao Liu, Liang Xiao, Hongxu Dong, Lindsay V. Clark, Erik J. Sacks
Global Change Biology Bioenergy, volume 16, issue 7. Published 2024-06-05. DOI: 10.1111/gcbb.13167. https://onlinelibrary.wiley.com/doi/10.1111/gcbb.13167
Improving precision and accuracy of genetic mapping with genotyping-by-sequencing data in outcrossing species
Genotyping-by-sequencing (GBS) is a widely used strategy for obtaining large numbers of genetic markers in model and non-model organisms. In crop plants, GBS-derived marker datasets are frequently used to perform quantitative trait locus (QTL) mapping. In some plant species, however, high heterozygosity and complex genome structure mean that researchers must use care in handling GBS data to conduct QTL mapping most effectively. Such outbred crops include most of the perennial grass and tree species used for bioenergy. To identify strategies for increasing accuracy and precision of QTL mapping using GBS data in outbred crops, we conducted an empirical study of SNP-calling and genetic map-building pipeline parameters in a Miscanthus sinensis population, and a complementary simulation study to estimate the relationship between genome-wide error rate, read depth, and marker number. The bioenergy grass Miscanthus is an obligate outcrossing species with a recent (diploidized) whole-genome duplication. For the study of empirical M. sinensis data, we compared two SNP-calling methods (one non-reference-based and one reference-based), a series of depth filters (12×, 20×, 30×, and 40×) and two map-construction methods (i.e., marker ordering: linkage-only and order-corrected based on a reference genome). We found that correcting the order of markers on a linkage map by using a high-quality reference genome improved QTL precision (shorter confidence intervals). For typical GBS datasets of between 1000 and 5000 markers to build a genetic map for biparental populations, a depth filter set at 30× to 40× applied to outbred populations provided a genome-wide genotype-calling error rate of less than 1%, improved accuracy of QTL point estimates and minimized type I errors for identifying QTL. Based on these results, we recommend using a reference genome to correct the marker order of genetic maps and a robust genotype depth filter to improve QTL mapping for outbred crops.
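To make the link between read depth and genotype-calling error more concrete, the following is a minimal sketch of a simplified binomial model. It is not the authors' simulation, and it is not taken from the paper. It assumes that a heterozygous genotype is miscalled as homozygous only when every read at the site happens to show the same allele, and it ignores sequencing error, paralogous loci, and missing data. The function names, the `ref_fraction` parameter (the assumed share of reads carrying one allele, where 0.5 means no allelic bias), and the example genotype labels are all hypothetical and chosen for illustration.

```python
from typing import List, Optional, Tuple


def het_miscall_probability(depth: int, ref_fraction: float = 0.5) -> float:
    """Probability that all reads at a heterozygous site show one allele.

    Simplified model (an assumption of this sketch): each read samples one
    allele with probability `ref_fraction` and the other allele with
    probability 1 - ref_fraction, independently of the other reads.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    if not 0.0 < ref_fraction < 1.0:
        raise ValueError("ref_fraction must be strictly between 0 and 1")
    # Either every read shows the first allele, or every read shows the second.
    return ref_fraction ** depth + (1.0 - ref_fraction) ** depth


def apply_depth_filter(
    calls: List[Tuple[str, int]], min_depth: int
) -> List[Optional[str]]:
    """Mask genotype calls supported by fewer than `min_depth` reads.

    `calls` holds (genotype, read_depth) pairs; masked calls become None.
    """
    return [genotype if depth >= min_depth else None for genotype, depth in calls]


if __name__ == "__main__":
    # Depth thresholds compared in the abstract, under an assumed allelic bias.
    for depth in (12, 20, 30, 40):
        print(depth, het_miscall_probability(depth, ref_fraction=0.8))

    # Hypothetical genotype calls with their read depths.
    print(apply_depth_filter([("AB", 35), ("AA", 8), ("AB", 30)], min_depth=30))
```

Under this model the miscall probability falls geometrically with depth, and it falls more slowly when reads are biased toward one allele, which is one plausible reason a stricter depth filter can help in highly heterozygous populations. The specific 30× to 40× recommendation comes from the study's empirical and simulation results, not from this sketch.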
About the journal:
GCB Bioenergy is an international journal publishing original research papers, review articles and commentaries that promote understanding of the interface between biological and environmental sciences and the production of fuels directly from plants, algae and waste. The journal's scope extends beyond biology to policy forums, socioeconomic analyses, technoeconomic analyses and systems analysis. Papers do not need a global change component to be considered for publication; it is viewed as implicit that most bioenergy will be beneficial in avoiding at least a part of the fossil fuel energy that would otherwise be used.
Key areas covered by the journal:
Bioenergy feedstock and bio-oil production: energy crops and algae, their management, genomics, genetic improvements, planting, harvesting, storage, transportation, integrated logistics, production modeling, composition and its modification, and pests, diseases and weeds of feedstocks. Manuscripts concerning alternative energy based on biological mimicry are also encouraged (e.g. artificial photosynthesis).
Biological Residues/Co-products: from agricultural production, forestry and plantations (stover, sugar, bio-plastics, etc.), algae processing industries, and municipal sources (MSW).
Bioenergy and the Environment: ecosystem services, carbon mitigation, land use change, life cycle assessment, energy and greenhouse gas balances, water use, water quality, assessment of sustainability, and biodiversity issues.
Bioenergy Socioeconomics: examining the economic viability or social acceptability of crops, cropping systems and their processing, including genetically modified organisms (GMOs), and the health impacts of bioenergy systems.
Bioenergy Policy: legislative developments affecting biofuels and bioenergy.
Bioenergy Systems Analysis: examining biological developments in a whole systems context.