Accurate, Scalable Structural Variant Genotyping in Complex Genomes at Population Scales.

IF 5.3 | Zone 1, Biology | JCR Q1, BIOCHEMISTRY & MOLECULAR BIOLOGY
Ming Hu, Penglong Wan, Chengjie Chen, Shuyuan Tang, Jiahao Chen, Liang Wang, Mahul Chakraborty, Yongfeng Zhou, Jinfeng Chen, Brandon S Gaut, J J Emerson, Yi Liao
Journal: Molecular Biology and Evolution
Publication date: 2025-07-30
Publication type: Journal Article
DOI: https://doi.org/10.1093/molbev/msaf180
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12362251/pdf/

Abstract

Comparisons of complete genome assemblies offer a direct way to characterize all genetic differences among them. However, existing tools are often limited to specific aligners or optimized for specific organisms, which narrows their applicability, particularly for large and repetitive plant genomes. Here, we introduce Structural Variants Genotyping of Assemblies on Population scales (SVGAP), a pipeline for structural variant (SV) discovery, genotyping, and annotation from high-quality genome assemblies at the population level. Through extensive benchmarks using simulated SV datasets in individual, population, and phylogenetic contexts, we demonstrate that SVGAP performs favorably relative to existing tools in SV discovery. Additionally, SVGAP is one of the few tools to address the challenge of genotyping SVs across large samples of assembled genomes, and it generates fully genotyped VCF files. Applying SVGAP to 26 maize genomes revealed hidden genomic diversity in centromeres, driven by abundant insertions of centromere-specific LTR-retrotransposons. The output of SVGAP is well-suited for pangenome construction and facilitates the interpretation of previously unexplored genomic regions.
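
To make "fully genotyped VCF" concrete, the following is a minimal sketch of how a multi-sample VCF could be checked for complete genotype calls and summarized by alternate-allele frequency. It relies only on the standard VCF column layout (FORMAT in column 9, samples from column 10 on). The records, sample names (S1 to S3), and the function name `summarize_genotypes` are hypothetical, and nothing here is taken from SVGAP's own code or documented output.

```python
from typing import Dict, List

# Hypothetical three-sample excerpt; the records and sample names are invented
# for illustration and are not SVGAP output.
EXAMPLE_VCF = """\
##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3
chr1\t1000\tsv1\tA\t<INS>\t.\tPASS\tSVTYPE=INS;SVLEN=350\tGT\t1/1\t0/0\t0/1
chr1\t5000\tsv2\tT\t<DEL>\t.\tPASS\tSVTYPE=DEL;SVLEN=-1200\tGT\t0/0\t./.\t1/1
"""


def summarize_genotypes(vcf_text: str) -> List[Dict[str, object]]:
    """Report, per record, whether every sample has a call and the ALT allele frequency."""
    summaries: List[Dict[str, object]] = []
    for line in vcf_text.splitlines():
        # Skip blank lines, meta-information lines, and the column header.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        gt_index = fields[8].split(":").index("GT")
        samples = fields[9:]

        called_samples = 0
        called_alleles = 0
        alt_alleles = 0
        for sample in samples:
            genotype = sample.split(":")[gt_index]
            # Treat phased ("|") and unphased ("/") genotypes the same way.
            alleles = genotype.replace("|", "/").split("/")
            if "." in alleles:
                continue  # a sample with any missing allele counts as uncalled
            called_samples += 1
            called_alleles += len(alleles)
            alt_alleles += sum(1 for allele in alleles if allele != "0")

        summaries.append({
            "id": fields[2],
            "fully_genotyped": called_samples == len(samples),
            "alt_allele_freq": alt_alleles / called_alleles if called_alleles else None,
        })
    return summaries


if __name__ == "__main__":
    for record in summarize_genotypes(EXAMPLE_VCF):
        print(record)
```

In this invented excerpt, `sv1` has a call for every sample, whereas `sv2` has one missing genotype (`./.`) and so would not count as fully genotyped.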

Source journal: Molecular Biology and Evolution (Biology: Evolutionary Biology)
CiteScore: 19.70
Self-citation rate: 3.70%
Articles published: 257
Review time: 1 month
Journal overview: Molecular Biology and Evolution publishes research at the interface of molecular (including genomics) and evolutionary biology. It considers manuscripts containing patterns, processes, and predictions at all levels of organization: population, taxonomic, functional, and phenotypic. It is interested in fundamental discoveries, new and improved methods, resources, technologies, and theories advancing evolutionary research. It also publishes balanced reviews of recent developments in genome evolution and forward-looking perspectives suggesting future directions in molecular evolution applications.