Near-complete assembly and comprehensive annotation of the wheat Chinese Spring genome.

IF 17.1 · CAS Zone 1 (Biology) · JCR Q1, Biochemistry & Molecular Biology
Molecular Plant · Pub Date: 2025-05-05 · Epub Date: 2025-02-13 · DOI: 10.1016/j.molp.2025.02.002
Zijian Wang, Lingfeng Miao, Kaiwen Tan, Weilong Guo, Beibei Xin, Rudi Appels, Jizeng Jia, Jinsheng Lai, Fei Lu, Zhongfu Ni, Xiangdong Fu, Qixin Sun, Jian Chen
Pages: 892-907 · Publication type: Journal Article
Citations: 0

Abstract

A complete reference genome assembly is crucial for biological research and genetic improvement. Owing to the genome's large size and highly repetitive nature, the globally used wheat Chinese Spring (CS) genome assembly contains numerous gaps. In this study, we generated a 14.46 Gb near-complete assembly of the CS genome, with a contig N50 of over 266 Mb and an overall base accuracy of 99.9963%. Among the 290 gaps that remained (26, 257, and 7 gaps from the A, B, and D subgenomes, respectively), 278 were extremely high-copy tandem repeats, whereas the remaining 12 were transposable-element-associated gaps. Four chromosome assemblies (chr1D, chr3D, chr4D, and chr5D) were completely gap-free. Extensive annotation of the near-complete genome revealed 151,405 high-confidence genes, of which 59,180 were newly annotated, including 7,602 newly assembled genes. Except for the centromere of chr1B, which has a gap associated with superlong GAA repeat arrays, the centromeric sequences of the remaining 20 chromosomes were completely assembled. Our near-complete assembly revealed that the extent of tandem repeats, such as simple-sequence repeats, was highly uneven among the subgenomes. Similarly, the repeat compositions of the centromeres varied among the three subgenomes. With the genome sequences of all six types of seed storage proteins (SSPs) fully assembled, the expression of ω-gliadin was found to be contributed entirely by the B subgenome, whereas the expression of the other five types of SSPs was most abundant from the D subgenome. The near-complete CS genome will serve as a valuable resource for genomic and functional genomic research and for the breeding of wheat and its related species.
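The two headline quality metrics in the abstract, contig N50 and base accuracy, can be made concrete with a short sketch. The Python below is only an illustration under stated assumptions: the function names `n50` and `phred_qv` and the toy contig lengths are hypothetical and not taken from the paper, and the abstract does not say which tools the authors used to compute these values. The Phred-scale conversion is the standard formula QV = -10·log10(error rate), applied here to the reported 99.9963% accuracy.

```python
import math


def n50(lengths):
    """Return the N50 of a set of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def phred_qv(accuracy_percent):
    """Convert a per-base accuracy (in percent) to a Phred-scaled quality value."""
    error_rate = 1 - accuracy_percent / 100
    return -10 * math.log10(error_rate)


# Hypothetical toy contig lengths, NOT data from the paper.
toy_contigs = [50, 40, 30, 20, 10]
print(n50(toy_contigs))  # total 150, half 75; 50 + 40 = 90 >= 75, so N50 = 40

# Consistency check on the gap counts reported in the abstract.
gaps_by_subgenome = {"A": 26, "B": 257, "D": 7}
assert sum(gaps_by_subgenome.values()) == 290 == 278 + 12

# Reported base accuracy of 99.9963% expressed on the Phred scale.
print(round(phred_qv(99.9963), 1))  # about 44.3
```

Read this way, the reported accuracy corresponds to roughly one error per 27,000 bases, and a contig N50 above 266 Mb means that at least half of the 14.46 Gb assembly lies in contigs at least that long.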

Source journal
Molecular Plant (Plant Sciences; Biochemistry & Molecular Biology)
CiteScore: 37.60
Self-citation rate: 2.20%
Articles published: 1784
Review time: 1 month
About the journal: Molecular Plant is dedicated to serving the plant science community by publishing novel and exciting findings with high significance in plant biology. The journal focuses broadly on cellular biology, physiology, biochemistry, molecular biology, genetics, development, plant-microbe interaction, genomics, bioinformatics, and molecular evolution. Molecular Plant publishes original research articles, reviews, Correspondence, and Spotlights on the most important developments in plant biology.