De novo genome assembly of a high-protein soybean variety HJ117.

Impact factor: 1.9 | JCR: Q3 | Category: GENETICS & HEREDITY

BMC Genomic Data | Pub Date: 2024-03-04 | DOI: 10.1186/s12863-024-01213-1

Zhi Liu, Qing Yang, Bingqiang Liu, Chenhui Li, Xiaolei Shi, Yu Wei, Yuefeng Guan, Chunyan Yang, Mengchen Zhang, Long Yan

Citations: 0

Abstract

De novo genome assembly of a high-protein soybean variety HJ117.

Objectives: Soybean is an important feed and oil crop worldwide because of its high protein and oil content. China holds a collection of more than 43,000 soybean germplasm resources, which provides rich genetic diversity for soybean breeding. However, this rich genetic diversity also poses great challenges for the genetic improvement of soybean. This study reports the de novo genome assembly of HJ117, a soybean variety with a high protein content of 52.99%. These data will be a valuable resource for further research on soybean quality improvement and will aid in elucidating the regulatory mechanisms underlying soybean protein content.

Data description: We generated a contiguous reference genome of 1041.94 Mb for HJ117 using a combination of Illumina short reads (23.38 Gb) and PacBio long reads (25.58 Gb), with high-quality sequence coverage of approximately 22.44× and 24.55×, respectively. HJ117 was developed through backcross breeding, using Jidou 12 as the recurrent parent and Chamoshidou as the donor parent. The assembly was further assisted by 114.5 Gb Hi-C data (109.9×), resulting in a contig N50 of 19.32 Mb and scaffold N50 of 51.43 Mb. Notably, Core Eukaryotic Genes Mapping Approach (CEGMA) assessment and Benchmarking Universal Single-Copy Orthologs (BUSCO) assessment results indicated that most core eukaryotic genes (97.18%) and genes in the BUSCO dataset (99.4%) were identified, and 96.44% of the genomic sequences were anchored onto twenty pseudochromosomes.
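
The coverage figures quoted above are consistent with the usual definition of average sequencing depth (total sequenced bases divided by genome size), and N50 is a standard contiguity statistic. The following is a minimal sketch of both calculations, not the authors' pipeline. It assumes that the 1041.94 Mb assembly size is used as the genome size and that 1 Gb = 1000 Mb; the function names `depth` and `n50` and the toy contig lengths are hypothetical and chosen only for illustration.

```python
def depth(total_bases_gb: float, genome_size_mb: float) -> float:
    """Average depth = sequenced bases / genome size (assumes 1 Gb = 1000 Mb)."""
    return total_bases_gb * 1000.0 / genome_size_mb


def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Assumption: the reported assembly size stands in for the genome size.
ASSEMBLY_MB = 1041.94

# Data volumes (Gb) as reported in the abstract.
for name, gb in [("Illumina", 23.38), ("PacBio", 25.58), ("Hi-C", 114.5)]:
    print(f"{name}: {depth(gb, ASSEMBLY_MB):.2f}x")

# Toy contig lengths (hypothetical), only to show how N50 is computed.
print("toy N50:", n50([2, 3, 4, 5, 6]))
```

Under these assumptions the computed depths round to the reported 22.44×, 24.55× and 109.9×. The reported contig and scaffold N50 values cannot be recomputed here, because the abstract does not list the individual contig and scaffold lengths.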

Source journal

BMC genomic data

CiteScore

4.90

Self-citation rate

0.00%

Number of publications