Haplotype-resolved and gap-free genome of a floating aquatic plant from the Oryzeae tribe, Hygroryza aristata.

Impact factor: 1.9; JCR Q3 (Genetics & Heredity)
Li-Yao Yang, Li-Kun Huang, Jin-Bin Lin, Cun-Jing Xu, Wei-Qi Tang, Bi-Guang Huang
BMC Genomic Data, vol. 26, no. 1, p. 23. Published 2025-03-28. DOI: https://doi.org/10.1186/s12863-025-01314-5. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11951829/pdf/

Abstract

Objectives: Hygroryza aristata, a floating aquatic plant native to Southeast Asia, is highly adapted to aquatic environments. It belongs to the Oryzeae tribe, is closely related to rice (Oryza sativa), and holds potential for crop improvement, particularly in flood tolerance. This study aimed to sequence and assemble the genome of H. aristata.

Data description: We assembled the genome of H. aristata using 31.91 Gb of Pacific Biosciences (PacBio) high-fidelity (HiFi) data and 22.36 Gb of ultra-long Oxford Nanopore Technologies (ONT) data, resulting in two gap-free haplotype genomes, hap1 (349.74 Mb) and hap2 (347.98 Mb), each with 12 chromosomes and 23 telomeres. Chromosome continuity was supported by high-throughput chromosome conformation capture (Hi-C) data. Both assemblies showed high completeness: per haplotype, the coverage rate exceeded 99.8%, the Benchmarking Universal Single-Copy Orthologs (BUSCO) score was 98.4%, and the Long Terminal Repeat Assembly Index (LAI) score exceeded 11.0. RNA sequencing (RNA-seq) data (176.06 Gb) from six tissues were generated for genome annotation, identifying 39,139 and 38,746 protein-coding genes in hap1 and hap2, respectively.
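
The figures quoted above allow a few back-of-envelope checks. The sketch below is illustrative only and is not part of the authors' pipeline. It assumes 1 Gb = 1e9 bases and 1 Mb = 1e6 bases, and it estimates depth naively as total sequenced bases divided by assembly size; the function and variable names are hypothetical.

```python
# Rough arithmetic on the numbers quoted in the abstract.
# Assumptions (not stated in the source): 1 Gb = 1e9 bases, 1 Mb = 1e6 bases,
# and depth = total sequenced bases / assembly size.

HAP_SIZES_MB = {"hap1": 349.74, "hap2": 347.98}
READS_GB = {"hifi": 31.91, "ont": 22.36}
CHROMOSOMES = 12
TELOMERES_FOUND = 23  # per haplotype, as reported


def depth(read_gb: float, genome_mb: float) -> float:
    """Approximate fold coverage of a genome of the given size."""
    return (read_gb * 1e9) / (genome_mb * 1e6)


def summarize() -> dict:
    diploid_mb = sum(HAP_SIZES_MB.values())
    mean_hap_mb = diploid_mb / len(HAP_SIZES_MB)
    summary = {
        "diploid_mb": round(diploid_mb, 2),
        "hap_size_difference_mb": round(
            HAP_SIZES_MB["hap1"] - HAP_SIZES_MB["hap2"], 2
        ),
        # A linear chromosome has two ends, so 12 chromosomes have 24 ends.
        "chromosome_ends": 2 * CHROMOSOMES,
        "ends_without_telomere": 2 * CHROMOSOMES - TELOMERES_FOUND,
    }
    for name, gb in READS_GB.items():
        # Depth relative to one haplotype-sized genome and to hap1 + hap2.
        summary[f"{name}_depth_haploid"] = round(depth(gb, mean_hap_mb), 1)
        summary[f"{name}_depth_diploid"] = round(depth(gb, diploid_mb), 1)
    return summary


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

Under these assumptions the HiFi data correspond to roughly 90-fold depth of a single haplotype-sized genome (about half that when spread over both haplotypes), and 23 telomeres across 24 chromosome ends implies that one end per haplotype has no reported telomere.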
