Comprehensive Stress-Based De Novo Transcriptome Assembly and Annotation of Guar (Cyamopsis tetragonoloba (L.) Taub.): An Important Industrial and Forage Crop

IF 2.6, Zone 4 (Biology), Q3, BIOCHEMISTRY & MOLECULAR BIOLOGY
F. Al-Qurainy, Aref Alshameri, A. Gaafar, Salim Khan, M. Nadeem, A. Alameri, M. Tarroum, M. Ashraf
International Journal of Genomics. Published 2019-10-08. DOI: 10.1155/2019/7295859 (https://doi.org/10.1155/2019/7295859)
Citations: 24

Abstract

The forage crop Guar (Cyamopsis tetragonoloba (L.) Taub.) tolerates heat, drought, and mild salinity. A complete picture of its genic architecture will improve our understanding of gene expression networks and of the different tolerance mechanisms at the molecular level. Whole-mRNA sequencing of Guar was therefore conducted to provide a snapshot of the mRNA present in the cell under salinity, heat, and drought stress, for integration with previous transcriptomic studies. RNA-Seq was used to perform 2 × 100 bp paired-end sequencing on an Illumina HiSeq 2500 platform for the leaf transcriptome of C. tetragonoloba under normal, heat, drought, and salinity conditions. Trinity was used for de novo assembly, followed by gene annotation, functional classification, metabolic pathway analysis, and identification of SSR markers. A total of 218.2 million paired-end raw reads (~44 Gbp) were generated. Of those, 193.5 million high-quality paired-end reads were used to reconstruct 161,058 transcripts (~266 Mbp) with an N50 of 2552 bp and 61,508 putative genes. There were 6463 proteins with >90% full-length coverage against the Swiss-Prot database, and 94% of orthologs were complete when assessed against Embryophyta. Approximately 62.87% of transcripts had BLAST hits, 50.46% were mapped, and 43.50% were annotated. InterProScan detected a total of 4715 families, 3441 domains, 74 repeats, and 490 sites. Biological processes, molecular functions, and cellular components accounted for 64.12%, 25.42%, and 10.4%, respectively. The transcriptome was associated with 985 enzymes and 156 KEGG pathways. A total of 27,066 SSRs were identified, with an average frequency of one SSR per 9.825 kb in the assembled transcripts. The resulting data will support further analysis of multi-stress tolerance in Guar.
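The assembly figures quoted above can be sanity-checked with a little arithmetic. The following is a minimal sketch, not the authors' pipeline: the `n50` helper and the toy length list are hypothetical illustrations of how an N50 is computed, and the check assumes that "218.2 million paired-end raw reads" counts read pairs (two 100 bp reads each), which is the reading consistent with the stated ~44 Gbp.

```python
def n50(lengths):
    """Return the N50 of a collection of transcript lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy lengths (hypothetical, NOT taken from the Guar assembly).
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)

# Figures quoted in the abstract.
raw_read_pairs = 218.2e6   # assumption: the count refers to read pairs
read_length_bp = 100       # 2 x 100 bp sequencing
assembly_bp = 266e6        # ~266 Mbp, a rounded value from the abstract
ssr_count = 27_066

# Each pair contributes two reads of 100 bp.
raw_yield_gbp = raw_read_pairs * 2 * read_length_bp / 1e9
# Average assembled sequence per SSR, in kb.
kb_per_ssr = assembly_bp / ssr_count / 1e3

print(f"toy N50: {toy_n50}")
print(f"raw yield: {raw_yield_gbp:.2f} Gbp")
print(f"one SSR per {kb_per_ssr:.2f} kb")
```

With the rounded 266 Mbp total, the SSR spacing comes out near 9.83 kb, which agrees with the reported one SSR per 9.825 kb to within rounding, and the raw yield of about 43.64 Gbp matches the stated ~44 Gbp.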
Source journal
International Journal of Genomics
Categories: BIOCHEMISTRY & MOLECULAR BIOLOGY; BIOTECHNOLOGY & APPLIED MICROBIOLOGY
CiteScore: 5.40
Self-citation rate: 0.00%
Articles published: 33
Review time: 17 weeks
Journal description: International Journal of Genomics is a peer-reviewed, Open Access journal that publishes research articles as well as review articles in all areas of genome-scale analysis. Topics covered by the journal include, but are not limited to: bioinformatics, clinical genomics, disease genomics, epigenomics, evolutionary genomics, functional genomics, genome engineering, and synthetic genomics.