Cyclotides prediction in Leptopetalum biflorum based on de novo transcriptome assembly and annotation

Xi Liu, Linlin Cai, Zhiming Zhou, Peiming Huang, Zhonglu Ren
Journal of Holistic Integrative Pharmacy, Volume 5, Issue 2, Pages 103-112. Published 2024-06-01.
DOI: 10.1016/j.jhip.2024.06.003
Full text: https://www.sciencedirect.com/science/article/pii/S2707368824000323
Citations: 0

Abstract

Cyclotides prediction in Leptopetalum biflorum based on de novo transcriptome assembly and annotation

Objective

Transcriptome sequencing data for Leptopetalum biflorum are scarce, and many of its cyclotides remain undiscovered. A workflow based on de novo transcriptome assembly is needed to predict cyclotides in Leptopetalum biflorum systematically and to provide a reference for their functional analysis.

Methods

In this study, we performed RNA-seq on roots, leaves, and flowers of Leptopetalum biflorum to obtain two sets of transcriptome data. Sequencing quality was assessed with FastQC and MultiQC. De novo transcriptome assembly was carried out with Trinity, and assembly quality was evaluated by the Read Support method and BUSCO analysis. eggNOG-mapper and Trinotate were used to annotate GO functional terms and KEGG pathways. TransDecoder was used to predict ORFs and coding regions, and SignalP was used to identify amino acid sequences containing signal peptides and to predict their signal peptide cleavage sites. The resulting mature protein sequences were then used for cyclotide prediction with FindCRP 2.0 (Find Cyclotide Peptide), a cyclotide prediction tool developed by our team.
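The last two steps (removing the signal peptide, then screening the mature sequence) can be illustrated with a minimal sketch. This is not the FindCRP algorithm, which the abstract does not describe. It assumes only the well-known fact that cyclotides carry six conserved cysteines; the function names, the `max_span` threshold, and the toy sequence are all hypothetical and chosen for illustration.

```python
from typing import List


def mature_region(protein: str, signal_length: int) -> str:
    """Return the sequence left after removing the first `signal_length` residues.

    `signal_length` stands in for a cleavage position reported by a
    signal peptide predictor (hypothetical input, not parsed from SignalP).
    """
    return protein[signal_length:]


def six_cys_windows(peptide: str, max_span: int = 40) -> List[str]:
    """List substrings that start and end on a cysteine, contain exactly
    six cysteines, and are at most `max_span` residues long.

    `max_span` is an illustrative threshold, not a published cutoff.
    """
    positions = [i for i, aa in enumerate(peptide) if aa == "C"]
    windows = []
    for k in range(len(positions) - 5):
        start, end = positions[k], positions[k + 5]
        if end - start + 1 <= max_span:
            windows.append(peptide[start:end + 1])
    return windows


# Toy precursor (made up): a 13-residue signal-like prefix plus a six-Cys region.
toy = "MKTLLVAFAALAS" + "GAPCAAACGGGCTTTCSCWWWCNN"
mature = mature_region(toy, 13)
candidates = six_cys_windows(mature)
print(mature)
print(candidates)
```

A screen like this only narrows the candidate set. A real predictor would also need to check loop spacing, precursor architecture, and cyclization signals, which is the kind of evidence the homology and 3D structure checks in the Results address.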

Results

Trinity assembled a total of 171,310 transcripts (isoforms) and 103,299 genes. The average transcript length was 1139.89 bp, and the average gene length was 780.87 bp. Approximately 30% of the genes had homologs in other plant species. Of the assembled genes, 23,265 (22.52%) were annotated with 41 Level 2 GO terms. KEGG pathway annotation showed that 23,682 genes (22.92%) carried 5171 KO annotations and were involved in 484 pathways. FindCRP predicted 17 potential cyclotides, 15 of which had homologous genes; notably, five of these were 100% identical to their respective homologs. The two remaining potential cyclotides, which had no identified homolog, showed the ability to form a cyclic structure according to 3D structure prediction.
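The reported percentages can be checked with a short calculation. The sketch below assumes that both percentages use the 103,299 assembled genes as the denominator; the abstract does not state this explicitly, but the arithmetic agrees with the reported figures to within 0.01 percentage points. Variable names are illustrative.

```python
TOTAL_GENES = 103_299  # genes reported in the Trinity assembly


def percent(count: int, total: int) -> float:
    """Share of `count` in `total`, as a percentage."""
    return 100.0 * count / total


go_pct = percent(23_265, TOTAL_GENES)    # genes with Level 2 GO terms
kegg_pct = percent(23_682, TOTAL_GENES)  # genes with KEGG annotations

predicted = 17       # potential cyclotides from FindCRP
with_homolog = 15    # predictions with a homologous gene
without_homolog = predicted - with_homolog

print(f"GO annotated:   {go_pct:.4f}%")
print(f"KEGG annotated: {kegg_pct:.4f}%")
print(f"Predictions without a homolog: {without_homolog}")
```

The difference of 17 and 15 matches the two homolog-free candidates that were assessed by 3D structure prediction.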

Conclusion

In this study, we developed a de novo transcriptome assembly workflow for identifying cyclotides from Leptopetalum biflorum RNA-seq data. Our custom-built tool, FindCRP, was used in this workflow to detect potential cyclotides. The workflow was designed to support the reproducibility and reliability of the findings. We annotated the assembled transcripts and predicted putative cyclotides, most of which (15 of 17) show significant homology to known cyclotides.
