UnigeneFinder: An Automated Pipeline for Gene Calling From Transcriptome Assemblies Without a Reference Genome

IF 2.3 | Zone 3 (Biology) | JCR Q2 (Plant Sciences)
Plant Direct | Pub Date: 2025-04-22 | eCollection Date: 2025-04-01 | DOI: 10.1002/pld3.70056
Bo Xue, Karine Prado, Seung Yon Rhee, Matt Stata
Plant Direct, volume 9, issue 4, article e70056. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12012387/pdf/
Citations: 0

Abstract

UnigeneFinder: An Automated Pipeline for Gene Calling From Transcriptome Assemblies Without a Reference Genome.

For most species, transcriptome data are much more readily available than genome data. Without a reference genome, gene calling is cumbersome and inaccurate because of the high degree of redundancy in de novo transcriptome assemblies. To simplify and increase the accuracy of de novo transcriptome assembly in the absence of a reference genome, we developed UnigeneFinder. Combining several clustering methods, UnigeneFinder substantially reduces the redundancy typical of raw transcriptome assemblies. This pipeline offers an effective solution to the problem of inflated transcript numbers, achieving a closer representation of the actual underlying genome. On plant species with varying genome complexities, UnigeneFinder performs comparably to or better than existing tools. UnigeneFinder is the only available transcriptome redundancy solution that fully automates the generation of primary transcript, coding region, and protein sequences, analogous to those available for high-quality reference genomes. These features, coupled with the pipeline's cross-platform implementation, its focus on automation, and its accessible, user-friendly interface, make UnigeneFinder a useful tool for many downstream sequence-based analyses in nonmodel organisms lacking a reference genome, including differential gene expression analysis, accurate ortholog identification, functional enrichments, and evolutionary analyses. UnigeneFinder also runs efficiently both on high-performance computing (HPC) systems and personal computers, further reducing barriers to use.
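
To make the idea of reducing a redundant assembly to one primary transcript per gene more concrete, here is a minimal Python sketch. It is not UnigeneFinder's actual algorithm or interface, which the abstract does not describe. It assumes that cluster assignments (one cluster per putative gene) are already available from some clustering step, and it assumes, purely for illustration, that the longest transcript in each cluster is taken as the primary one. The function name `pick_primary_transcripts` and the toy data are hypothetical.

```python
from collections import defaultdict


def pick_primary_transcripts(sequences, cluster_of):
    """Return {cluster_id: transcript_id}, choosing the longest transcript per cluster.

    sequences:  dict mapping transcript ID -> nucleotide sequence
    cluster_of: dict mapping transcript ID -> cluster (putative gene) ID

    Assumption for illustration only: "primary" means "longest". Ties on
    length are broken by transcript ID so the result is deterministic.
    """
    # Group transcript IDs by the cluster they were assigned to.
    members = defaultdict(list)
    for tid in sequences:
        members[cluster_of[tid]].append(tid)

    # Within each cluster, keep the transcript with the greatest length.
    return {
        cid: max(tids, key=lambda t: (len(sequences[t]), t))
        for cid, tids in members.items()
    }


# Toy data (made up): three assembled transcripts, two putative genes.
sequences = {
    "tx1": "ATGGCCAAATGA",
    "tx2": "ATGGCCAAA",
    "tx3": "ATGTTTGGGCCCTAA",
}
cluster_of = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

primary = pick_primary_transcripts(sequences, cluster_of)
print(primary)
```

In this toy case the three transcripts collapse to two representatives, one per cluster, which is the kind of reduction in transcript count the abstract refers to. A real pipeline would also need to derive coding regions and protein sequences, which this sketch does not attempt.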

Source journal
Plant Direct (subject area: Environmental Science-Ecology)
CiteScore: 5.00
Self-citation rate: 3.30%
Articles published: 101
Review time: 14 weeks
About the journal: Plant Direct is a monthly, sound science journal for the plant sciences that gives prompt and equal consideration to papers reporting work dealing with a variety of subjects. Topics include but are not limited to genetics, biochemistry, development, cell biology, biotic stress, abiotic stress, genomics, phenomics, bioinformatics, physiology, molecular biology, and evolution. A collaborative journal launched by the American Society of Plant Biologists, the Society for Experimental Biology and Wiley, Plant Direct publishes papers submitted directly to the journal as well as those referred from a select group of the societies’ journals.