Evaluating long-read assemblers to assemble several aphididae genomes.

IF 6.8 2区 生物学 Q1 BIOCHEMICAL RESEARCH METHODS
Nicolaas F V Burger, Vittorio F Nicolis, Anna-Maria Botha
{"title":"Evaluating long-read assemblers to assemble several aphididae genomes.","authors":"Nicolaas F V Burger, Vittorio F Nicolis, Anna-Maria Botha","doi":"10.1093/bib/bbaf105","DOIUrl":null,"url":null,"abstract":"<p><p>Aphids are a speciose family of the Hemiptera compromising >5500 species. They have adapted to feed off multiple plant species and occur on every continent on Earth. Although economically devastating, very few aphid genomes have been sequenced and assembled, and those that have suffer low contiguity due to repeat-rich and AT-rich genomes. With third-generation sequencing becoming more affordable and approaching quality levels to that of second-generation sequencing, the ability to produce more contiguous aphid genome assemblies is becoming a reality. With a growing list of long-read assemblers becoming available, the choice of which assembly tool to use becomes more complicated. In this study, six recently released long-read assemblers (Canu, Flye, Hifiasm, Mecat2, Raven, and Wtdbg2) were evaluated on several quality and contiguity metrics after assembling four populations (or biotypes) of the same species (Russian wheat aphid, Diuraphis noxia) and two unrelated aphid species that have publicly available long-read sequences. All assemblers did not fare equally well between the different read sets, but, overall, the Hifiasm and Canu assemblers performed the best. Merging of the best assemblies for each read set was also performed using quickmerge, where, in some cases, it resulted in superior assemblies and, in others, introduced more errors. Ab initio gene calling between assemblies of the same read set also showed surprisingly less similarity than expected. Overall, the quality control pipeline followed during the assembly resulted in chromosome-level assemblies with minimal structural or quality artefacts.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"26 2","pages":""},"PeriodicalIF":6.8000,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11904405/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Briefings in bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/bib/bbaf105","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0

Abstract

Aphids are a speciose family of the Hemiptera compromising >5500 species. They have adapted to feed off multiple plant species and occur on every continent on Earth. Although economically devastating, very few aphid genomes have been sequenced and assembled, and those that have suffer low contiguity due to repeat-rich and AT-rich genomes. With third-generation sequencing becoming more affordable and approaching quality levels to that of second-generation sequencing, the ability to produce more contiguous aphid genome assemblies is becoming a reality. With a growing list of long-read assemblers becoming available, the choice of which assembly tool to use becomes more complicated. In this study, six recently released long-read assemblers (Canu, Flye, Hifiasm, Mecat2, Raven, and Wtdbg2) were evaluated on several quality and contiguity metrics after assembling four populations (or biotypes) of the same species (Russian wheat aphid, Diuraphis noxia) and two unrelated aphid species that have publicly available long-read sequences. All assemblers did not fare equally well between the different read sets, but, overall, the Hifiasm and Canu assemblers performed the best. Merging of the best assemblies for each read set was also performed using quickmerge, where, in some cases, it resulted in superior assemblies and, in others, introduced more errors. Ab initio gene calling between assemblies of the same read set also showed surprisingly less similarity than expected. Overall, the quality control pipeline followed during the assembly resulted in chromosome-level assemblies with minimal structural or quality artefacts.

求助全文
约1分钟内获得全文 求助全文
来源期刊
Briefings in bioinformatics
Briefings in bioinformatics 生物-生化研究方法
CiteScore
13.20
自引率
13.70%
发文量
549
审稿时长
6 months
期刊介绍: Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data. The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信