Pre-Assembly NGS Correction of ONT Reads Achieves HiFi-Level Assembly Quality
Authors: Evgeniy Mozheiko, Heng Yi, Anzhi Lu, Heitung Kong, Yong Hou, Yan Zhou, Hui Gao
Journal: Genome (impact factor 2.3; JCR Q3, Biotechnology & Applied Microbiology)
Publication date: 2025-03-28 (Journal Article)
DOI: 10.1139/gen-2024-0132 (https://doi.org/10.1139/gen-2024-0132)
Citation count: 0
Abstract
Recently developed hybrid assembly approaches can achieve Telomere-to-Telomere (T2T) completeness for some chromosomes. However, such approaches require sequencing a large volume of both Pacific Biosciences high-fidelity (HiFi) and Oxford Nanopore Technologies (ONT) reads. Meanwhile, third-generation sequencing techniques are rapidly advancing, increasing the available read length and accuracy. To reduce the final cost of genome assembly, we investigated the possibility of assembly from low-coverage samples using only ONT reads corrected by Next-Generation Sequencing (NGS) reads. We demonstrated that haploid ONT-based assembly approaches corrected by NGS can achieve performance metrics comparable to those of more expensive hybrid approaches based on HiFi sequencing. We investigated the assembly of different chromosomes and the low-coverage performance of state-of-the-art hybrid assembly tools, including Verkko and Hifiasm, as well as ONT-based assemblers such as Shasta and Flye. We found that even when they produce a one-contig T2T assembly, Verkko and Hifiasm still leave numerous misassemblies within the centromere. Therefore, we recommend using a combination of regular R9 or simplex R10 ONT reads and accurate NGS reads for assembly without aiming for T2T completeness. Additionally, we rigorously evaluated the performance of MGI, Illumina, and stLFR NGS technologies across various aspects of hybrid genome assembly, including pre-assembly correction, haplotype phasing, and polishing.
About the journal:
Genome is a monthly journal, established in 1959, that publishes original research articles, reviews, mini-reviews, current opinions, and commentaries. Areas of interest include general genetics and genomics, cytogenetics, molecular and evolutionary genetics, developmental genetics, population genetics, phylogenomics, molecular identification, as well as emerging areas such as ecological, comparative, and functional genomics.