A comprehensive benchmarking of adaptive sampling tools for nanopore sequencing

Impact factor 10.1; Zone 1 (Biology); JCR Q1 (Biotechnology & Applied Microbiology)

Genome Biology. Pub Date: 2025-09-17. DOI: 10.1186/s13059-025-03729-w

Lang Yang, Yanfeng Lin, Peihan Li, Kaiying Wang, Jinhui Li, Yuqi Liu, Xiaochen Bo, Ming Ni, Peng Li, Hongbin Song

Citations: 0

Abstract

Adaptive sampling is an emerging technique that enriches target reads and depletes unwanted reads during real-time nanopore sequencing. Different algorithms have given rise to a variety of tools for deciding which reads to reject. However, no evaluation has yet identified which tool gives the best enrichment performance for a specific task. This study evaluated the performance of six widely used tools for nanopore adaptive sampling. Three distinct types of task were tested: intraspecies enrichment of COSMIC genes, interspecies enrichment of Saccharomyces cerevisiae, and depletion of human host DNA. All tools increased the coverage depth of targets, by 1.50- to 4.86-fold. Combining Guppy for base calling with minimap2 for read alignment emerged as the optimal read classification strategy, with the highest accuracy. MinKNOW, Readfish, and BOSS-RUNS, which use this strategy, show generally excellent enrichment or depletion performance. The deep learning method that uses raw signals shows higher accuracy and quicker read ejection than the conventional signal-based approach, and also achieves top-class performance in host depletion. Overall, this benchmarking study provides a thorough comparison of current tools across a range of adaptive sampling scenarios. The nucleotide-alignment-based approach can handle diverse target references and is broadly applicable. Tools that employ this strategy, especially MinKNOW, can be considered the preferred option for most adaptive sampling scenarios. The deep learning technique that uses raw signals shows remarkable classification efficiency and accuracy and warrants greater emphasis and exploration in future software development.
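The enrich-or-deplete decision and the fold-change figure described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper or from any of the benchmarked tools: the names `Mode`, `decide`, and `fold_enrichment` are hypothetical, the mapping result is assumed to come from some upstream classifier (for example base calling followed by alignment), and the fold change is assumed to be the ratio of target coverage depth with adaptive sampling to that of a control run. Real tools also keep an undecided state for reads that cannot yet be classified, which this sketch omits.

```python
from enum import Enum


class Mode(Enum):
    ENRICH = "enrich"    # the reference lists the targets to keep
    DEPLETE = "deplete"  # the reference lists sequences to discard (e.g. host DNA)


def decide(maps_to_reference: bool, mode: Mode) -> str:
    """Return "sequence" to keep reading the molecule or "unblock" to eject it.

    `maps_to_reference` is assumed to be supplied by an upstream classifier.
    """
    if mode is Mode.ENRICH:
        return "sequence" if maps_to_reference else "unblock"
    return "unblock" if maps_to_reference else "sequence"


def fold_enrichment(depth_adaptive: float, depth_control: float) -> float:
    """Ratio of target coverage depth with adaptive sampling to a control run."""
    if depth_control <= 0:
        raise ValueError("control depth must be positive")
    return depth_adaptive / depth_control


if __name__ == "__main__":
    # Illustrative inputs only, not data from the study.
    print(decide(True, Mode.ENRICH))
    print(decide(True, Mode.DEPLETE))
    print(fold_enrichment(30.0, 10.0))
```

The same mapping result leads to opposite actions in the two modes, which is why a single alignment-based classifier can serve both the enrichment and the host-depletion tasks.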

Source journal: Genome Biology (Biochemistry, Genetics and Molecular Biology: Genetics)

CiteScore: 21.00

Self-citation rate: 3.30%

Articles published: 241

Review time: 2 months

Journal description: Genome Biology stands as a premier platform for exceptional research across all domains of biology and biomedicine, explored through a genomic and post-genomic lens. With an impressive impact factor of 12.3 (2022), the journal secures its position as the 3rd-ranked research journal in the Genetics and Heredity category and the 2nd-ranked research journal in the Biotechnology and Applied Microbiology category by Thomson Reuters. Notably, Genome Biology holds the distinction of being the highest-ranked open-access journal in this category. Our dedicated team of highly trained in-house Editors collaborates closely with our esteemed Editorial Board of international experts, ensuring the journal remains at the forefront of scientific advances and community standards. Regular engagement with researchers at conferences and institute visits underscores our commitment to staying abreast of the latest developments in the field.