NanoFilter: enhancing phasing performance by utilizing highly consistent INDELs and SNVs in nanopore sequencing
Authors: Shanming Chen, Fan Nie, Jianxin Wang
Journal: Bioinformatics (Oxford, England)
Publication date: 2025-09-01
DOI: 10.1093/bioinformatics/btaf453 (https://doi.org/10.1093/bioinformatics/btaf453)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12448842/pdf/
Abstract
Motivation: Nanopore sequencing produces longer reads than other technologies, which benefits phasing and genome assembly. INDELs carry valuable haplotype information and have significant potential to improve phasing performance. However, variant callers struggle to identify INDELs accurately, and incorporating INDELs into phasing remains difficult. To address these issues, we developed NanoFilter, a novel filtering strategy that uses the consistency of INDELs to filter out those carrying incorrect phasing information.
Results: Our assessment on Nanopore R10 simplex data shows that filtering out low-consistency INDELs increases INDEL precision from 88.3% to 98.8%, nearly matching the precision of SNVs. In the phasing results of Margin, incorporating the filtered INDELs yields a 12.77% increase in N50 length and fewer switch errors. Furthermore, we found that SNVs filtered by NanoFilter enhance assembly performance: when integrated into the HapDup assembly pipeline, NanoFilter reduces the Hamming error rate and increases N50 length by 7.8%.
Availability and implementation: NanoFilter is available at https://github.com/Chenshanming-repo/NanoFilter (DOI: 10.5281/zenodo.16777826) and HapDup-NanoFilter is available at https://github.com/Chenshanming-repo/HapDup-NanoFilter (DOI: 10.5281/zenodo.16777890).
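
For readers unfamiliar with the metrics quoted in the Results (precision, N50 length, switch errors, Hamming error), the following is a minimal Python sketch of their commonly used definitions. It is not taken from NanoFilter, Margin, or HapDup, and the exact definitions used in the paper may differ. The function names and the toy inputs are hypothetical, and haplotypes are assumed to be encoded as strings of '0'/'1' alleles over the heterozygous sites of a single phase block.

```python
from typing import Sequence


def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of called variants that are correct: TP / (TP + FP)."""
    called = true_positives + false_positives
    if called == 0:
        raise ValueError("no calls to evaluate")
    return true_positives / called


def n50(lengths: Sequence[int]) -> int:
    """Largest length L such that blocks of length >= L cover at least
    half of the summed length of all blocks."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-negative lengths")


def switch_errors(truth: str, predicted: str) -> int:
    """Count positions where the agreement between the predicted and true
    haplotype flips relative to the previous heterozygous site."""
    if len(truth) != len(predicted):
        raise ValueError("haplotypes must cover the same sites")
    agree = [t == p for t, p in zip(truth, predicted)]
    return sum(1 for prev, cur in zip(agree, agree[1:]) if prev != cur)


def hamming_error(truth: str, predicted: str) -> int:
    """Mismatching sites, minimised over the two possible haplotype labelings
    (the predicted haplotype may correspond to either true haplotype)."""
    if len(truth) != len(predicted):
        raise ValueError("haplotypes must cover the same sites")
    mismatches = sum(t != p for t, p in zip(truth, predicted))
    return min(mismatches, len(truth) - mismatches)


if __name__ == "__main__":
    # Toy values for illustration only; none of these are data from the paper.
    print(precision(true_positives=883, false_positives=117))
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
    print(switch_errors("000111", "000000"))
    print(hamming_error("000111", "000000"))
```

A single switch error in the middle of a block, as in the toy example, produces a large Hamming error, which is why the two metrics are usually reported together.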