Joint processing of long- and short-read sequencing data with deep learning improves variant calling
Gennaro Gambardella
Cell Reports Methods, article 101107. Published 2025-07-21 (Epub 2025-07-15).
DOI: https://doi.org/10.1016/j.crmeth.2025.101107
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12296420/pdf/
Citations: 0
Abstract
Despite the complementary strengths of short- and long-read sequencing approaches, variant-calling methods still rely on a single data type. In this study, we collected and harmonized Nanopore datasets of the seven healthy individuals in the GIAB project across three independent consortia. By leveraging these harmonized Nanopore data, we explore the benefits of using a hybrid DeepVariant model to jointly process Illumina and Nanopore data for germline variant detection. We show that a shallow hybrid long-short sequencing approach can match or surpass the germline variant detection accuracy of state-of-the-art single-technology methods, potentially reducing overall sequencing costs and enabling the detection of large germline structural variations. These findings hold great promise for molecular diagnostics in clinical settings, particularly for rare genetic disease screenings.