colorSV: Long-range Somatic Structural Variation Calling from Matched Tumor-normal Co-assembly Graphs
Megan K Le, Qian Qin, Heng Li
Genomics, Proteomics & Bioinformatics. Published 2025-09-17.
DOI: 10.1093/gpbjnl/qzaf082 (https://doi.org/10.1093/gpbjnl/qzaf082)
Abstract
The accurate identification of somatic structural variants (SVs) is important for understanding the basis and evolution of cancerous tumor growth. Though long-read sequencing has facilitated the development of more accurate SV calling methods, existing somatic SV callers still struggle to achieve high precision and high sensitivity simultaneously. In this work, we present colorSV (COassembly-based LOng-Range SV caller), a long-read-based method for calling long-range SVs by examining the local topology of joint assembly graphs from matched tumor-normal samples. colorSV is the first somatic SV calling method that uses a co-assembly approach, as well as the first SV caller that identifies variants by examining characteristics of the assembly graph itself. We demonstrate near-perfect precision and sensitivity for calling translocations on the COLO829 cell line, outperforming four existing somatic SV callers (Severus, Sniffles2, nanomonsv, and SAVANA) in both metrics. We also evaluated colorSV for calling translocations on the HCC1395 cell line, where our method achieved a good balance between sensitivity and precision: only Severus and SAVANA achieved higher sensitivity, and only nanomonsv achieved higher precision. Our work establishes a novel joint assembly-based strategy for characterizing long-range somatic variation, which could be further expanded or modified to identify SVs of different types and sizes. colorSV is available at https://github.com/mktle/colorSV.
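The two metrics used to compare callers in the abstract can be illustrated with a short sketch. This is not colorSV code or the paper's evaluation procedure: the function name `precision_and_sensitivity`, the reduction of a translocation to an unordered chromosome pair, and the toy call sets are all assumptions made for illustration. Real benchmarks usually match breakpoints within a position tolerance rather than by exact set membership.

```python
from typing import FrozenSet, Set, Tuple

# Assumption: a translocation is reduced to an unordered pair of chromosome names.
Translocation = FrozenSet[str]


def precision_and_sensitivity(
    calls: Set[Translocation], truth: Set[Translocation]
) -> Tuple[float, float]:
    """Return (precision, sensitivity) of a call set against a truth set."""
    true_positives = len(calls & truth)
    # Precision: fraction of reported calls that are in the truth set.
    precision = true_positives / len(calls) if calls else 0.0
    # Sensitivity (recall): fraction of truth-set events that were reported.
    sensitivity = true_positives / len(truth) if truth else 0.0
    return precision, sensitivity


# Hypothetical toy data, not taken from COLO829 or HCC1395.
truth_set = {
    frozenset({"chr1", "chr3"}),
    frozenset({"chr7", "chr15"}),
    frozenset({"chr10", "chr18"}),
}
call_set = truth_set | {frozenset({"chr2", "chr9"})}  # one false positive

precision, sensitivity = precision_and_sensitivity(call_set, truth_set)
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f}")
```

In this toy case the caller recovers every true event but reports one extra, so sensitivity is perfect while precision drops. This is the kind of trade-off described for the HCC1395 comparison.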