Jimin Park, Daniel E Cook, Pi-Chuan Chang, Alexey Kolesnikov, Lucas Brambrink, Juan Carlos Mier, Joshua Gardner, Brandy McNulty, Samuel Sacco, Ayse G Keskus, Asher Bryant, Tanveer Ahmad, Jyoti Shetty, Yongmei Zhao, Bao Tran, Giuseppe Narzisi, Adrienne Helland, Byunggil Yoo, Irina Pushel, Lisa A Lansdon, Chengpeng Bi, Adam Walter, Margaret Gibson, Tomi Pastinen, Rebecca Reiman, Sharvari Mankame, T Rhyker Ranallo-Benavidez, Christine Brown, Nicolas Robine, Floris P Barthel, Midhat S Farooqi, Karen H Miga, Andrew Carroll, Mikhail Kolmogorov, Benedict Paten, Kishwar Shafin
Journal: Nature Biotechnology
Publication date: 2025-10-16
Publication type: Journal Article
DOI: 10.1038/s41587-025-02839-x (https://doi.org/10.1038/s41587-025-02839-x)
Impact factor: 41.7
JCR: Q1 (Biotechnology & Applied Microbiology)
Accurate somatic small variant discovery for multiple sequencing technologies with DeepSomatic.
Somatic variant detection is an integral part of cancer genomics analysis. While most methods have focused on short-read sequencing, long-read technologies offer potential advantages in repeat mapping and variant phasing. We present DeepSomatic, a deep-learning method for detecting somatic small nucleotide variations and insertions and deletions from both short-read and long-read data. The method has modes for whole-genome and whole-exome sequencing and can run on tumor-normal, tumor-only and formalin-fixed paraffin-embedded samples. To train DeepSomatic and help address the dearth of publicly available training and benchmarking data for somatic variant detection, we generated and make openly available the Cancer Standards Long-read Evaluation (CASTLE) dataset of six matched tumor-normal cell line pairs whole-genome sequenced with Illumina, PacBio HiFi and Oxford Nanopore Technologies, along with benchmark variant sets. Across samples, both cell line and patient-derived, and across short-read and long-read sequencing technologies, DeepSomatic consistently outperforms existing callers.
About the journal:
Nature Biotechnology is a monthly journal that focuses on the science and business of biotechnology. It covers a wide range of topics including technology/methodology advancements in the biological, biomedical, agricultural, and environmental sciences. The journal also explores the commercial, political, ethical, legal, and societal aspects of this research.
The journal serves researchers by providing peer-reviewed research papers in the field of biotechnology. It also serves the business community by delivering news about research developments. This approach ensures that both the scientific and business communities are well-informed and able to stay up-to-date on the latest advancements and opportunities in the field.
Some key areas of interest in which the journal actively seeks research papers include molecular engineering of nucleic acids and proteins, molecular therapy, large-scale biology, computational biology, regenerative medicine, imaging technology, analytical biotechnology, applied immunology, food and agricultural biotechnology, and environmental biotechnology.
In summary, Nature Biotechnology is a comprehensive journal that covers both the scientific and business aspects of biotechnology. It strives to provide researchers with valuable research papers and news while also delivering important scientific advancements to the business community.