Benchmarking of germline copy number variant callers from whole genome sequencing data for clinical applications.

Bioinformatics Advances (IF 2.4, Q2, Mathematical & Computational Biology)
Pub Date: 2025-04-10. eCollection Date: 2025-01-01. DOI: 10.1093/bioadv/vbaf071
Francisco M De La Vega, Sean A Irvine, Pavana Anur, Kelly Potts, Lewis Kraft, Raul Torres, Peter Kang, Sean Truong, Yeonghun Lee, Shunhua Han, Vitor Onuchic, James Han

Abstract

Motivation: Whole-genome sequencing (WGS) is increasingly preferred for clinical applications because of its comprehensive coverage, effectiveness in detecting copy number variants (CNVs), and declining costs. However, systematic evaluations of WGS CNV callers tailored to germline clinical testing, where high sensitivity and confirmation of reported CNVs are essential, are still needed. Clinical reporting typically emphasizes CNVs affecting coding regions over precise breakpoint detection. This study benchmarks several short-read WGS CNV detection tools using reference cell lines to inform their clinical use.

Results: While tools vary in sensitivity (7%-83%) and precision (1%-76%), few meet the sensitivity needed for clinical testing. Callers generally perform better for deletions (up to 88% sensitivity) than duplications (up to 47% sensitivity), with poor detection of duplications under 5 kb. Notably, for CNVs in genes commonly included in clinical panels, significantly improved sensitivity and precision were observed when benchmarking against 25 cell lines with known CNVs. DRAGEN v4.2 high-sensitivity CNV calls, post-processed with custom filters, achieved 100% sensitivity and 77% precision on the optimized gene panel after excluding recurring artifacts. This level of performance may support clinical use with orthogonal confirmation of reportable CNVs, pending validation on laboratory-specific samples.
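
As an illustration of how sensitivity and precision figures like those above can be computed, the following is a minimal sketch in Python. It is not the authors' benchmarking pipeline: the abstract does not state the matching criteria, so the overlap rule (same CNV type, at least 50% of the interval covered), the CNV record layout, and the names overlap_fraction, matches and benchmark are all assumptions made for this example. The loose overlap threshold reflects the point that clinical reporting cares more about whether a region is affected than about exact breakpoints.

```python
from typing import NamedTuple, Sequence, Tuple


class CNV(NamedTuple):
    """Hypothetical CNV record; the field layout is an assumption for this sketch."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    kind: str   # "DEL" or "DUP"


def overlap_fraction(a: CNV, b: CNV) -> float:
    """Fraction of interval `a` covered by `b` (0.0 on different chromosomes)."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    return max(shared, 0) / (a.end - a.start)


def matches(a: CNV, b: CNV, min_frac: float) -> bool:
    """Assumed rule: same CNV type and at least `min_frac` of `a` covered by `b`."""
    return a.kind == b.kind and overlap_fraction(a, b) >= min_frac


def benchmark(truth: Sequence[CNV], calls: Sequence[CNV],
              min_frac: float = 0.5) -> Tuple[float, float]:
    """Return (sensitivity, precision) under the assumed matching rule."""
    # Sensitivity: share of truth CNVs recovered by at least one call.
    found = sum(any(matches(t, c, min_frac) for c in calls) for t in truth)
    # Precision: share of calls supported by at least one truth CNV.
    supported = sum(any(matches(c, t, min_frac) for t in truth) for c in calls)
    sensitivity = found / len(truth) if truth else float("nan")
    precision = supported / len(calls) if calls else float("nan")
    return sensitivity, precision


# Toy data, invented for illustration only (not from the study).
truth = [CNV("chr1", 1000, 5000, "DEL"), CNV("chr2", 2000, 3000, "DUP")]
calls = [
    CNV("chr1", 1200, 5200, "DEL"),  # overlaps the chr1 deletion
    CNV("chr3", 100, 900, "DEL"),    # no truth CNV on chr3
    CNV("chr2", 2000, 3000, "DEL"),  # right place, wrong type
]
sensitivity, precision = benchmark(truth, calls)
print(f"sensitivity={sensitivity:.2f} precision={precision:.2f}")
```

In this toy case one of two truth CNVs is recovered and one of three calls is supported. Restricting truth and calls to intervals that intersect a gene panel before calling benchmark would give a panel-level variant of the same calculation.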

Availability and implementation: The data underlying this article are available in the European Nucleotide Archive under project accession PRJEB87628.
