Application and clinical utility assessment of natural language processing-based software for copy-number variants interpretation.

IF 7.5 | Zone 2, Medicine | Q1 Medicine, Research & Experimental
Songchang Chen, Chang Liu, Xiaorui Luan, Yuling Wang, Yuexin Xu, Yunshuang Li, Fenjiao Zhang, Weihui Shi, Xuanyou Zhou, Chenming Xu
Journal of Translational Medicine, vol. 23, no. 1, p. 1052. Published 2025-10-03. DOI: 10.1186/s12967-025-07063-4 (https://doi.org/10.1186/s12967-025-07063-4). Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12495639/pdf/
Citations: 0

Abstract

Background: Manual interpretation of copy-number variants (CNVs) according to the 2020 guideline published by the American College of Medical Genetics and Genomics (ACMG) and the Clinical Genome Resource (ClinGen) is labor-intensive and time-consuming. Natural language processing (NLP)-based software such as CNVisi can reduce the burden of CNV interpretation, but its clinical utility needs further evaluation.

Methods: We first assessed the performance of CNVisi on 1000 CNVs that had previously been classified manually. To assess its clinical utility, we collected 5861 CNVs from 2443 samples analyzed by next-generation sequencing (NGS)-based CNV sequencing (CNV-seq). The CNVs were first classified by CNVisi and then reviewed by genetic experts. After duplicates were removed, the remaining 3384 CNVs were used to assess classification consistency, and the 154 CNVs that met the reporting rules were selected for further analysis.

Results: In the preliminary performance assessment, the overall accuracy of CNVisi in distinguishing pCNVs (pathogenic or likely pathogenic CNVs) was 97.7% (977/1000). In the clinical utility assessment, its accuracy was 99.6% (3370/3384). Among the 154 CNVs that met the clinical reporting rules, 23 were classified differently by CNVisi and the genetic experts. These inconsistencies were mainly caused by overlap between CNVs and low-penetrance regions and by differences in the scoring of literature-related evidence. Under the reporting rules, the classification of all CNVs was highly consistent between the genetic experts and CNVisi (98.6%, 5781/5861), and the CNV-seq results of 96.9% (2367/2443) of samples could be accurately and efficiently interpreted by CNVisi. Furthermore, CNVisi was superior to previous tools for CNV interpretation and classification and showed excellent clinical utility.

Conclusions: Applying clinically useful CNV interpretation software such as CNVisi can reduce the burden on genetic experts and improve the efficiency of CNV interpretation.
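
The percentages quoted in the Results can be rechecked from the counts given alongside them. The following is a minimal sketch for that arithmetic only: the labels, the `percent` helper, and the `check_reported` function are names chosen here for illustration and do not come from the paper or from CNVisi. The last printed figure, agreement among the 154 reportable CNVs, is derived from the stated counts (154 total, 23 in disagreement) and is not itself reported in the abstract.

```python
from fractions import Fraction

# Counts and percentages copied from the abstract.
# The dictionary keys are descriptive labels chosen for this sketch.
REPORTED = {
    "preliminary accuracy (pCNV vs. non-pCNV)": (977, 1000, 97.7),
    "clinical-utility accuracy (deduplicated CNVs)": (3370, 3384, 99.6),
    "CNV-level consistency under reporting rules": (5781, 5861, 98.6),
    "samples interpreted accurately": (2367, 2443, 96.9),
}


def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(float(Fraction(numerator, denominator) * 100), 1)


def check_reported(table=REPORTED):
    """Map each label to (recomputed percentage, whether it matches the abstract)."""
    return {
        label: (percent(n, d), percent(n, d) == stated)
        for label, (n, d, stated) in table.items()
    }


if __name__ == "__main__":
    for label, (value, ok) in check_reported().items():
        status = "matches" if ok else "differs from"
        print(f"{label}: {value}% ({status} abstract)")
    # Derived here, not stated in the abstract: agreement among reportable CNVs.
    print(f"reportable CNVs in agreement: {percent(154 - 23, 154)}%")
```

Note that the four figures use different denominators (1000 pre-classified CNVs, 3384 deduplicated CNVs, all 5861 CNVs, and 2443 samples), so they are not directly comparable with one another.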

Source journal: Journal of Translational Medicine (Medicine: Research & Experimental)
CiteScore: 10.00
Self-citation rate: 1.40%
Articles published: 537
Review time: 1 month
About the journal: The Journal of Translational Medicine is an open-access journal that publishes articles focusing on information derived from human experimentation to enhance communication between basic and clinical science. It covers all areas of translational medicine.