Updated benchmarking of variant effect predictors using deep mutational scanning.

IF 8.5 · Zone 1 (Biology) · Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY
Molecular Systems Biology Pub Date : 2023-08-08 Epub Date: 2023-06-13 DOI:10.15252/msb.202211474
Benjamin J Livesey, Joseph A Marsh
Molecular Systems Biology 19(8): e11474. Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10407742/pdf/

Abstract

The assessment of variant effect predictor (VEP) performance is fraught with biases introduced by benchmarking against clinical observations. In this study, building on our previous work, we use independently generated measurements of protein function from deep mutational scanning (DMS) experiments for 26 human proteins to benchmark 55 different VEPs, while introducing minimal data circularity. Many top-performing VEPs are unsupervised methods including EVE, DeepSequence and ESM-1v, a protein language model that ranked first overall. However, the strong performance of recent supervised VEPs, in particular VARITY, shows that developers are taking data circularity and bias issues seriously. We also assess the performance of DMS and unsupervised VEPs for discriminating between known pathogenic and putatively benign missense variants. Our findings are mixed, demonstrating that some DMS datasets perform exceptionally at variant classification, while others are poor. Notably, we observe a striking correlation between VEP agreement with DMS data and performance in identifying clinically relevant variants, strongly supporting the validity of our rankings and the utility of DMS for independent benchmarking.
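As a rough illustration of the two kinds of comparison the abstract describes (agreement between VEP scores and DMS measurements, and discrimination between pathogenic and putatively benign variants), the sketch below computes a rank correlation and a ROC AUC in plain Python. The abstract does not state which statistics were used, so the choice of absolute Spearman correlation and AUC here is an assumption, and all function names and numeric values are hypothetical toy data rather than results from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def roc_auc(scores, labels):
    """ROC AUC via the rank-sum identity; labels: 1 = pathogenic, 0 = benign."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, lab in zip(ranks, labels) if lab == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy data, invented for illustration only (one value per missense variant).
dms_fitness = [1.0, 0.9, 0.2, 0.1, 0.6]  # assumed: higher = more functional
vep_score = [0.1, 0.3, 0.8, 0.9, 0.2]    # assumed: higher = more damaging
labels = [0, 0, 1, 1, 0]                 # assumed clinical labels

# The absolute value is taken because the two scales may run in opposite directions.
agreement = abs(spearman(dms_fitness, vep_score))
auc = roc_auc(vep_score, labels)
print(f"|Spearman rho| = {agreement:.2f}, ROC AUC = {auc:.2f}")
```

A rank-based measure is used in this sketch because DMS readouts and predictor scores generally sit on different, non-linear scales; only the ordering of variants is compared.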

Source journal
Molecular Systems Biology (Biology: Biochemistry and Molecular Biology)
CiteScore: 18.50
Self-citation rate: 1.00%
Articles published: 62
Review time: 6-12 weeks
Journal description: Systems biology is a field that aims to understand complex biological systems by studying their components and how they interact. It is an integrative discipline that seeks to explain the properties and behavior of these systems. Molecular Systems Biology is a peer-reviewed, open access journal that publishes research in systems biology, synthetic biology, and systems medicine.