Updated benchmarking of variant effect predictors using deep mutational scanning.

IF 8.5 · Zone 1 (Biology) · JCR Q1 (Biochemistry & Molecular Biology)

Molecular Systems Biology, 19(8): e11474. Pub Date: 2023-08-08. Epub Date: 2023-06-13. DOI: 10.15252/msb.202211474

Benjamin J Livesey, Joseph A Marsh


Citations: 0

Abstract

The assessment of variant effect predictor (VEP) performance is fraught with biases introduced by benchmarking against clinical observations. In this study, building on our previous work, we use independently generated measurements of protein function from deep mutational scanning (DMS) experiments for 26 human proteins to benchmark 55 different VEPs, while introducing minimal data circularity. Many top-performing VEPs are unsupervised methods including EVE, DeepSequence and ESM-1v, a protein language model that ranked first overall. However, the strong performance of recent supervised VEPs, in particular VARITY, shows that developers are taking data circularity and bias issues seriously. We also assess the performance of DMS and unsupervised VEPs for discriminating between known pathogenic and putatively benign missense variants. Our findings are mixed, demonstrating that some DMS datasets perform exceptionally at variant classification, while others are poor. Notably, we observe a striking correlation between VEP agreement with DMS data and performance in identifying clinically relevant variants, strongly supporting the validity of our rankings and the utility of DMS for independent benchmarking.
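As a rough illustration of what "VEP agreement with DMS data" can mean in practice, the sketch below ranks predictors by their mean absolute Spearman rank correlation with DMS scores across proteins. This is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not name the agreement metric, so the use of Spearman correlation, the absolute value (to ignore each predictor's score direction), the function names, and the toy data are all illustrative choices.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def rank_predictors(dms_by_protein, vep_scores):
    """Order predictors by mean |rho| against DMS scores across proteins.

    Assumption: the absolute value is taken because predictors differ in
    whether a high score means "damaging" or "tolerated".
    """
    summary = {}
    for vep, per_protein in vep_scores.items():
        rhos = [
            abs(spearman(dms_by_protein[protein], scores))
            for protein, scores in per_protein.items()
        ]
        summary[vep] = mean(rhos)
    return sorted(summary.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: one score per variant, aligned by position.
dms = {
    "protein_A": [0.1, 0.4, 0.9, 1.0, 0.3],
    "protein_B": [0.2, 0.8, 0.5, 0.6],
}
veps = {
    "predictor_X": {
        "protein_A": [-5, -3, -1, -0.5, -4],
        "protein_B": [-3, -1, -2.5, -2],
    },
    "predictor_Y": {
        "protein_A": [0.9, 0.1, 0.5, 0.2, 0.3],
        "protein_B": [0.4, 0.3, 0.9, 0.1],
    },
}

ranking = rank_predictors(dms, veps)
for name, score in ranking:
    print(f"{name}: mean |rho| = {score:.2f}")
```

Averaging per-protein correlations, rather than pooling all variants, keeps proteins with large DMS datasets from dominating the ranking; whether that matches the study's aggregation scheme is not stated in the abstract.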


Source journal

Molecular Systems Biology (Biology: Biochemistry & Molecular Biology)

CiteScore: 18.50
Self-citation rate: 1.00%
Review time: 6-12 weeks

About the journal: Systems biology is a field that aims to understand complex biological systems by studying their components and how they interact. It is an integrative discipline that seeks to explain the properties and behavior of these systems. Molecular Systems Biology is a scholarly journal that publishes top-notch research in the areas of systems biology, synthetic biology, and systems medicine. It is an open access journal, meaning that its content is freely available to readers, and it is peer-reviewed to ensure the quality of the published work.