VenusMutHub: A systematic evaluation of protein mutation effect predictors on small-scale experimental data

IF 14.7 · Zone 1 (Medicine) · Q1 PHARMACOLOGY & PHARMACY

Acta Pharmaceutica Sinica. B · Pub Date: 2025-05-01 · DOI: 10.1016/j.apsb.2025.03.028

Liang Zhang, Hua Pang, Chenghao Zhang, Song Li, Yang Tan, Fan Jiang, Mingchen Li, Yuanxi Yu, Ziyi Zhou, Banghao Wu, Bingxin Zhou, Hao Liu, Pan Tan, Liang Hong

Volume 15, Issue 5, Pages 2454-2467. Full text: https://www.sciencedirect.com/science/article/pii/S2211383525001650

Citations: 0

Abstract


VenusMutHub: A systematic evaluation of protein mutation effect predictors on small-scale experimental data

In protein engineering, computational models are increasingly used to predict mutation effects, yet their evaluation relies primarily on high-throughput deep mutational scanning (DMS) experiments that use surrogate readouts, which may not adequately capture the complex biochemical properties of interest. Many proteins and their functions cannot be assessed through high-throughput methods because of technical limitations or the nature of the desired properties, and this is particularly true in real industrial application scenarios. The desired testing datasets would therefore consist of small-scale (∼10–100) experimental data for each protein and cover as many proteins and as many properties as possible; such datasets, however, are lacking. Here, we present VenusMutHub, a comprehensive benchmark study using 905 small-scale experimental datasets curated from published literature and public databases, spanning 527 proteins across diverse functional properties including stability, activity, binding affinity, and selectivity. These datasets feature direct biochemical measurements rather than surrogate readouts, providing a more rigorous assessment of model performance in predicting mutations that affect specific molecular functions. We evaluate 23 computational models across various methodological paradigms, such as sequence-based, structure-informed, and evolutionary approaches. This benchmark provides practical guidance for selecting appropriate prediction methods in protein engineering applications where accurate prediction of specific functional properties is crucial.
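As a rough illustration of what evaluating a predictor on many small datasets can look like, the sketch below scores each dataset separately and then averages the scores within each property category. The abstract does not state which metric the benchmark uses; Spearman rank correlation is an assumption here, chosen because it is a common choice for mutation-effect benchmarks. The dataset layout, the `summarize` function, the `predict` callable, and the toy values are all hypothetical and not taken from VenusMutHub.

```python
from collections import defaultdict
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation, computed as Pearson correlation of ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one side is constant
    return cov / (var_x * var_y) ** 0.5


def summarize(datasets, predict):
    """Average per-dataset Spearman scores within each property category."""
    by_property = defaultdict(list)
    for ds in datasets:
        scores = [predict(ds["protein"], m) for m in ds["mutants"]]
        by_property[ds["property"]].append(spearman(scores, ds["measured"]))
    return {prop: mean(vals) for prop, vals in by_property.items()}


# Toy, made-up data purely for illustration (not from the benchmark).
toy = [
    {"protein": "P1", "property": "stability",
     "mutants": ["A1G", "L5P", "K9R"], "measured": [0.2, -1.5, 0.1]},
    {"protein": "P2", "property": "activity",
     "mutants": ["D3N", "E7Q", "S2T"], "measured": [1.0, 2.0, 3.0]},
]
toy_scores = {"A1G": 0.9, "L5P": 0.1, "K9R": 0.5,
              "D3N": 3.0, "E7Q": 2.0, "S2T": 1.0}
result = summarize(toy, lambda protein, mutant: toy_scores[mutant])
```

Scoring each dataset on its own before averaging keeps measurements from different assays and units from being pooled, which matters when every dataset holds only tens of variants.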


Source journal

Acta Pharmaceutica Sinica. B (Pharmacology, Toxicology and Pharmaceutics: General Pharmacology, Toxicology and Pharmaceutics)

CiteScore: 22.40

Self-citation rate: 5.50%

Articles published: 1051

Review time: 19 weeks

About the journal: The Journal of the Institute of Materia Medica, Chinese Academy of Medical Sciences, and the Chinese Pharmaceutical Association oversees the peer review process for Acta Pharmaceutica Sinica. B (APSB). Published monthly in English, APSB is dedicated to disseminating significant original research articles, rapid communications, and high-quality reviews that highlight recent advances across various pharmaceutical sciences domains. These encompass pharmacology, pharmaceutics, medicinal chemistry, natural products, pharmacognosy, pharmaceutical analysis, and pharmacokinetics. A part of the Acta Pharmaceutica Sinica series, established in 1953 and indexed in prominent databases like Chemical Abstracts, Index Medicus, SciFinder Scholar, Biological Abstracts, International Pharmaceutical Abstracts, Cambridge Scientific Abstracts, and Current Bibliography on Science and Technology, APSB is sponsored by the Institute of Materia Medica, Chinese Academy of Medical Sciences, and the Chinese Pharmaceutical Association. Its production and hosting are facilitated by Elsevier B.V. This collaborative effort ensures APSB's commitment to delivering valuable contributions to the pharmaceutical sciences community.