Liang Zhang, Hua Pang, Chenghao Zhang, Song Li, Yang Tan, Fan Jiang, Mingchen Li, Yuanxi Yu, Ziyi Zhou, Banghao Wu, Bingxin Zhou, Hao Liu, Pan Tan, Liang Hong
Acta Pharmaceutica Sinica. B, Volume 15, Issue 5, Pages 2454-2467. Published 2025-05-01. Journal Article.
DOI: 10.1016/j.apsb.2025.03.028
https://www.sciencedirect.com/science/article/pii/S2211383525001650
Impact factor: 14.7. JCR: Q1 (Pharmacology & Pharmacy). Region 1 (Medicine).
VenusMutHub: A systematic evaluation of protein mutation effect predictors on small-scale experimental data
In protein engineering, computational models are increasingly used to predict mutation effects, but their evaluation relies primarily on high-throughput deep mutational scanning (DMS) experiments that use surrogate readouts, which may not adequately capture the complex biochemical properties of interest. Many proteins and their functions cannot be assessed through high-throughput methods because of technical limitations or the nature of the desired properties, and this is particularly true in real industrial applications. The desired test datasets would therefore consist of small-scale (∼10–100) experimental data for each protein and cover as many proteins and as many properties as possible; such datasets, however, are lacking. Here, we present VenusMutHub, a comprehensive benchmark study using 905 small-scale experimental datasets curated from published literature and public databases, spanning 527 proteins across diverse functional properties including stability, activity, binding affinity, and selectivity. These datasets feature direct biochemical measurements rather than surrogate readouts, providing a more rigorous assessment of model performance in predicting mutations that affect specific molecular functions. We evaluate 23 computational models across various methodological paradigms, such as sequence-based, structure-informed, and evolutionary approaches. This benchmark provides practical guidance for selecting appropriate prediction methods in protein engineering applications where accurate prediction of specific functional properties is crucial.
Acta Pharmaceutica Sinica. B
Subject area: Pharmacology, Toxicology and Pharmaceutics (General Pharmacology, Toxicology and Pharmaceutics)
CiteScore: 22.40
Self-citation rate: 5.50%
Articles published: 1051
Review time: 19 weeks
About the journal:
The Institute of Materia Medica, Chinese Academy of Medical Sciences, and the Chinese Pharmaceutical Association oversee the peer review process for Acta Pharmaceutica Sinica. B (APSB).
Published monthly in English, APSB is dedicated to disseminating significant original research articles, rapid communications, and high-quality reviews that highlight recent advances across various pharmaceutical sciences domains. These encompass pharmacology, pharmaceutics, medicinal chemistry, natural products, pharmacognosy, pharmaceutical analysis, and pharmacokinetics.
APSB is part of the Acta Pharmaceutica Sinica series, which was established in 1953 and is indexed in prominent databases such as Chemical Abstracts, Index Medicus, SciFinder Scholar, Biological Abstracts, International Pharmaceutical Abstracts, Cambridge Scientific Abstracts, and Current Bibliography on Science and Technology. APSB is sponsored by the Institute of Materia Medica, Chinese Academy of Medical Sciences, and the Chinese Pharmaceutical Association, and it is produced and hosted by Elsevier B.V. Through this collaboration, APSB aims to deliver valuable contributions to the pharmaceutical sciences community.