Fangting Tao, Jinyuan Sun, Pengyue Gao, George Fu Gao, Bian Wu
Reliable prediction of protein-protein binding affinity changes upon mutations with Pythia-PPI.
National Science Review, volume 12, issue 6, article nwaf231. Published 2025-06-10. DOI: https://doi.org/10.1093/nsr/nwaf231
Citation count: 0
Abstract
Protein-protein interactions (PPIs) are essential for numerous biological functions, and predicting binding affinity changes caused by mutations is crucial for understanding the impact of genetic variation and advancing protein engineering. Although machine-learning-based methods show promise in improving prediction accuracy, limited experimental data remain a significant bottleneck. In this study, we employed multitask learning and self-distillation to overcome the data limitation and improve the accuracy of protein-protein binding affinity prediction. By incorporating a mutation stability prediction task, our model achieved state-of-the-art accuracy on the SKEMPI dataset and was subsequently used to predict binding affinity changes for millions of mutations, generating an expanded dataset for self-distillation. Compared with prevalent methods, Pythia-PPI increased the Pearson's correlation between predictions and experimental data from 0.6447 to 0.7850 on the SKEMPI dataset and from 0.3654 to 0.6050 on the viral-receptor dataset. Experimental validation further confirmed its ability to identify high-affinity mutations on the CB6 antibody in complex with the severe acute respiratory syndrome coronavirus 2 prototype receptor binding domain, with the best single-point mutant among the top 10 predictions showing a 2-fold increase in binding affinity. These findings demonstrate that Pythia-PPI is a valuable tool for analysing the fitness landscape of PPIs. A web server for Pythia-PPI is available at https://pythiappi.wulab.xyz for easy access.
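The accuracy figures in the abstract are Pearson's correlation coefficients between predicted and experimentally measured binding affinity changes. As a minimal sketch of how such a coefficient is computed, the snippet below uses only the Python standard library. The function name `pearson_r` and the sample values are hypothetical and chosen for illustration; they are not taken from the paper or from Pythia-PPI.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical binding affinity changes, for illustration only (not from the paper).
experimental_change = [0.5, 1.2, -0.3, 2.1, 0.0]
predicted_change = [0.4, 0.9, 0.1, 1.8, 0.3]

r = pearson_r(predicted_change, experimental_change)
print(f"Pearson r = {r:.4f}")
```

A value near 1 means the predictions rank and scale with the measurements almost linearly, while a value near 0 means there is no linear relationship.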
About the journal:
National Science Review (NSR; ISSN abbreviation: Natl. Sci. Rev.) is an English-language peer-reviewed multidisciplinary open-access scientific journal published by Oxford University Press under the auspices of the Chinese Academy of Sciences. According to Journal Citation Reports, its 2021 impact factor was 23.178.
National Science Review publishes both review articles and perspectives as well as original research in the form of brief communications and research articles.