EquiRank: Improved protein-protein interface quality estimation using protein language-model-informed equivariant graph neural networks
Authors: Md Hossain Shuvo, Debswapna Bhattacharya
Journal: Computational and Structural Biotechnology Journal, volume 27, pages 160-170
Publication date: 2025-01-01
DOI: 10.1016/j.csbj.2024.12.015
Article: https://www.sciencedirect.com/science/article/pii/S2001037024004380
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11755013/pdf/
Impact factor: 4.4 (JCR Q2, Biochemistry & Molecular Biology)
Citation count: 0
Abstract
Quality estimation of the predicted interaction interface in protein complex structural models is important for complex model evaluation and selection, and is also useful for protein-protein docking. Despite recent progress fueled by symmetry-aware deep learning architectures and pretrained protein language models (pLMs), existing methods for estimating protein complex quality have yet to fully exploit the combined potential of these advances for accurate estimation of protein-protein interface quality. Here we present EquiRank, an improved protein-protein interface quality estimation method that leverages a symmetry-aware E(3)-equivariant deep graph neural network (EGNN) and integrates pLM embeddings from the pretrained ESM-2 model. Our method represents interacting residue pairs as a graph, incorporates a diverse set of features including ESM-2 embeddings, and learns the representation using symmetry-aware EGNNs to estimate the quality of the protein-protein interface. Our experimental results demonstrate improved ranking performance on diverse datasets and across different performance evaluation metrics over recent protein complex quality estimation methods, including VoroIF_GNN, the top-performing CASP15 protein complex quality estimation method, and the self-assessment module of AlphaFold-Multimer repurposed for protein complex scoring. Additionally, our ablation studies demonstrate that both pLMs and the equivariant nature of EGNN contribute to the improved protein-protein interface quality estimation performance. EquiRank is freely available at https://github.com/mhshuvo1/EquiRank.
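The two ideas the abstract combines, a graph over interacting residue pairs and symmetry-aware message passing, can be illustrated with a small sketch. This is not the EquiRank implementation: the 10 Å cutoff, the scalar node features (standing in for ESM-2 embeddings and other features), the fixed weights `w`, and all function names are assumptions made for illustration. The layer follows the general form of E(n)-equivariant message passing, where messages depend only on invariant squared distances and coordinate updates are built from relative position vectors.

```python
import math

def interface_pairs(chain_a, chain_b, cutoff=10.0):
    """Index pairs (i, j) of residues from two chains within `cutoff` angstroms.
    The cutoff value is an assumption, not taken from the paper."""
    return [(i, j) for i, a in enumerate(chain_a)
            for j, b in enumerate(chain_b) if math.dist(a, b) <= cutoff]

def egnn_layer(h, x, edges, w=(0.5, -0.3, 0.1)):
    """One simplified equivariant message-passing step (hypothetical fixed weights).
    h: scalar node features, x: 3D coordinates, edges: directed (i, j) pairs."""
    n = len(h)
    msg_sum = [0.0] * n
    shift = [[0.0, 0.0, 0.0] for _ in range(n)]
    deg = [0] * n
    for i, j in edges:
        # Squared distance is unchanged by rotation and translation.
        d2 = sum((a - b) ** 2 for a, b in zip(x[i], x[j]))
        m = math.tanh(w[0] * h[i] + w[1] * h[j] + w[2] * d2)  # invariant message
        msg_sum[i] += m
        deg[i] += 1
        for k in range(3):
            shift[i][k] += (x[i][k] - x[j][k]) * m  # rotates with the input
    h_new = [math.tanh(hi + mi) for hi, mi in zip(h, msg_sum)]
    x_new = [tuple(xi[k] + s[k] / max(d, 1) for k in range(3))
             for xi, s, d in zip(x, shift, deg)]
    return h_new, x_new

def rotate_z_translate(p, angle, t):
    """Rotate point p about the z axis by `angle`, then translate by t."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1] + t[0], s * p[0] + c * p[1] + t[1], p[2] + t[2])

# Toy coordinates (one point per residue); values are made up for the demo.
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (30.0, 0.0, 0.0)]
chain_b = [(0.0, 6.0, 0.0), (3.8, 7.0, 1.0)]
pairs = interface_pairs(chain_a, chain_b)

# Merge both chains into one node list and make the interface edges bidirectional.
coords = chain_a + chain_b
offset = len(chain_a)
edges = [(i, offset + j) for i, j in pairs] + [(offset + j, i) for i, j in pairs]
h = [0.1, 0.2, 0.3, 0.4, 0.5]  # placeholder features

h1, x1 = egnn_layer(h, coords, edges)
moved = [rotate_z_translate(p, 0.7, (1.0, -2.0, 3.0)) for p in coords]
h2, x2 = egnn_layer(h, moved, edges)

# Features should be invariant; coordinates should transform with the input.
invariant = all(abs(a - b) < 1e-9 for a, b in zip(h1, h2))
expected = [rotate_z_translate(p, 0.7, (1.0, -2.0, 3.0)) for p in x1]
equivariant = all(abs(p[k] - q[k]) < 1e-9
                  for p, q in zip(x2, expected) for k in range(3))
print(pairs, invariant, equivariant)
```

Rotating and translating the whole complex leaves the updated node features unchanged, which is the property that makes a quality score derived from them independent of how the structural model happens to be oriented. A trained model would replace the fixed `tanh` combinations with learned networks and the scalar features with high-dimensional embeddings.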
About the journal:
Computational and Structural Biotechnology Journal (CSBJ) is an online gold open access journal publishing research articles and reviews after full peer review. All articles are published, without barriers to access, immediately upon acceptance. The journal places a strong emphasis on functional and mechanistic understanding of how molecular components in a biological process work together through the application of computational methods. Structural data may provide such insights, but they are not a pre-requisite for publication in the journal. Specific areas of interest include, but are not limited to:
Structure and function of proteins, nucleic acids and other macromolecules
Structure and function of multi-component complexes
Protein folding, processing and degradation
Enzymology
Computational and structural studies of plant systems
Microbial Informatics
Genomics
Proteomics
Metabolomics
Algorithms and Hypothesis in Bioinformatics
Mathematical and Theoretical Biology
Computational Chemistry and Drug Discovery
Microscopy and Molecular Imaging
Nanotechnology
Systems and Synthetic Biology