Knowledge-enhanced and structure-enhanced representation learning for protein–ligand binding affinity prediction
Authors: Mei Li, Ye Cao, Xiaoguang Liu, Hua Ji
Journal: Pattern Recognition, volume 166, Article 111701 (published 2025-04-21)
DOI: 10.1016/j.patcog.2025.111701
URL: https://www.sciencedirect.com/science/article/pii/S0031320325003619
Impact factor: 7.5; JCR: Q1 (Computer Science, Artificial Intelligence)
Citation count: 0
Abstract
Protein–ligand binding affinity (PLA) prediction is a fundamental preliminary stage in drug discovery and development. Existing methods mainly focus on structure-free prediction of binding affinities, and structural PLA prediction remains underexplored. The spatial structures of protein–ligand complexes are critical in determining binding affinities. A few graph neural network (GNN) based methods model the spatial structure of a complex using pairwise atomic distances within a cutoff, which describes the geometry incompletely and limits their ability to distinguish between certain molecules. In this paper, we propose a knowledge-enhanced and structure-enhanced representation learning method (KSM) for structural PLA prediction. KSM uses a specially designed structure-based GNN (KSGNN) that learns complete representations for PLA prediction by combining the sequence and structure information of complexes. Notably, KSGNN learns structure-aware representations by incorporating relative spatial information, namely distances and angles among atoms, into message passing. Additionally, we adopt an attentive pooling layer (APL) to further refine structural patterns in complexes. We compare KSM against 18 state-of-the-art baselines on two benchmarks. KSM outperforms its competitors, improving RMSE by 0.0536 on the PDBbind core set and by 0.19 on the CSAR-HiQ dataset, demonstrating its superiority in binding affinity prediction.
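The abstract's point that cutoff-limited pairwise distances can fail to distinguish certain geometries can be illustrated with a toy example. The sketch below is not KSGNN or any part of the authors' code. The helper names (`pairwise_edges`, `angle_at`), the three-atom coordinates, and the 1.2 cutoff are all assumptions made for illustration. It builds a linear and a bent arrangement whose within-cutoff distance edges are identical, then shows that the angle at the central atom separates them.

```python
import math

def pairwise_edges(coords, cutoff):
    """Return (i, j, distance) for every ordered atom pair within the cutoff."""
    edges = []
    for i, a in enumerate(coords):
        for j, b in enumerate(coords):
            if i != j:
                d = math.dist(a, b)
                if d <= cutoff:
                    edges.append((i, j, d))
    return edges

def angle_at(coords, i, j, k):
    """Angle in radians at atom j, between the directions j->i and j->k."""
    u = [a - b for a, b in zip(coords[i], coords[j])]
    v = [a - b for a, b in zip(coords[k], coords[j])]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.acos(max(-1.0, min(1.0, dot / norm)))

# Hypothetical toy geometries: atom 1 is central, atoms 0 and 2 sit at distance 1.
linear = [(-1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
bent = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
CUTOFF = 1.2  # assumed value; the outer atoms are farther apart than this in both cases

same_distance_edges = pairwise_edges(linear, CUTOFF) == pairwise_edges(bent, CUTOFF)
linear_angle = angle_at(linear, 0, 1, 2)
bent_angle = angle_at(bent, 0, 1, 2)

print("identical distance edges:", same_distance_edges)
print("angles at central atom (degrees):",
      math.degrees(linear_angle), math.degrees(bent_angle))
```

A distance-only graph sees the same edge list for both arrangements, while the angle feature differs, which is the kind of relative spatial information the abstract says is incorporated into message passing.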
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.