Enhancing protein-ligand binding affinity prediction through sequential fusion of graph and convolutional neural networks

IF 3.4 · Zone 3 (Chemistry) · Q2 CHEMISTRY, MULTIDISCIPLINARY

Journal of Computational Chemistry · Pub Date: 2024-09-02 · DOI: 10.1002/jcc.27499

Yimin Yang, Ruiqin Zhang, Zijing Lin

Journal of Computational Chemistry, vol. 45, no. 32, pp. 2929-2940. https://onlinelibrary.wiley.com/doi/10.1002/jcc.27499


Abstract

Predicting protein-ligand binding affinity is a crucial and challenging task in structure-based drug discovery. With the accumulation of complex structures and binding affinity data, various machine-learning scoring functions, particularly those based on deep learning, have been developed for this task and have outperformed their traditional counterparts. This work proposes a fusion model that sequentially connects a graph neural network (GNN) and a convolutional neural network (CNN) to predict protein-ligand binding affinity. In this model, the intermediate outputs of the GNN layers, which serve as supplementary descriptors of atomic chemical environments at different levels, are concatenated with the input features of the CNN. The model demonstrates a noticeable improvement in performance on the CASF-2016 benchmark compared to its constituent CNN models. The generalization ability of the model is evaluated by setting a series of thresholds for ligand extended-connectivity fingerprint similarity or protein sequence similarity between the training and test sets. A masking experiment reveals that the model can capture key interaction regions. Furthermore, the fusion model is applied to a virtual screening task for a novel target, PI5P4Kα, where the fusion strategy significantly improves the ability of the constituent CNN model to identify active compounds. This work offers a novel approach to enhancing the accuracy of deep learning models in predicting binding affinity through fusion strategies.
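The sequential fusion described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the weight-free mean-aggregation layer, the voxel-grid CNN input, and all names (`message_passing_layer`, `fuse_atom_features`, `voxelize`) are assumptions made purely for illustration. The sketch only shows the data flow, in which the output of each GNN layer is appended to every atom's raw features before those features are placed on the grid a CNN would consume.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def message_passing_layer(h: List[Vector], neighbors: List[List[int]]) -> List[Vector]:
    """Toy GNN layer (assumed, no trainable weights): each atom's new state
    is the mean of its own state and its bonded neighbors' states."""
    out = []
    for i, h_i in enumerate(h):
        group = [h_i] + [h[j] for j in neighbors[i]]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out


def fuse_atom_features(x: List[Vector], neighbors: List[List[int]],
                       num_layers: int) -> List[Vector]:
    """Concatenate each atom's raw features with the output of every GNN layer."""
    fused = [list(row) for row in x]
    h = x
    for _ in range(num_layers):
        h = message_passing_layer(h, neighbors)
        for i, h_i in enumerate(h):
            fused[i].extend(h_i)  # intermediate output becomes extra channels
    return fused


def voxelize(coords: List[Tuple[float, float, float]], feats: List[Vector],
             spacing: float) -> Dict[Tuple[int, int, int], Vector]:
    """Sum per-atom channel vectors into sparse grid cells (assumed CNN input)."""
    grid: Dict[Tuple[int, int, int], Vector] = {}
    for (cx, cy, cz), f in zip(coords, feats):
        key = (int(cx // spacing), int(cy // spacing), int(cz // spacing))
        cell = grid.setdefault(key, [0.0] * len(f))
        for c, v in enumerate(f):
            cell[c] += v
    return grid


# Hypothetical 3-atom chain 0-1-2 with two raw features per atom.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
neighbors = [[1], [0, 2], [1]]
coords = [(0.2, 0.1, 0.0), (1.4, 0.3, 0.0), (2.6, 0.2, 0.0)]

fused = fuse_atom_features(x, neighbors, num_layers=2)
grid = voxelize(coords, fused, spacing=1.0)
print(len(fused[0]))  # 2 raw + 2 per GNN layer = 6 channels
print(sorted(grid))   # occupied voxel indices
```

With two layers, each atom carries its raw features plus descriptors of its one-hop and two-hop environments, which is one way to read "atomic chemical environments at different levels". A real model would use learned GNN layers and a dense 3D tensor in place of the sparse dictionary.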




Source journal

Journal of Computational Chemistry (Chemistry, Multidisciplinary)

CiteScore: 6.60

Self-citation rate: 3.30%

Articles published: 247

Review time: 1.7 months

About the journal: This distinguished journal publishes articles concerned with all aspects of computational chemistry: analytical, biological, inorganic, organic, physical, and materials. The Journal of Computational Chemistry presents original research, contemporary developments in theory and methodology, and state-of-the-art applications. Computational areas that are featured in the journal include ab initio and semiempirical quantum mechanics, density functional theory, molecular mechanics, molecular dynamics, statistical mechanics, cheminformatics, biomolecular structure prediction, molecular design, and bioinformatics.