Distance plus attention for binding affinity prediction

IF 7.1, Zone 2 (Chemistry), JCR Q1 (CHEMISTRY, MULTIDISCIPLINARY)
Julia Rahman, M. A. Hakim Newton, Mohammed Eunus Ali, Abdul Sattar
Journal of Cheminformatics, 16(1), published 2024-05-12. DOI: 10.1186/s13321-024-00844-x
https://link.springer.com/article/10.1186/s13321-024-00844-x
Citations: 0

Abstract

Protein-ligand binding affinity plays a pivotal role in drug development, particularly in identifying potential ligands for disease-related target proteins. Accurate affinity prediction can significantly reduce both the time and the cost of drug development, yet highly precise prediction remains a research challenge. A key to improving affinity prediction is to capture protein-ligand interactions effectively. Existing deep-learning-based approaches use 3D grids, 4D tensors, molecular graphs, or proximity-based adjacency matrices, which are either resource-intensive or do not directly represent potential interactions. In this paper, we propose atomic-level distance features and attention mechanisms to better capture specific protein-ligand interactions based on donor-acceptor relations, hydrophobicity, and \(\pi\)-stacking atoms. We argue that distances encompass both short-range direct and long-range indirect interaction effects, while attention mechanisms capture levels of interaction effects. On the well-known CASF-2016 dataset, our proposed method, named Distance plus Attention for Affinity Prediction (DAAP), significantly outperforms existing methods, achieving a Correlation Coefficient (R) of 0.909, Root Mean Squared Error (RMSE) of 0.987, Mean Absolute Error (MAE) of 0.745, Standard Deviation (SD) of 0.988, and Concordance Index (CI) of 0.876. The proposed method also shows substantial improvement, around 2% to 37%, on five other benchmark datasets. The program and data are publicly available at https://gitlab.com/mahnewton/daap.
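The five evaluation metrics named in the abstract can be illustrated with a short pure-Python sketch. The abstract does not define them, so the formulas below are assumptions based on common usage: SD is taken as the residual standard deviation after a least-squares line fit with an N - 1 denominator, and CI as the fraction of correctly ordered pairs with ties in prediction counted as half. All function names and the sample values are hypothetical and are not taken from the DAAP code base.

```python
import math
from itertools import combinations


def _moments(y_true, y_pred):
    """Return means, covariance sum, and variance sums shared by several metrics."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return mean_t, mean_p, cov, var_t, var_p


def pearson_r(y_true, y_pred):
    """Pearson correlation between measured and predicted affinities."""
    _, _, cov, var_t, var_p = _moments(y_true, y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def regression_sd(y_true, y_pred):
    """Residual SD after fitting y_true ~ a * y_pred + b (assumed definition)."""
    mean_t, mean_p, cov, _, var_p = _moments(y_true, y_pred)
    a = cov / var_p
    b = mean_t - a * mean_p
    sse = sum((t - (a * p + b)) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sse / (len(y_true) - 1))


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs ranked in the same order by the predictions."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(y_true)), 2):
        if y_true[i] == y_true[j]:
            continue  # pairs with equal measured affinity are not comparable
        comparable += 1
        true_diff = y_true[i] - y_true[j]
        pred_diff = y_pred[i] - y_pred[j]
        if pred_diff == 0:
            concordant += 0.5  # tied predictions count as half
        elif (true_diff > 0) == (pred_diff > 0):
            concordant += 1.0
    return concordant / comparable


if __name__ == "__main__":
    # Made-up affinities for illustration only; not data from the paper.
    measured = [4.0, 5.5, 6.0, 7.5]
    predicted = [4.5, 5.0, 6.5, 7.0]
    for name, fn in [("R", pearson_r), ("RMSE", rmse), ("MAE", mae),
                     ("SD", regression_sd), ("CI", concordance_index)]:
        print(name, round(fn(measured, predicted), 3))
```

Higher R and CI and lower RMSE, MAE, and SD indicate better agreement between predicted and measured affinities.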

Scientific Contribution Statement

This study introduces distance-based features for predicting protein-ligand binding affinity, capitalizing on unique molecular interactions. Incorporating protein sequence features of specific residues further enhances the model’s ability to capture intricate binding patterns. Predictive capability is strengthened by a deep learning architecture with attention mechanisms, and an ensemble approach that averages the outputs of five models is used to make predictions robust and reliable.
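Two of the ideas in this statement, atom-level distance features and averaging the outputs of five models, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DAAP implementation: the atom coordinates are made up, the selection of donor-acceptor, hydrophobic, and \(\pi\)-stacking atoms is assumed to have happened already, and the helper names `pairwise_distances` and `ensemble_mean` are hypothetical.

```python
import math


def pairwise_distances(protein_atoms, ligand_atoms):
    """Euclidean distance between every (protein atom, ligand atom) pair.

    Each atom is an (x, y, z) tuple. The result has one row per protein atom
    and one column per ligand atom.
    """
    return [[math.dist(p, l) for l in ligand_atoms] for p in protein_atoms]


def ensemble_mean(predictions_per_model):
    """Average predictions across models.

    predictions_per_model[m][k] is model m's predicted affinity for complex k.
    Returns one averaged prediction per complex.
    """
    n_models = len(predictions_per_model)
    return [sum(values) / n_models for values in zip(*predictions_per_model)]


if __name__ == "__main__":
    # Hypothetical coordinates for pre-selected atoms of one interaction type.
    protein = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
    ligand = [(0.0, 0.0, 0.0), (0.0, 0.0, 5.0)]
    print(pairwise_distances(protein, ligand))

    # Hypothetical outputs of five trained models for two complexes.
    model_outputs = [
        [6.0, 7.0],
        [6.2, 7.2],
        [5.8, 6.8],
        [6.1, 7.1],
        [5.9, 6.9],
    ]
    print(ensemble_mean(model_outputs))
```

Keeping the full distance matrix, as opposed to a thresholded adjacency matrix, preserves long-range pairs alongside short-range ones, which matches the abstract's argument that distances carry both direct and indirect interaction effects.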

Source journal
Journal of Cheminformatics (CHEMISTRY, MULTIDISCIPLINARY; COMPUTER SCIENCE, INFORMATION SYSTEMS)
CiteScore: 14.10
Self-citation rate: 7.00%
Articles published: 82
Review time: 3 months
About the journal: Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling. Coverage includes, but is not limited to: chemical information systems, software, and databases; molecular modelling; chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases; computer and molecular graphics; computer-aided molecular design; expert systems; QSAR; and data mining techniques.