DataDTA: a multi-feature and dual-interaction aggregation framework for drug-target binding affinity prediction.

IF 5.4, Zone 3 (Biology), Q1 BIOCHEMICAL RESEARCH METHODS

Bioinformatics Pub Date : 2023-09-02 DOI:10.1093/bioinformatics/btad560

Yan Zhu, Lingling Zhao, Naifeng Wen, Junjie Wang, Chunyu Wang


Citations: 0

Abstract


DataDTA: a multi-feature and dual-interaction aggregation framework for drug-target binding affinity prediction.


Motivation: Accurate prediction of drug-target binding affinity (DTA) is crucial for drug discovery. The growing number of published large-scale DTA datasets has enabled the development of various computational methods for DTA prediction. Numerous deep learning-based methods have been proposed to predict affinities, some of which utilize only original sequence information or complex structures, but the effective combination of multiple information sources with protein-binding pockets has not been fully explored. Therefore, a new method that integrates the available key information is urgently needed to predict DTA and accelerate the drug discovery process.

Results: In this study, we propose a novel deep learning-based predictor termed DataDTA to estimate the affinities of drug-target pairs. DataDTA takes as inputs descriptors of predicted pockets and sequences of proteins, together with low-dimensional molecular features and SMILES strings of compounds. Specifically, the pockets were predicted from the three-dimensional structures of proteins, and their descriptors were extracted as part of the input features for DTA prediction. The molecular representation of compounds based on algebraic graph features was collected to supplement the input information of targets. Furthermore, to ensure effective learning of multiscale interaction features, a dual-interaction aggregation neural network strategy was developed. DataDTA was compared with state-of-the-art methods on different datasets, and the results showed that DataDTA is a reliable prediction tool for affinity estimation. Specifically, on the test dataset DataDTA achieves a concordance index (CI) of 0.806 and a Pearson correlation coefficient (R) of 0.814, both higher than those of the other methods.

Availability and implementation: The code and datasets of DataDTA are available at https://github.com/YanZhu06/DataDTA.
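For readers unfamiliar with the two metrics reported in the Results, the following is a minimal sketch of how a concordance index (CI) and a Pearson correlation coefficient (R) are commonly computed for affinity predictions. It uses the standard textbook definitions and is not taken from the DataDTA repository; the function names `concordance_index` and `pearson_r` and the toy affinity values are illustrative assumptions.

```python
import math
from itertools import combinations


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    A pair is comparable when its two true affinities differ. Tied
    predictions on a comparable pair count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue  # equal true affinities carry no ranking information
        comparable += 1
        agreement = (t_i - t_j) * (p_i - p_j)
        if agreement > 0:
            concordant += 1.0
        elif agreement == 0:
            concordant += 0.5  # predictions are tied
    if comparable == 0:
        raise ValueError("no comparable pairs: all true affinities are equal")
    return concordant / comparable


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between true and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


if __name__ == "__main__":
    # Hypothetical toy affinities (not data from the paper).
    true_affinity = [5.0, 6.0, 7.0, 8.0]
    predicted_affinity = [5.1, 6.3, 6.2, 8.4]
    print("CI:", concordance_index(true_affinity, predicted_affinity))
    print("R: ", pearson_r(true_affinity, predicted_affinity))
```

CI depends only on the ranking of the predictions, whereas R measures how linearly the predicted values track the true ones, which is why the two are often reported together. The pairwise loop is quadratic in the number of samples, so a vectorized or library implementation would be preferable for large test sets.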


Source journal: Bioinformatics (Biology, Biochemical Research Methods)

CiteScore: 11.20

Self-citation rate: 5.20%

Articles published: 753

Review time: 2.1 months

Journal description: The leading journal in its field, Bioinformatics publishes the highest quality scientific papers and review articles of interest to academic and industrial researchers. Its main focus is on new developments in genome bioinformatics and computational biology. Two distinct sections within the journal, Discovery Notes and Application Notes, focus on shorter papers; the former reports biologically interesting discoveries made using computational methods, and the latter explores the applications used for experiments.