TC-DTA: predicting drug-target binding affinity with transformer and convolutional neural networks.
Xiwei Tang, Yiqiang Zhou, Mengyun Yang, Wenjun Li
IEEE Transactions on NanoBioscience, published 2024-08-12. DOI: 10.1109/TNB.2024.3441590 (https://doi.org/10.1109/TNB.2024.3441590)
Bioinformatics is a rapidly growing field that applies computational methods to the analysis and interpretation of biological data. One of its important tasks is identifying novel drug-target interactions (DTIs), which is also a key part of the drug discovery process. Most computational methods treat DTI prediction as a binary classification task: predicting whether a drug-target pair interacts. With the growing amount of drug-target binding affinity data in recent years, this classification task can be recast as a regression task, the prediction of drug-target affinity (DTA). DTA reflects the degree of drug-target binding and provides more detailed and specific information than DTI, making it an important tool for drug discovery through virtual screening. Effectively predicting how compounds interact with targets can help speed up the drug discovery process. In this study, we propose TC-DTA, a deep learning model for DTA prediction that combines convolutional neural networks (CNNs) with the encoder module of the Transformer architecture. First, the raw drug SMILES strings and protein amino acid sequences are extracted from the dataset and represented using different encoding methods. We then use a CNN and the Transformer's encoder module to extract feature information from the drug SMILES strings and the protein amino acid sequences, respectively. Finally, the extracted features are concatenated and fed into a multi-layer perceptron that predicts the binding affinity score. We evaluated our model on two benchmark DTA datasets, Davis and KIBA, against methods including KronRLS, SimBoost and DeepDTA. On evaluation metrics such as Mean Squared Error, Concordance Index and the r_m^2 index, TC-DTA outperforms these baseline methods. These results demonstrate that the Transformer's encoder and the CNN extract meaningful representations from sequences, thereby improving the accuracy of DTA prediction. A deep learning model for DTA prediction can accelerate drug discovery by identifying drug candidates with high binding affinity to specific targets. Compared to traditional methods, machine learning allows for a more effective and efficient drug discovery process.
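Two of the evaluation metrics named in the abstract, Mean Squared Error and Concordance Index, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names are hypothetical, and the tie handling (skipping pairs with equal measured affinity, counting tied predictions as half-concordant) is a common convention assumed here rather than something the abstract specifies. The r_m^2 index is left out because its exact formulation is not given in the text.

```python
from itertools import combinations


def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted affinities."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Assumed convention: pairs with equal true affinity are skipped,
    and pairs with tied predictions contribute 0.5.
    """
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue  # not comparable: no true ordering to recover
        comparable += 1
        if p_i == p_j:
            concordant += 0.5
        elif (p_i > p_j) == (t_i > t_j):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Made-up affinity values, for illustration only.
    measured = [5.0, 6.2, 7.1, 8.4]
    predicted = [5.3, 6.0, 7.5, 7.4]
    print("MSE:", mean_squared_error(measured, predicted))
    print("CI:", concordance_index(measured, predicted))
```

On the made-up values above, the MSE is approximately 0.3225 and the Concordance Index is 5/6, because only the last pair is ranked in the wrong order. A lower MSE and a higher Concordance Index both indicate better DTA predictions. The pairwise loop is quadratic in the number of samples, which is fine for a sketch but slow on full benchmark datasets.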
About the journal:
The IEEE Transactions on NanoBioscience reports on original, innovative and interdisciplinary work on all aspects of molecular systems, cellular systems, and tissues (including molecular electronics). The journal covers a broad spectrum of topics, spanning both foundations and applications. Specifically, it covers methods and techniques, experimental aspects, design and implementation, instrumentation and laboratory equipment, clinical aspects, hardware and software data acquisition and analysis, and computer-based modelling (based on traditional or high-performance computing, such as parallel computers or computer networks).