Ivan Khokhlov, Anna Tashchilova, Nikolai Bugaev-Makarovskiy, Olga Glushkova, Vladimir Yudin, Anton Keskinov, Sergey Yudin, Dmitry Svetlichnyy, Veronika Skvortsova
Published in: Computational and Structural Biotechnology Journal, volume 27, pages 4106-4120 (eCollection). Publication date: 2025-09-18. Publication type: Journal Article.
DOI: https://doi.org/10.1016/j.csbj.2025.09.023
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12495441/pdf/
Journal impact factor: 4.1; JCR Q2 (Biochemistry & Molecular Biology).
Citations: 0
DrugForm-DTA: Towards real-world drug-target binding affinity model.
Drug-target affinity (DTA) prediction is a fundamental challenge in drug discovery. Computational methods for predicting DTA can greatly assist drug design by narrowing the search space and reducing the number of protein-ligand complexes with low affinity. Current DTA approaches often do not require three-dimensional (3D) structural information of proteins, which is frequently unavailable. In this study, we present the DrugForm-DTA model, which uses only structure-less representations of ligand and protein. It is a Transformer-based neural network with protein encoding based on ESM-2 and small-molecule ligand encoding obtained with Chemformer. We evaluated the model on the standard benchmarks Davis and KIBA, and revealed superior performance of DrugForm-DTA, with the best result for KIBA. Moreover, we developed a ready-to-use model trained on the BindingDB dataset, which was subjected to high-quality filtering and transformation. Overall, our method predicts drug-target affinity values with a confidence level comparable to that of a single in vitro experiment. We also compared DrugForm-DTA against molecular modeling methods and revealed higher efficacy of the developed model for drug-target affinity predictions. Our investigation provides a high-accuracy neural network model with performance comparable to that of experimental measurements and a filtered and reassessed BindingDB dataset for further usage, and it demonstrates the outstanding applicability of the proposed method for DTA prediction.
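The abstract reports benchmark results on Davis and KIBA without naming the metrics. As a rough illustration only, the sketch below computes two scores that are commonly reported for DTA regression benchmarks: mean squared error and the concordance index. Choosing these two metrics is an assumption, not something stated by the authors, and the function names and sample values are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average squared difference between measured and predicted affinities."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Fraction of comparable pairs that the predictions rank in the same order.

    Pairs whose measured affinities are tied are skipped; a tie in the
    predictions for a comparable pair counts as half a concordant pair.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must be of equal length")
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(y_true)), 2):
        true_diff = y_true[i] - y_true[j]
        if true_diff == 0:
            continue  # tied measurements carry no ranking information
        comparable += 1
        pred_diff = y_pred[i] - y_pred[j]
        if pred_diff == 0:
            concordant += 0.5
        elif (true_diff > 0) == (pred_diff > 0):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs (all measured values are tied)")
    return concordant / comparable


# Hypothetical affinity values, made up for illustration (not from the paper).
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.1, 5.9, 7.2, 7.8]
mse = mean_squared_error(measured, predicted)
ci = concordance_index(measured, predicted)
```

The pairwise loop is quadratic in the number of samples, which is fine for a small check like this; a full benchmark run would typically use a vectorized or library implementation.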
About the journal:
Computational and Structural Biotechnology Journal (CSBJ) is an online gold open access journal publishing research articles and reviews after full peer review. All articles are published, without barriers to access, immediately upon acceptance. The journal places a strong emphasis on functional and mechanistic understanding of how molecular components in a biological process work together through the application of computational methods. Structural data may provide such insights, but they are not a pre-requisite for publication in the journal. Specific areas of interest include, but are not limited to:
Structure and function of proteins, nucleic acids and other macromolecules
Structure and function of multi-component complexes
Protein folding, processing and degradation
Enzymology
Computational and structural studies of plant systems
Microbial Informatics
Genomics
Proteomics
Metabolomics
Algorithms and Hypothesis in Bioinformatics
Mathematical and Theoretical Biology
Computational Chemistry and Drug Discovery
Microscopy and Molecular Imaging
Nanotechnology
Systems and Synthetic Biology