{"title":"Exploiting Pre-trained Models for Drug Target Affinity Prediction with Nearest Neighbors","authors":"Qizhi Pei, Lijun Wu, Zhenyu He, Jinhua Zhu, Yingce Xia, Shufang Xie, Rui Yan","doi":"arxiv-2407.15202","DOIUrl":null,"url":null,"abstract":"Drug-Target binding Affinity (DTA) prediction is essential for drug\ndiscovery. Despite the application of deep learning methods to DTA prediction,\nthe achieved accuracy remain suboptimal. In this work, inspired by the recent\nsuccess of retrieval methods, we propose $k$NN-DTA, a non-parametric\nembedding-based retrieval method adopted on a pre-trained DTA prediction model,\nwhich can extend the power of the DTA model with no or negligible cost.\nDifferent from existing methods, we introduce two neighbor aggregation ways\nfrom both embedding space and label space that are integrated into a unified\nframework. Specifically, we propose a \\emph{label aggregation} with\n\\emph{pair-wise retrieval} and a \\emph{representation aggregation} with\n\\emph{point-wise retrieval} of the nearest neighbors. This method executes in\nthe inference phase and can efficiently boost the DTA prediction performance\nwith no training cost. In addition, we propose an extension, Ada-$k$NN-DTA, an\ninstance-wise and adaptive aggregation with lightweight learning. Results on\nfour benchmark datasets show that $k$NN-DTA brings significant improvements,\noutperforming previous state-of-the-art (SOTA) results, e.g, on BindingDB\nIC$_{50}$ and $K_i$ testbeds, $k$NN-DTA obtains new records of RMSE\n$\\bf{0.684}$ and $\\bf{0.750}$. The extended Ada-$k$NN-DTA further improves the\nperformance to be $\\bf{0.675}$ and $\\bf{0.735}$ RMSE. These results strongly\nprove the effectiveness of our method. Results in other settings and\ncomprehensive studies/analyses also show the great potential of our $k$NN-DTA\napproach.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"8 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - QuanBio - Biomolecules","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2407.15202","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Drug-Target binding Affinity (DTA) prediction is essential for drug discovery. Despite the application of deep learning methods to DTA prediction, the achieved accuracy remains suboptimal. In this work, inspired by the recent success of retrieval methods, we propose $k$NN-DTA, a non-parametric, embedding-based retrieval method built on top of a pre-trained DTA prediction model, which extends the power of the DTA model at no or negligible cost.
Different from existing methods, we introduce two neighbor aggregation schemes, one in the embedding space and one in the label space, integrated into a unified framework. Specifically, we propose \emph{label aggregation} with \emph{pair-wise retrieval} and \emph{representation aggregation} with \emph{point-wise retrieval} of the nearest neighbors. The method runs purely at inference time and can efficiently boost DTA prediction performance with no training cost. In addition, we propose an extension, Ada-$k$NN-DTA, which performs instance-wise, adaptive aggregation with lightweight learning. Results on
four benchmark datasets show that $k$NN-DTA brings significant improvements, outperforming previous state-of-the-art (SOTA) results; e.g., on the BindingDB IC$_{50}$ and $K_i$ testbeds, $k$NN-DTA sets new records of RMSE $\bf{0.684}$ and $\bf{0.750}$. The extended Ada-$k$NN-DTA further improves these to RMSE $\bf{0.675}$ and $\bf{0.735}$. These results strongly demonstrate the effectiveness of our method. Results in other settings and comprehensive studies and analyses also show the great potential of our $k$NN-DTA approach.
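
To make the retrieval-and-aggregation idea concrete, below is a minimal sketch of $k$NN label aggregation over an embedding-space datastore, assuming pre-computed embeddings and measured affinities for the training drug-target pairs. The function name, the Euclidean distance, the softmax temperature, and the interpolation weight `lam` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def knn_label_aggregation(query_emb, model_pred, datastore_embs, datastore_labels,
                          k=8, temperature=1.0, lam=0.5):
    """Blend a model's DTA prediction with labels of nearby training pairs.

    query_emb        : (d,) embedding of the query drug-target pair
    model_pred       : scalar affinity predicted by the pre-trained model
    datastore_embs   : (N, d) embeddings of training drug-target pairs
    datastore_labels : (N,) measured affinities for those pairs
    k, temperature, lam : illustrative hyper-parameters (assumptions)
    """
    # Euclidean distance from the query to every stored pair embedding.
    dists = np.linalg.norm(datastore_embs - query_emb, axis=1)

    # Indices of the k nearest neighbors.
    nn_idx = np.argsort(dists)[:k]

    # Softmax weights over negative distances: closer neighbors count more.
    logits = -dists[nn_idx] / temperature
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()

    # Weighted average of the neighbors' affinity labels.
    knn_pred = float(np.dot(weights, datastore_labels[nn_idx]))

    # Interpolate the retrieval estimate with the model's own prediction.
    return lam * knn_pred + (1.0 - lam) * model_pred


# Toy usage with random embeddings standing in for a real datastore.
rng = np.random.default_rng(0)
store = rng.normal(size=(1000, 128))
labels = rng.normal(loc=6.0, scale=1.5, size=1000)   # pIC50-like values
query = rng.normal(size=128)
print(knn_label_aggregation(query, model_pred=6.2,
                            datastore_embs=store, datastore_labels=labels))
```

In the same spirit, the representation aggregation branch would presumably retrieve point-wise neighbors (per drug and per target rather than per pair) and aggregate their embeddings before the prediction head, instead of averaging labels; since this runs only at inference, neither branch requires retraining the underlying model.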