Incorporating Neighboring Protein Features for Enhanced Drug–Target Interaction Prediction: A Comparative Analysis of Similarity-Based Alignment Methods


Journal of Chemical Information and Modeling, vol. 65, no. 14, pp. 7701–7711. Published 2025-07-04. DOI: 10.1021/acs.jcim.5c00979

Xiaoqing Ru, Chao Zha, and Xin Gao*



Abstract

Drug–target interaction (DTI) prediction is a fundamental computational task in drug discovery. Despite recent advancements, existing approaches often suffer from data sparsity and fail to capture the intricate nature of molecular interactions, limiting predictive performance. To address these challenges, we propose a novel DTI prediction framework that enhances both accuracy and interpretability by incorporating features from highly similar protein neighbors. Our framework extracts chemical and physicochemical features from drug–target binding affinity data and integrates interaction features from highly similar protein neighbors to enrich representation. To identify these neighbors, we employ a range of protein similarity alignment algorithms, including BLAST, MUSCLE, MAFFT, Clustal Omega and Foldseek. Experiments on the Davis and KIBA data sets demonstrate that incorporating features from high-similarity neighbors substantially improves prediction accuracy. Further analysis reveals that top-ranked neighbors contribute the most to performance gains, underscoring the importance of similarity-based feature augmentation. Additionally, comparisons among alignment methods highlight their robustness in neighbor selection, and case studies confirm the biological relevance of shared targets among closely related proteins. Overall, our framework presents a novel solution to data sparsity, improves predictive performance, and enhances model interpretability. This work lays a solid foundation for precise DTI prediction and provides valuable insights for advancing computational methods in drug discovery.
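The neighbor-augmentation idea in the abstract can be pictured with the minimal sketch below. It is not the authors' implementation: the abstract does not specify how neighbor features are encoded, so the (similarity, affinity, mask) layout, the function names `top_k_neighbors` and `neighbor_features`, and all numeric values are illustrative assumptions. The similarity scores are assumed to have been precomputed by some alignment tool (for example BLAST or Foldseek) and normalized to the range 0 to 1.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: pairwise protein similarity (assumed precomputed by
# an alignment tool) and known drug-protein binding affinities.
SIMILARITY: Dict[str, Dict[str, float]] = {
    "P1": {"P1": 1.0, "P2": 0.9, "P3": 0.4},
    "P2": {"P1": 0.9, "P2": 1.0, "P3": 0.5},
    "P3": {"P1": 0.4, "P2": 0.5, "P3": 1.0},
}
AFFINITIES: Dict[Tuple[str, str], float] = {
    ("D1", "P2"): 7.2,
    ("D1", "P3"): 5.0,
    ("D2", "P2"): 6.1,
}


def top_k_neighbors(
    similarity: Dict[str, Dict[str, float]], protein: str, k: int
) -> List[Tuple[str, float]]:
    """Return up to k proteins most similar to `protein`, best first."""
    scored = [
        (other, score)
        for other, score in similarity[protein].items()
        if other != protein  # a protein is not its own neighbor
    ]
    # Sort by descending similarity; break ties by name for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]


def neighbor_features(
    drug: str,
    protein: str,
    similarity: Dict[str, Dict[str, float]],
    affinities: Dict[Tuple[str, str], float],
    k: int = 2,
    missing: float = 0.0,
) -> List[float]:
    """Build a fixed-length vector of 3 * k neighbor features.

    For each neighbor: [similarity, known affinity with `drug`, mask],
    where mask is 1.0 if the affinity is known and 0.0 otherwise.
    """
    neighbors = top_k_neighbors(similarity, protein, k)
    features: List[float] = []
    for neighbor, score in neighbors:
        known = affinities.get((drug, neighbor))
        if known is None:
            features.extend([score, missing, 0.0])
        else:
            features.extend([score, known, 1.0])
    # Pad so the vector length stays 3 * k when fewer neighbors exist.
    features.extend([0.0, missing, 0.0] * (k - len(neighbors)))
    return features


if __name__ == "__main__":
    print(neighbor_features("D1", "P1", SIMILARITY, AFFINITIES, k=2))
    print(neighbor_features("D2", "P1", SIMILARITY, AFFINITIES, k=2))
```

In a sketch like this, the resulting vector would be concatenated with the drug and protein descriptors before being passed to a regressor. If the approach were evaluated on data such as Davis or KIBA, the affinity table used for neighbor lookups would need to be restricted to the training split so that held-out labels do not leak into the features.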




