An image-based protein-ligand binding representation learning framework via multi-level flexible dynamics trajectory pre-training.

IF 5.4

Bioinformatics (Oxford, England) Pub Date : 2025-10-02 DOI:10.1093/bioinformatics/btaf535

Hongxin Xiang, Mingquan Liu, Linlin Hou, Shuting Jin, Jianmin Wang, Jun Xia, Wenjie Du, Sisi Yuan, Xiangzheng Fu, Xinyu Yang, Li Zeng, Lei Xu


Citations: 0

Abstract

Motivation: Accurate prediction of protein-ligand binding (PLB) relationships plays a crucial role in drug discovery because it helps identify drugs that modulate the activity of specific targets. Traditional biological assays for measuring PLB relationships are time-consuming and costly. Computational models for predicting PLB relationships have therefore been developed and are widely used in drug discovery tasks. However, more accurate PLB representations are still needed to meet the stringent standards required for drug discovery.

Results: We propose an image-based PLB representation learning framework, called ImagePLB. It uses a ligand representation learner (LRL) and a protein representation learner (PRL) that accept 3D multi-view ligand images and protein graphs as input, respectively, and it learns rich interaction information between ligand and protein through a binding representation learner (BRL). Because protein-ligand pairs are scarce, we further propose a multi-level next trajectory prediction (MLNTP) task that pre-trains ImagePLB on the 4D flexible dynamics trajectories of 16 972 complexes at the ligand, protein, and complex levels, so that the model learns trajectory-related information. In addition, we introduce trajectory regularization (TR) to alleviate the high (even almost identical) feature similarity caused by adjacent trajectories. The proposed pre-training strategies (MLNTP and TR) further improve the performance of ImagePLB. Compared with current state-of-the-art methods, ImagePLB achieves competitive improvements on PLB-related prediction tasks, including protein-ligand affinity and efficacy prediction. This study opens the door to the image-based PLB learning paradigm.
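
The abstract does not give the form of the MLNTP objective or of the TR term, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes that each level (ligand, protein, complex) yields one feature vector per trajectory frame, that next-trajectory prediction is scored with a mean squared error, and that the regularizer is a hinge penalty on the cosine similarity of adjacent frames. The function names, the `margin` threshold, and the `weight` coefficient are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def next_frame_loss(pred, target):
    """Assumed prediction loss: mean squared error between two feature vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def adjacent_similarity_penalty(frames, margin=0.9):
    """Assumed regularizer: penalize adjacent frames whose cosine similarity exceeds margin."""
    penalties = [max(0.0, cosine(a, b) - margin) for a, b in zip(frames, frames[1:])]
    return sum(penalties) / len(penalties)


def multi_level_loss(levels, predictions, weight=0.1):
    """Sum the prediction loss and the weighted penalty over all levels.

    levels:      dict mapping a level name to its list of per-frame feature vectors
    predictions: dict mapping a level name to predicted features for frames[1:]
    """
    total = 0.0
    for name, frames in levels.items():
        preds = predictions[name]
        ntp = sum(next_frame_loss(p, t) for p, t in zip(preds, frames[1:])) / len(preds)
        total += ntp + weight * adjacent_similarity_penalty(frames)
    return total


# Toy usage with hypothetical 2-dimensional features and two frames per level.
levels = {
    "ligand": [[1.0, 0.0], [0.0, 1.0]],
    "protein": [[1.0, 0.0], [1.0, 0.0]],
    "complex": [[1.0, 1.0], [1.0, 0.0]],
}
predictions = {name: [frames[1]] for name, frames in levels.items()}
loss = multi_level_loss(levels, predictions)
```

In this toy example only the protein level is penalized, because its two frames have identical features; that near-duplicate situation is the one the abstract says TR is meant to alleviate.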

Availability and implementation: All data and code, with implementation details, are available at https://github.com/HongxinXiang/ImagePLB.

