AMCF-RDP: a self-attention-based multi-source and cascade framework for the identification of drug-protein relationships.
Authors: Zhanchao Li, Xiaoyu Li, Xiuli Tang, Yan Wang
Journal: Molecular Diversity (impact factor 3.8; JCR Q2, Chemistry, Applied)
Published: 2025-08-27
DOI: 10.1007/s11030-025-11337-w (https://doi.org/10.1007/s11030-025-11337-w)
Identifying relationships between drugs and proteins supports both the study of pathological mechanisms and drug repositioning. However, conventional wet-lab methods are often time-consuming, labour-intensive, and of low accuracy. A computational method is therefore needed to identify drug-protein relationships quickly and precisely. In this study, a self-attention-based multi-source and cascade framework (AMCF-RDP) is developed to identify drug-protein relationships. Embedded features and network topology features derived from a knowledge graph and a complex network were used to characterize the drug-protein relationships. A two-layer model built from an attention mechanism and fully connected layers was used to predict whether a drug interacts with a protein and which type of interaction it is. The method was evaluated on non-redundant datasets, through ablation experiments, and by comparison with machine learning algorithms and other state-of-the-art methods. In fivefold cross-validation, the method recognized drug-protein interactions quickly and accurately, with an accuracy of 90.21%, a sensitivity of 90.35%, and a Matthews correlation coefficient of 0.8043. It also distinguished the types of drug-protein interaction, achieving a macro-recall of 93.43% and a macro-F1 score of 0.9381. Compared to the methods described in the literature, the proposed method achieved an area under the receiver operating characteristic curve of 0.9176, representing an improvement of 0.4746. A total of 100,000 drug-protein associations were identified, some of which were confirmed through molecular docking, KEGG, and gene ontology analyses. AMCF-RDP has been demonstrated to significantly improve the identification of drug-protein relationships. It is anticipated to serve as a valuable tool in drug development and the investigation of mechanisms of action.
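The abstract reports accuracy, sensitivity, and the Matthews correlation coefficient for the interaction/no-interaction stage, and macro-recall and macro-F1 for the interaction-type stage. As a reading aid, the sketch below shows how such metrics are commonly computed from predictions. It is not the authors' evaluation code: the function names, the confusion-matrix counts, and the interaction-type labels are hypothetical, and macro-F1 is assumed here to be the unweighted mean of per-class F1 scores (one common convention; the abstract does not say which one was used).

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, MCC) from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Sensitivity is the recall of the positive (interacting) class.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, sensitivity, mcc


def macro_recall_f1(y_true, y_pred):
    """Return (macro-recall, macro-F1) as unweighted means over all classes."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    recalls, f1s = [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        recalls.append(recall)
        f1s.append(f1)
    return sum(recalls) / len(labels), sum(f1s) / len(labels)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
acc, sens, mcc = binary_metrics(tp=90, fp=10, tn=90, fn=10)

# Hypothetical interaction-type labels for four drug-protein pairs.
y_true = ["inhibitor", "inhibitor", "agonist", "agonist"]
y_pred = ["inhibitor", "agonist", "agonist", "agonist"]
macro_rec, macro_f1 = macro_recall_f1(y_true, y_pred)
```

With the hypothetical counts above, accuracy and sensitivity are both 0.9 and the MCC is 0.8. MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when the two classes may be imbalanced.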
About the journal:
Molecular Diversity is a new publication forum for the rapid publication of refereed papers dedicated to describing the development, application and theory of molecular diversity and combinatorial chemistry in basic and applied research and drug discovery. The journal publishes both short and full papers, perspectives, news and reviews dealing with all aspects of the generation of molecular diversity, application of diversity for screening against alternative targets of all types (biological, biophysical, technological), analysis of results obtained and their application in various scientific disciplines/approaches including:
combinatorial chemistry and parallel synthesis;
small molecule libraries;
microwave synthesis;
flow synthesis;
fluorous synthesis;
diversity oriented synthesis (DOS);
nanoreactors;
click chemistry;
multiplex technologies;
fragment- and ligand-based design;
structure/function/SAR;
computational chemistry and molecular design;
chemoinformatics;
screening techniques and screening interfaces;
analytical and purification methods;
robotics, automation and miniaturization;
targeted libraries;
display libraries;
peptides and peptoids;
proteins;
oligonucleotides;
carbohydrates;
natural diversity;
new methods of library formulation and deconvolution;
directed evolution, origin of life and recombination;
search techniques, landscapes, random chemistry and more.