Madhav V. Samudrala, Somanath Dandibhotla, Arjun Kaneriya, and Sivanesan Dakshanamurthy*
ACS Bio & Med Chem Au, vol. 5, no. 3, pp. 447–463. Published 2025-04-29. DOI: 10.1021/acsbiomedchemau.5c00053. https://pubs.acs.org/doi/10.1021/acsbiomedchemau.5c00053
PLAIG: Protein–Ligand Binding Affinity Prediction Using a Novel Interaction-Based Graph Neural Network Framework
Rapid prediction of protein–ligand binding affinity is important in the drug discovery process. The advent of machine learning methods has increased the speed of these predictions. Previous machine learning models based on structural, sequence, and interaction-based approaches have shown potential but often memorize training data because of incomplete feature representations, which leads to poor generalization on external complexes. To address this challenge, we developed PLAIG, a Graph Neural Network (GNN)-based machine learning framework for generalized binding affinity prediction. PLAIG represents binding complexes as graphs, integrating protein–ligand interactions and molecular topology to capture both interaction and structural features. To reduce overfitting, we tested principal component analysis (PCA) and ensemble learning with a stacking regressor. During benchmarking, PLAIG achieved a Pearson correlation coefficient (PCC) of 0.78 on 4852 complexes from the PDBbind v.2019 refined set and 0.82 on 285 complexes from the v.2016 core set, outperforming many existing models. External validation on the DUDE-Z data set demonstrated its ability to differentiate active ligands from decoys, achieving an average AUC of 0.69 and a maximum AUC of 0.89. To enrich de novo prediction capabilities for subsequent model versions, PLAIG was hybridized with sequence- and structure-based models. The hybrid models achieved an average PCC of 0.88 on well-known drug–target complexes, with the best reaching a PCC of 0.98. Future work will explicitly incorporate a docking methodology into PLAIG's pipeline and assess its performance on de novo ligands. PLAIG is freely available at https://plaig-demo.streamlit.app/.
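The two metrics quoted in the abstract can be illustrated with a small sketch. It is not taken from PLAIG's code: it assumes that PCC is the Pearson correlation coefficient between predicted and measured affinities and that AUC is the area under the ROC curve for ranking actives above decoys. The function names `pearson_cc` and `roc_auc` and all numbers are hypothetical, chosen only for illustration.

```python
from math import sqrt


def pearson_cc(predicted, measured):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    # Covariance and variances (unnormalised; the 1/n factors cancel).
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_p * var_m)


def roc_auc(scores, is_active):
    """ROC AUC as the fraction of (active, decoy) pairs in which the active
    scores higher; tied pairs count as half."""
    actives = [s for s, a in zip(scores, is_active) if a]
    decoys = [s for s, a in zip(scores, is_active) if not a]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


# Made-up affinities (e.g. pK values), not data from the paper.
predicted = [6.1, 7.4, 5.0, 8.2, 4.3]
measured = [5.8, 7.9, 5.5, 8.0, 4.0]
print("PCC:", round(pearson_cc(predicted, measured), 3))

# Made-up screening scores: True marks an active ligand, False a decoy.
scores = [8.1, 7.2, 6.5, 6.0, 5.1, 4.4]
labels = [True, True, False, True, False, False]
print("AUC:", round(roc_auc(scores, labels), 3))
```

In the screening example, eight of the nine active-decoy pairs are ranked correctly, so the AUC is 8/9. An AUC of 0.5 corresponds to random ranking, which gives context for the reported average of 0.69 and maximum of 0.89.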
About the journal:
ACS Bio & Med Chem Au is a broad-scope, open access journal that publishes short letters, comprehensive articles, reviews, and perspectives in all aspects of biological and medicinal chemistry. Studies providing fundamental insights or describing novel syntheses, as well as clinical or other applications-based work, are welcomed.
This broad scope includes experimental and theoretical studies on the chemical, physical, mechanistic, and/or structural basis of biological or cell function in all domains of life. It encompasses the fields of chemical biology, synthetic biology, disease biology, cell biology, agriculture and food, natural products research, nucleic acid biology, neuroscience, structural biology, and biophysics.
The journal publishes studies that pertain to a broad range of medicinal chemistry, including compound design and optimization, biological evaluation, molecular mechanistic understanding of drug delivery and drug delivery systems, imaging agents, and pharmacology and translational science of both small and large bioactive molecules. Novel computational, cheminformatics, and structural studies for the identification (or structure-activity relationship analysis) of bioactive molecules, ligands, and their targets are also welcome. The journal will consider computational studies applying established computational methods, but only in combination with novel and original experimental data (e.g., in cases where new compounds have been designed and tested).
Also included in the scope of the journal are articles relating to infectious diseases research on pathogens, host-pathogen interactions, therapeutics, diagnostics, vaccines, drug-delivery systems, and other biomedical technology development pertaining to infectious diseases.