Lenin Domínguez-Ramírez, Maricruz Anaya-Ruiz, Paulina Cortés-Hernández
Quality over quantity: how to get the best results when using docking for repurposing.
Frontiers in bioinformatics, volume 5, article 1536504. Published 2025-05-26.
DOI: https://doi.org/10.3389/fbinf.2025.1536504
Abstract
Molecular docking is among the fastest and most readily available computational tools for exploring protein-ligand interactions. However, little effort has been put into assessing the quality of its results. In this paper, we compared eight freely licensed docking programs by screening a drug library against a human target, phosphodiesterase 5A (PDE5A), to evaluate their ability to find its known ligand, sildenafil, and other ligands that became erectile dysfunction drugs because they inhibit this target. GNINA was superior at identifying the known ligands because it offers a convolutional neural network (CNN) score that ranks the quality of docking results. Using this CNN score improved the ranking of known positives. Receiver operating characteristic (ROC) analysis revealed that all docking suites lack specificity; that is, they often misclassify true negatives as binders. Applying a CNN score cutoff before ranking by docking affinity raised specificity with a small loss in sensitivity. After the cutoff, datasets became smaller but of higher quality. We propose a heuristic to produce relevant docking results, which includes an overall evaluation of docking performance on the target through ROC analysis and improved selection of candidate binders using a CNN score cutoff of 0.9.
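The filter-then-rank step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Pose` record, the function names, the inclusive `>=` comparison, the one-pose-per-ligand layout, and the convention that a more negative affinity means stronger predicted binding are all assumptions made for illustration, and the toy values are invented, not results from the paper. Only the 0.9 cutoff and the order of operations (cutoff first, then rank by affinity) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    """Hypothetical record for the best docked pose of one ligand."""
    ligand: str
    affinity: float   # docking affinity; assumed: more negative = better
    cnn_score: float  # pose-quality score in [0, 1]


def select_candidates(poses, cnn_cutoff=0.9):
    """Drop poses below the CNN score cutoff, then rank the rest by affinity."""
    kept = [p for p in poses if p.cnn_score >= cnn_cutoff]  # assumed inclusive
    return sorted(kept, key=lambda p: p.affinity)


def sensitivity_specificity(poses, selected, known_binders):
    """Treat every selected ligand as a predicted binder and score the call."""
    chosen = {p.ligand for p in selected}
    screened = {p.ligand for p in poses}
    positives = screened & set(known_binders)
    negatives = screened - set(known_binders)
    true_pos = len(chosen & positives)
    true_neg = len(negatives - chosen)
    sensitivity = true_pos / len(positives) if positives else float("nan")
    specificity = true_neg / len(negatives) if negatives else float("nan")
    return sensitivity, specificity


# Invented toy library: three known binders and four non-binders.
toy_poses = [
    Pose("known_1", -9.1, 0.95),
    Pose("known_2", -8.4, 0.92),
    Pose("known_3", -8.8, 0.71),
    Pose("decoy_1", -10.2, 0.40),
    Pose("decoy_2", -9.5, 0.93),
    Pose("decoy_3", -7.0, 0.55),
    Pose("decoy_4", -6.1, 0.30),
]
toy_known = {"known_1", "known_2", "known_3"}

ranked = select_candidates(toy_poses, cnn_cutoff=0.9)
sens, spec = sensitivity_specificity(toy_poses, ranked, toy_known)
print([p.ligand for p in ranked], sens, spec)
```

In this toy set, ranking by affinity alone would put `decoy_1` first; the cutoff removes it along with one known binder, which mirrors the trade-off the abstract reports: a smaller list, higher specificity, and a small loss in sensitivity.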