Ali Khodabandeh Yalabadi, Mehdi Yazdani-Jahromi, Ozlem Ozmen Garibay
BoKDiff: best-of-K diffusion alignment for target-specific 3D molecule generation
Bioinformatics Advances, 5(1), vbaf137. Published 2025-06-10. DOI: https://doi.org/10.1093/bioadv/vbaf137
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12228967/pdf/
Abstract
Motivation: Structure-based drug design (SBDD) leverages the 3D structure of target proteins to guide therapeutic development. While generative models like diffusion models and geometric deep learning show promise in ligand design, challenges such as limited protein-ligand data and poor alignment reduce their effectiveness. We introduce BoKDiff, a domain-adapted framework inspired by alignment strategies in large language and vision models that combines multi-objective optimization with Best-of-K alignment to enhance ligand generation.
Results: Built on DecompDiff, BoKDiff generates diverse ligands and ranks them using a weighted score based on quantitative estimate of drug-likeness (QED), synthetic accessibility (SA), and docking metrics. To overcome alignment issues, we reposition each ligand's center of mass to match its docking pose, enabling more accurate sub-component extraction. We further incorporate a Best-of-N (BoN) sampling strategy to select optimal candidates without model fine-tuning. BoN achieves QED > 0.6, SA > 0.75, and a success rate above 35%. BoKDiff outperforms prior models on the CrossDocked2020 dataset, with an average docking score of -8.58 and a 26% valid molecule generation rate. This is the first study to integrate Best-of-K alignment and BoN sampling into SBDD, demonstrating their potential for practical, high-quality ligand design.
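The ranking and repositioning steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Candidate` class, the function names, the equal default weights, and the `dock_scale` normalization of the docking score are all assumptions made for illustration, since the abstract does not give the exact weighting scheme.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container for one generated ligand's metrics."""
    name: str
    qed: float      # drug-likeness in [0, 1], higher is better
    sa: float       # normalized synthetic accessibility in [0, 1], higher is better
    docking: float  # docking score, more negative is better


def weighted_score(c: Candidate, w_qed: float = 1.0, w_sa: float = 1.0,
                   w_dock: float = 1.0, dock_scale: float = -12.0) -> float:
    """Combine the three metrics into one score (assumed weighting scheme)."""
    # Assumption: rescale docking to roughly [0, 1] so the terms are comparable.
    dock_term = min(max(c.docking / dock_scale, 0.0), 1.0)
    total_weight = w_qed + w_sa + w_dock
    return (w_qed * c.qed + w_sa * c.sa + w_dock * dock_term) / total_weight


def best_of_n(candidates: Sequence[Candidate], **weights: float) -> Candidate:
    """Best-of-N selection: keep the sampled candidate with the highest score."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=lambda c: weighted_score(c, **weights))


def shift_to_center_of_mass(coords: Sequence[Vec3], masses: Sequence[float],
                            target_com: Vec3) -> List[Vec3]:
    """Rigidly translate atoms so their center of mass lands on target_com."""
    total = sum(masses)
    com = tuple(sum(m * xyz[i] for m, xyz in zip(masses, coords)) / total
                for i in range(3))
    delta = tuple(t - c for t, c in zip(target_com, com))
    return [tuple(x + d for x, d in zip(xyz, delta)) for xyz in coords]


# Toy metric values, invented purely for illustration.
samples = [
    Candidate("a", qed=0.7, sa=0.8, docking=-8.5),
    Candidate("b", qed=0.9, sa=0.4, docking=-6.0),
    Candidate("c", qed=0.5, sa=0.9, docking=-9.0),
]
chosen = best_of_n(samples)
```

Because selection happens after sampling, changing the weights changes which candidate is kept without retraining the generator, which matches the abstract's description of BoN as a strategy that needs no model fine-tuning.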
Availability and implementation: Code is available at https://github.com/khodabandeh-ali/BoKDiff.git.