Improving protein-ligand complex generation with force field guidance
Helen Lai, Tingyu Wang, Hassan Sirelkhatim, Joe Eaton, Howard Huang, Brad Rees, Ola Engkvist, Jon Paul Janet, Xiaoyun Wang, Alessandro Tibo
Journal of Cheminformatics, volume 18, issue 1. Published 2026-05-02. DOI: 10.1186/s13321-026-01198-2
https://link.springer.com/article/10.1186/s13321-026-01198-2
Abstract
Generative models based on diffusion and flow matching have recently been applied to structure-based drug design, but their outputs often include unrealistic protein–ligand interactions that do not obey the laws of physics. We present an energy guidance framework that incorporates a molecular mechanics force field (MMFF94) directly into the sampling process. The method steers molecular generation toward more physically plausible and energetically stable conformations without retraining the underlying model. We evaluate this approach using two state-of-the-art architectures, SemlaFlow (a flow matching model) and EDM (a diffusion model), on the PDBBind dataset. Across both models, energy guidance improves enthalpic interaction energy, improves strain energy by up to 75%, and generates over 1000 ligands with better docking scores than native ligands. These results demonstrate that lightweight, physics-based guidance can significantly enhance generative drug design while preserving chemical validity and diversity.
We introduce a novel, training-free force field guidance framework that steers ligand generation using empirical molecular mechanics (e.g., MMFF94) during diffusion or flow-based sampling, without modifying or retraining the base generative model (e.g., EDM or SemlaFlow [24]). Our method operates as a plug-in at inference time, leveraging energy feedback to generate poses with lower strain and better predicted interactions with the protein structure.
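The idea of steering sampling with energy feedback can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy harmonic bond energy in place of MMFF94, a fixed straight-line velocity field in place of a trained model such as SemlaFlow or EDM, and plain Euler integration. All function names, the guidance scale, and the two-atom example are hypothetical.

```python
import numpy as np


def bond_energy(x, r0=1.5, k=1.0):
    """Toy harmonic energy over consecutive atom pairs (stand-in for a force field).

    x has shape (n_atoms, 3); r0 is the assumed equilibrium distance.
    """
    r = np.linalg.norm(x[1:] - x[:-1], axis=1)
    return 0.5 * k * float(np.sum((r - r0) ** 2))


def bond_energy_grad(x, r0=1.5, k=1.0):
    """Analytic gradient of bond_energy with respect to the coordinates."""
    d = x[1:] - x[:-1]
    r = np.linalg.norm(d, axis=1, keepdims=True)
    # Gradient for the second atom of each pair; the first atom gets the negative.
    g = k * (r - r0) * d / np.maximum(r, 1e-8)
    grad = np.zeros_like(x)
    grad[1:] += g
    grad[:-1] -= g
    return grad


def guided_euler_sample(x_start, velocity_fn, n_steps=50, guidance_scale=0.0):
    """Integrate dx/dt = v(x, t) - guidance_scale * grad E(x) from t=0 to t=1.

    velocity_fn is a hypothetical placeholder for a trained generative model.
    The base model is only queried, never modified or retrained.
    """
    x = np.array(x_start, dtype=float)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        v = velocity_fn(x, t) - guidance_scale * bond_energy_grad(x)
        x = x + dt * v
    return x


# Two-atom example: the "model" moves a 1.0-long bond to a strained 3.0-long bond.
x_noise = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
x_model = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]])


def toy_velocity(x, t):
    # Constant straight-line velocity toward the model's preferred output.
    return x_model - x_noise


unguided = guided_euler_sample(x_noise, toy_velocity, guidance_scale=0.0)
guided = guided_euler_sample(x_noise, toy_velocity, guidance_scale=0.5)
print("unguided energy:", bond_energy(unguided))
print("guided energy:  ", bond_energy(guided))
```

In this toy setting the unguided sampler reproduces the strained geometry exactly, while the guided sampler ends at a shorter bond with lower energy. A real force field would add angle, torsion, and non-bonded protein–ligand terms, and the guidance scale would need tuning so that energy feedback does not overwhelm the learned distribution.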
About the journal:
Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling.
Coverage includes, but is not limited to:
chemical information systems, software and databases, and molecular modelling,
chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases,
computer and molecular graphics, computer-aided molecular design, expert systems, QSAR, and data mining techniques.