Improving protein-ligand complex generation with force field guidance

IF 5.7 · Zone 2 (Chemistry) · JCR Q1: CHEMISTRY, MULTIDISCIPLINARY
Journal of Cheminformatics Pub Date : 2026-05-02 Epub Date: 2026-05-03 DOI:10.1186/s13321-026-01198-2
Helen Lai, Tingyu Wang, Hassan Sirelkhatim, Joe Eaton, Howard Huang, Brad Rees, Ola Engkvist, Jon Paul Janet, Xiaoyun Wang, Alessandro Tibo

Abstract

Generative models based on diffusion and flow matching have recently been applied to structure-based drug design, but their outputs often include unrealistic protein–ligand interactions that do not obey the laws of physics. We present an energy guidance framework that incorporates a molecular mechanics force field (MMFF94) directly into the sampling process. The method steers molecular generation toward more physically plausible and energetically stable conformations without retraining the underlying model. We evaluate this approach on the PDBBind dataset using two state-of-the-art architectures: SemlaFlow, a flow matching model, and EDM, a diffusion model. Across both models, energy guidance improves enthalpic interaction energy, reduces strain energy by up to 75%, and generates over 1000 ligands with better docking scores than the native ligands. These results demonstrate that lightweight, physics-based guidance can significantly enhance generative drug design while preserving chemical validity and diversity.

Scientific contribution

We introduce a novel, training-free force field guidance framework that steers ligand generation using empirical molecular mechanics (e.g., MMFF94) during diffusion or flow-based sampling, without modifying or retraining the base generative model (e.g., EDM or SemlaFlow [24]). Our method operates as a plug-in at inference time, using energy feedback to generate poses with lower strain and better predicted interactions with the protein structure.
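The plug-in idea can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a deterministic Euler integrator for a flow-style sampler, a toy harmonic energy standing in for MMFF94, and a hand-written velocity field standing in for the frozen pretrained model. The names `energy`, `energy_grad`, `velocity`, `guided_sample` and `guidance_scale` are all hypothetical, and the abstract does not specify the exact update rule or guidance schedule.

```python
def energy(x):
    """Toy differentiable energy (assumption): harmonic well centred at 1.0.

    Stands in for a force field such as MMFF94, which is not evaluated here.
    """
    return sum(0.5 * (xi - 1.0) ** 2 for xi in x)


def energy_grad(x):
    """Analytic gradient of the toy energy above."""
    return [xi - 1.0 for xi in x]


def velocity(x, t):
    """Hypothetical frozen 'pretrained' velocity field; it is never retrained.

    This toy field simply pulls every coordinate toward the origin.
    """
    return [-xi for xi in x]


def guided_sample(x0, n_steps=50, guidance_scale=0.0):
    """Euler integration of the model velocity plus an energy-gradient correction.

    With guidance_scale=0.0 this reduces to the unguided sampler.
    """
    x = list(x0)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        v = velocity(x, t)
        g = energy_grad(x)
        # Model-driven step, then a step down the energy gradient.
        x = [xi + dt * (vi - guidance_scale * gi) for xi, vi, gi in zip(x, v, g)]
    return x


if __name__ == "__main__":
    start = [3.0, -2.0, 0.5]
    plain = guided_sample(start, guidance_scale=0.0)
    guided = guided_sample(start, guidance_scale=2.0)
    print("unguided energy:", energy(plain))
    print("guided energy:  ", energy(guided))
```

Because the correction only adds a term to the update, the model itself is untouched. Swapping in another differentiable energy would only require replacing `energy` and `energy_grad` in this sketch.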

Our main contributions are as follows:


Energy-based guidance without retraining: Unlike methods that require gradients from a neural affinity predictor (e.g., BADGER [26]), our method injects classical force field feedback (MMFF94) directly into the posterior sampling step.

Improved docking and strain metrics: In benchmarks against unconditional EDM and SemlaFlow, our guided inference consistently yields better AutoDock Vina scores and lower ligand strain energies, even after the final structures are optimized with the same force field.

Compatibility and flexibility: Because the guidance module is external, it can be applied across multiple generative backbones without retraining or architectural modification, and it works with any differentiable potential energy function.

Theoretical guarantee of stability: We show in Appendix B that, under standard smoothness assumptions, the gradient correction step corresponds to a descent step on the energy. Although the full sampling update also includes model-driven (and, in the diffusion case, stochastic) components, this result formalizes how the guidance term locally biases trajectories toward low-energy regions and provides a principled justification for its stabilizing effect.
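Appendix B is not reproduced on this page, so the following is only the textbook descent lemma, given as one way a statement of this kind is usually formalized. The symbols are assumptions introduced for illustration: E is the energy, L is the Lipschitz constant of its gradient (the "standard smoothness assumption"), η is the guidance step size, and x is the current set of coordinates.

```latex
% Assumption: \nabla E is L-Lipschitz, i.e.
%   \|\nabla E(x) - \nabla E(y)\| \le L \|x - y\| for all x, y.
% Gradient correction step with step size \eta > 0:
\[
  x^{+} = x - \eta \, \nabla E(x)
\]
% The descent lemma then gives
\[
  E(x^{+}) \;\le\; E(x) - \eta \left( 1 - \frac{L \eta}{2} \right) \|\nabla E(x)\|^{2},
\]
% so for 0 < \eta < 2/L the energy strictly decreases whenever \nabla E(x) \neq 0,
% and for \eta \le 1/L the bound simplifies to
\[
  E(x^{+}) \;\le\; E(x) - \frac{\eta}{2} \, \|\nabla E(x)\|^{2}.
\]
```

This bound covers the correction step in isolation. It says nothing about the model-driven or stochastic parts of the full update, which matches the caveat stated in the contribution itself.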
Source journal
Journal of Cheminformatics (CHEMISTRY, MULTIDISCIPLINARY; COMPUTER SCIENCE, INFORMATION SYSTEMS)
CiteScore: 14.10
Self-citation rate: 7.00%
Articles published: 82
Review time: 3 months
About the journal: Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling. Coverage includes, but is not limited to: chemical information systems, software and databases, and molecular modelling, chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases, computer and molecular graphics, computer-aided molecular design, expert systems, QSAR, and data mining techniques.