Accurate and Rapid Ranking of Protein–Ligand Binding Affinities Using Density Matrix Fragmentation and Physics-Informed Machine Learning Dispersion Potentials

IF 2.2 · Zone 3 (Chemistry) · JCR Q3 (CHEMISTRY, PHYSICAL)
Ka Un Lao, Danyang Wang
ChemPhysChem, vol. 26, issue 19. Published 2025-08-04. DOI: 10.1002/cphc.202500094
https://chemistry-europe.onlinelibrary.wiley.com/doi/10.1002/cphc.202500094
Citations: 0

Abstract

The generalized many-body expansion for building density matrices (GMBE-DM), truncated at the one-body level and combined with a purification scheme, is applied to rank protein–ligand binding affinities across two cyclin-dependent kinase 2 (CDK2) datasets and one Janus kinase 1 (JAK1) dataset, totaling 28 ligands. This quantum fragmentation-based method achieves strong correlation with experimental binding free energies (R² = 0.84), while requiring less than 5 min per complex without extensive parallelization, making it highly efficient for rapid drug screening and lead prioritization. In addition, our physics-informed, machine learning-corrected dispersion potential (D3-ML) demonstrates even stronger ranking performance (R² = 0.87), effectively capturing binding trends through favorable cancelation of non-dispersion, solvation, and entropic contributions, emphasizing the central role of dispersion interactions in protein–ligand binding. With sub-second runtime per complex, D3-ML offers exceptional speed and accuracy, making it ideally suited for high-throughput virtual screening. By comparison, the deep learning model Sfcnn shows lower transferability across datasets (R² = 0.57), highlighting the limitations of broadly trained neural networks in chemically diverse systems. Together, these results establish GMBE-DM and D3-ML as robust and scalable tools for protein–ligand affinity ranking, with D3-ML emerging as a particularly promising candidate for large-scale applications in drug discovery.
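
The correlation figures quoted in the abstract compare a computed score per ligand with an experimental binding free energy. As a rough illustration of how such a number can be obtained, the sketch below computes the squared Pearson correlation between two lists. It assumes that the reported R² is the squared Pearson coefficient of a linear fit, which the abstract does not state; the function name `r_squared` and the sample values are hypothetical and are not taken from the paper.

```python
def r_squared(predicted, experimental):
    """Squared Pearson correlation between two equal-length sequences.

    Assumption: this is one common definition of R^2 for a simple
    linear fit; the paper may define its R^2 differently.
    """
    n = len(predicted)
    if n != len(experimental) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    # Covariance and variances (unnormalized; the factors of n cancel).
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    if var_p == 0 or var_e == 0:
        raise ValueError("R^2 is undefined when either sequence is constant")
    return cov * cov / (var_p * var_e)


if __name__ == "__main__":
    # Placeholder numbers for illustration only; NOT data from the paper.
    computed_scores = [-1.0, -2.0, -3.0, -4.0]
    experimental_dg = [-7.0, -8.0, -9.5, -10.0]
    print(r_squared(computed_scores, experimental_dg))
```

Because the value is a squared correlation, it is unchanged by any linear rescaling of the computed scores, so it measures how well a method captures the trend across ligands rather than absolute binding free energies.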

Source journal
ChemPhysChem (Chemistry, Physics: Atomic, Molecular and Chemical Physics)
CiteScore: 4.60
Self-citation rate: 3.40%
Articles published: 425
Review time: 1.1 months
Journal description: ChemPhysChem is one of the leading chemistry/physics interdisciplinary journals (ISI Impact Factor 2018: 3.077) for physical chemistry and chemical physics. It is published on behalf of Chemistry Europe, an association of 16 European chemical societies. ChemPhysChem is an international source for important primary and critical secondary information across the whole field of physical chemistry and chemical physics. It covers this wide and flourishing field, including Solid State and Soft-Matter Research, Electro- and Photochemistry, Femtochemistry and Nanotechnology, Complex Systems, Single-Molecule Research, Clusters and Colloids, Catalysis and Surface Science, Biophysics and Physical Biochemistry, and Atmospheric and Environmental Chemistry, among many other topics. ChemPhysChem is peer-reviewed.