Accurate and Rapid Ranking of Protein–Ligand Binding Affinities Using Density Matrix Fragmentation and Physics-Informed Machine Learning Dispersion Potentials
Authors: Ka Un Lao, Danyang Wang
ChemPhysChem, vol. 26, no. 19, published 2025-08-04. DOI: 10.1002/cphc.202500094
The generalized many-body expansion for building density matrices (GMBE-DM), truncated at the one-body level and combined with a purification scheme, is applied to rank protein–ligand binding affinities across two cyclin-dependent kinase 2 (CDK2) datasets and one Janus kinase 1 (JAK1) dataset, totaling 28 ligands. This quantum fragmentation-based method achieves strong correlation with experimental binding free energies (R² = 0.84), while requiring less than 5 min per complex without extensive parallelization, making it highly efficient for rapid drug screening and lead prioritization. In addition, our physics-informed, machine learning-corrected dispersion potential (D3-ML) demonstrates even stronger ranking performance (R² = 0.87), effectively capturing binding trends through favorable cancelation of non-dispersion, solvation, and entropic contributions, emphasizing the central role of dispersion interactions in protein–ligand binding. With sub-second runtime per complex, D3-ML offers exceptional speed and accuracy, making it ideally suited for high-throughput virtual screening. By comparison, the deep learning model Sfcnn shows lower transferability across datasets (R² = 0.57), highlighting the limitations of broadly trained neural networks in chemically diverse systems. Together, these results establish GMBE-DM and D3-ML as robust and scalable tools for protein–ligand affinity ranking, with D3-ML emerging as a particularly promising candidate for large-scale applications in drug discovery.
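As a rough illustration of how a ranking metric such as the R² values quoted above can be evaluated, the sketch below computes the squared Pearson correlation between predicted scores and experimental binding free energies, and sorts ligands by predicted score. It assumes R² means the squared Pearson coefficient of a simple linear fit; the abstract does not say how it was computed. The function names, ligand labels, and all numbers are hypothetical placeholders, not data or code from the paper.

```python
def pearson_r2(predicted, experimental):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(predicted)
    if n != len(experimental) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    # Covariance and variances (unnormalized; the 1/n factors cancel in r^2).
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    if var_p == 0 or var_e == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov * cov / (var_p * var_e)


def rank_ligands(names, scores):
    """Return ligand names ordered from most to least negative score."""
    return [name for _, name in sorted(zip(scores, names))]


if __name__ == "__main__":
    # Hypothetical placeholder values for illustration only (not from the paper).
    ligands = ["lig_a", "lig_b", "lig_c", "lig_d"]
    predicted_scores = [-42.0, -35.5, -50.1, -38.2]   # arbitrary model scores
    experimental_dg = [-9.1, -8.0, -10.2, -8.7]       # arbitrary free energies
    print("R^2 =", round(pearson_r2(predicted_scores, experimental_dg), 3))
    print("Predicted ranking:", rank_ligands(ligands, predicted_scores))
```

Because only the correlation with experiment matters for ranking, the predicted scores do not need to be on the same scale as the experimental free energies; a method can rank well even when its absolute values are offset or rescaled.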
About the journal:
ChemPhysChem is one of the leading chemistry/physics interdisciplinary journals (ISI Impact Factor 2018: 3.077) for physical chemistry and chemical physics. It is published on behalf of Chemistry Europe, an association of 16 European chemical societies.
ChemPhysChem is an international source for important primary and critical secondary information across the whole field of physical chemistry and chemical physics. It integrates this wide and flourishing field, which spans Solid State and Soft-Matter Research, Electro- and Photochemistry, Femtochemistry and Nanotechnology, Complex Systems, Single-Molecule Research, Clusters and Colloids, Catalysis and Surface Science, Biophysics and Physical Biochemistry, Atmospheric and Environmental Chemistry, and many more topics. ChemPhysChem is peer-reviewed.