Join Persistent Homology (JPH)-Based Machine Learning for Metalloprotein–Ligand Binding Affinity Prediction
Yaxing Wang, Xiang Liu, Yipeng Zhang, Xiangjun Wang and Kelin Xia*
Journal of Chemical Information and Modeling, vol. 65, no. 6, pp. 2785–2793. Published 2025-03-13.
DOI: 10.1021/acs.jcim.4c02309
URL: https://pubs.acs.org/doi/10.1021/acs.jcim.4c02309
Impact factor: 5.6. JCR: Q1 (Chemistry, Medicinal).
Citations: 0
Abstract
With the crucial role of metalloproteins in respiration, oxidative stress protection, photosynthesis, and drug metabolism, the design and discovery of drugs that can target metalloproteins are extremely important. Recently, enormous potential has been shown by topological data analysis (TDA) and TDA-based machine learning models in various steps of drug design and discovery. Here, we propose, for the first time, join persistent homology (JPH) and JPH-based machine learning models for metalloprotein–ligand binding affinity prediction. Mathematically, dramatically different from persistent homology and extended persistent homology, our JPH employs a set of filtration functions to generate a multistage filtration for the join of the original simplicial complex and a specially designed test simplicial complex. From the featurization perspective, our JPH-based molecular descriptors can provide a more comprehensive characterization of the intrinsic topological information of the data. Our JPH descriptors are combined with the gradient boosting tree (GBT) model for metalloprotein–ligand binding affinity prediction. The benchmark dataset for metalloprotein–ligand complexes from PDBbind-v2020 is employed for the validation and comparison of our model. It has been found that our JPH-GBT model can outperform all of the existing models, as far as we know. This demonstrates the great potential of our join persistent homology in the characterization of molecular structures and functions.
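As a rough illustration of the "join" that the abstract refers to, the sketch below builds the standard simplicial join of two small complexes: every simplex of each complex, plus the union of every pair of simplices taken one from each. It is not the authors' method. The multistage filtration, the filtration functions, and the design of the test complex are not described in the abstract and are not modeled here. The helper names `closure` and `join` and the toy complexes are assumptions made for this example.

```python
from itertools import combinations


def closure(maximal_simplices):
    """Return all non-empty faces of the given maximal simplices as frozensets."""
    faces = set()
    for simplex in maximal_simplices:
        vertices = tuple(simplex)
        for k in range(1, len(vertices) + 1):
            faces.update(frozenset(c) for c in combinations(vertices, k))
    return faces


def join(K, L):
    """Standard simplicial join of K and L.

    Contains every simplex of K, every simplex of L, and the union of
    each pair (sigma, tau) with sigma in K and tau in L.
    Assumes K and L use disjoint vertex labels.
    """
    if any(s & t for s in K for t in L):
        raise ValueError("vertex sets of K and L must be disjoint")
    return K | L | {s | t for s in K for t in L}


# Toy input: K is a single edge, standing in for the "original" complex.
K = closure([("a", "b")])
# Hypothetical "test" complex: two isolated points (chosen only for illustration).
L = closure([("x",), ("y",)])

J = join(K, L)
print(len(J))  # 3 + 2 + 3 * 2 = 11 simplices
```

With disjoint vertex sets, a join of complexes with n and m non-empty simplices has n + m + nm simplices, which gives a quick sanity check on the output. Here the result is two triangles, {a, b, x} and {a, b, y}, glued along the edge {a, b}.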
About the journal:
The Journal of Chemical Information and Modeling publishes papers reporting new methodology and/or important applications in the fields of chemical informatics and molecular modeling. Specific topics include the representation and computer-based searching of chemical databases, molecular modeling, computer-aided molecular design of new materials, catalysts, or ligands, development of new computational methods or efficient algorithms for chemical software, and biopharmaceutical chemistry including analyses of biological activity and other issues related to drug discovery.
Astute chemists, computer scientists, and information specialists look to this monthly’s insightful research studies, programming innovations, and software reviews to keep current with advances in this integral, multidisciplinary field.
As a subscriber you’ll stay abreast of database search systems, use of graph theory in chemical problems, substructure search systems, pattern recognition and clustering, analysis of chemical and physical data, molecular modeling, graphics and natural language interfaces, bibliometric and citation analysis, and synthesis design and reactions databases.