Tianjun Xie, Gerhard R Wittreich, Matthew T Curnan, Geun Ho Gu, Kayla N Seals, Justin S Tolbert
{"title":"Machine-Learning-Enabled Thermochemistry Estimator.","authors":"Tianjun Xie, Gerhard R Wittreich, Matthew T Curnan, Geun Ho Gu, Kayla N Seals, Justin S Tolbert","doi":"10.1021/acs.jcim.4c00989","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c00989","url":null,"abstract":"<p>Modeling adsorbates on single-crystal metals is critical in rational catalyst design and other research that requires detailed thermochemistry. First-principles simulations via density functional theory (DFT) are among the prevalent tools to acquire such information about surface species. While they are highly dependable, DFT calculations often require intensive computational resources and runtime. These limiting factors become particularly pronounced when investigating large sets of complex molecules on heavy noble metals. Consequently, our ability to explore these species and their corresponding energetics is limited. In this work, we establish a novel framework that utilizes techniques including molecular encoding, descriptor synthesis, and machine learning to overcome the limitation of costly DFT simulations. Simultaneously, we estimate thermochemical information efficiently at the DFT accuracy level. More specifically, we translated our training molecules into text-based identifiers through a simplified molecular-input line-entry system. Following that, we parametrize our training matrices with sets of short-range descriptors based on group methods, applying first the nearest neighbors to account for linear contributions. This is coupled with the long-range descriptors characterizing second nearest neighbors to account for nonlinear corrections. Finally, we use linear regression and machine learning techniques, such as Gaussian process regressions, to regress over the linear and nonlinear matrix systems, respectively. 
This is the first work to our knowledge that encompasses both the first and second nearest neighbors based on the group theory throughout the featurization, training, and deployment stages. We trained and validated our models with 459 surface species on Pt(111), Ru(0001), and Ir(111) surfaces. Results exhibit robust performance to reproduce the energetics of interest, such as enthalpies, entropies, and heat capacities, at various temperatures. Notably, the mean absolute errors can be reduced by 48% during training and 19% during prediction at a minimum, when compared to the classical group method. Leveraging the novel framework, our machine-learning-enabled thermochemistry estimator significantly empowers us to research the thermochemistry of complex species on metal catalysts.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142833085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aravind Selvaram Thirunavukarasu, Katarzyna Szleper, Gamze Tanriver, Igor Marchlewski, Karolina Mitusinska, Artur Gora, Jan Brezovsky
{"title":"Water Migration through Enzyme Tunnels Is Sensitive to the Choice of Explicit Water Model.","authors":"Aravind Selvaram Thirunavukarasu, Katarzyna Szleper, Gamze Tanriver, Igor Marchlewski, Karolina Mitusinska, Artur Gora, Jan Brezovsky","doi":"10.1021/acs.jcim.4c01177","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c01177","url":null,"abstract":"<p>The utilization of tunnels and water transport within enzymes is crucial for their catalytic function as water molecules can stabilize bound substrates and help with unbinding processes of products and inhibitors. Since the choice of water models for molecular dynamics simulations was shown to determine the accuracy of various calculated properties of the bulk solvent and solvated proteins, we have investigated if and to what extent water transport through the enzyme tunnels depends on the selection of the water model. Here, we focused on simulating enzymes with various well-defined tunnel geometries. In a systematic investigation using haloalkane dehalogenase as a model system, we focused on the well-established TIP3P, OPC, and TIP4P-Ew water models to explore their impact on the use of tunnels for water molecule transport. The TIP3P water model showed significantly faster migration, resulting in the transport of approximately 2.5 times more water molecules compared to that of the OPC and 1.7 times greater than that of the TIP4P-Ew. Finally, the transport was 1.4-fold more pronounced in TIP4P-Ew than in OPC. The increase in migration of TIP3P water molecules was mainly due to faster transit times through dehalogenase tunnels. We observed similar behavior in two different enzymes with buried active sites and different tunnel network topologies, i.e., alditol oxidase and cytochrome P450, indicating that our findings are likely not restricted to a particular enzyme family. 
Overall, this study showcases the critical importance of water models in comprehending the use of enzyme tunnels for small molecule transport. Given the significant role of water availability in various stages of the catalytic cycle and the solvation of substrates, products, and drugs, choosing an appropriate water model may be crucial for accurate simulations of complex enzymatic reactions, rational enzyme design, and predicting drug residence times.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142826695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Property Prediction for Complex Compounds Using Structure-Free Mendeleev Encoding and Machine Learning","authors":"Zixin Zhuang and Amanda S. Barnard*","doi":"10.1021/acs.jcim.4c01343","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c01343","url":null,"abstract":"<p >Predicting the properties for unseen materials exclusively on the basis of the chemical formula before synthesis and characterization has advantages for research and resource planning. This can be achieved using suitable structure-free encoding and machine learning methods, but additional processing decisions are required. In this study, we compare a variety of structure-free materials encodings and machine learning algorithms to predict the structure/property relationships of battery materials. It was found that the physical units used to measure the property labels have an important impact on the predictive ability of the models, regardless of the computational approach. Property labels with respect to weight give excellent performance, but property labels with respect to volume cannot be predicted with confidence using only chemical information, even when the underlying physical characteristics are the same. 
These results contrast with previous studies of unsupervised learning and classification, where structure-free encoding excelled, and highlight how the structural features or property labels of materials are represented plays an important role in the predictive ability of machine learning models.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":"64 24","pages":"9205–9214"},"PeriodicalIF":5.6,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142870168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MacGen: A Web Server for Structure-Based Macrocycle Design","authors":"Zhihan Zhang, Dongliang Ke, Chengshan Jin, Weiyu Zhou, Xiaolin Pan, Yueqing Zhang, Xingyu Wang, Xudong Xiao and Changge Ji*","doi":"10.1021/acs.jcim.4c01576","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c01576","url":null,"abstract":"<p >Macrocyclization is a critical strategy in rational drug design that can offer several advantages, such as enhancing binding affinity, increasing selectivity, and improving cellular permeability. Herein, we introduce MacGen, a web tool devised for structure-based macrocycle design. MacGen identifies exit vector pairs within a ligand that are suitable for cyclization and finds 3D linkers that can align with the geometric arrangement of these pairs to form macrocycles. To aid in the fast acquisition of appropriate linkers, we have built an indexed 3D linker database that includes linkers of various lengths and categories. MacGen provides comprehensive configurable parameters that enable users to obtain preferred linkers, meeting unique requirements in practical ligand design scenarios. We hope MacGen will serve as a handy tool that can rapidly explore potential macrocycle space. The MacGen server is freely accessible at https://macgen.xundrug.cn.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":"64 24","pages":"9048–9055"},"PeriodicalIF":5.6,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142870161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alexander Kravberg, Didier Devaurs*, Anastasiia Varava, Lydia E. Kavraki and Danica Kragic*
{"title":"MoleQCage: Geometric High-Throughput Screening for Molecular Caging Prediction","authors":"Alexander Kravberg, Didier Devaurs*, Anastasiia Varava, Lydia E. Kavraki and Danica Kragic*","doi":"10.1021/acs.jcim.4c01419","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c01419","url":null,"abstract":"<p >Although being able to determine whether a host molecule can enclose a guest molecule and form a caging complex could benefit numerous chemical and medical applications, the experimental discovery of molecular caging complexes has not yet been achieved at scale. Here, we propose MoleQCage, a simple tool for the high-throughput screening of host and guest candidates based on an efficient robotics-inspired geometric algorithm for molecular caging prediction, providing theoretical guarantees and robustness assessment. MoleQCage is distributed as Linux-based software with a graphical user interface and is available online at https://hub.docker.com/r/dantrigne/moleqcage in the form of a Docker container. Documentation and examples are available as Supporting Information and online at https://hub.docker.com/r/dantrigne/moleqcage.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":"64 24","pages":"9034–9039"},"PeriodicalIF":5.6,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.acs.org/doi/epdf/10.1021/acs.jcim.4c01419","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142870166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Apoorva Mathur, Rikhia Ghosh* and Ariane Nunes-Alves*
{"title":"Recent Progress in Modeling and Simulation of Biomolecular Crowding and Condensation Inside Cells","authors":"Apoorva Mathur, Rikhia Ghosh* and Ariane Nunes-Alves*","doi":"10.1021/acs.jcim.4c01520","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c01520","url":null,"abstract":"<p >Macromolecular crowding in the cellular cytoplasm can potentially impact diffusion rates of proteins, their intrinsic structural stability, binding of proteins to their corresponding partners as well as biomolecular organization and phase separation. While such intracellular crowding can have a large impact on biomolecular structure and function, the molecular mechanisms and driving forces that determine the effect of crowding on dynamics and conformations of macromolecules are so far not well understood. At a molecular level, computational methods can provide a unique lens to investigate the effect of macromolecular crowding on biomolecular behavior, providing us with a resolution that is challenging to reach with experimental techniques alone. In this review, we focus on the various physics-based and data-driven computational methods developed in the past few years to investigate macromolecular crowding and intracellular protein condensation. We review recent progress in modeling and simulation of biomolecular systems of varying sizes, ranging from single protein molecules to the entire cellular cytoplasm. We further discuss the effects of macromolecular crowding on different phenomena, such as diffusion, protein–ligand binding, and mechanical and viscoelastic properties, such as surface tension of condensates. 
Finally, we discuss some of the outstanding challenges that we anticipate the community addressing in the next few years in order to investigate biological phenomena in model cellular environments by reproducing <i>in vivo</i> conditions as accurately as possible.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":"64 24","pages":"9063–9081"},"PeriodicalIF":5.6,"publicationDate":"2024-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.acs.org/doi/epdf/10.1021/acs.jcim.4c01520","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142870156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Francisco L. Feitosa, Victoria F. Cabral, Igor H. Sanches, Sabrina Silva-Mendonca, Joyce V. V. B. Borba, Rodolpho C. Braga and Carolina Horta Andrade*
{"title":"Cyto-Safe: A Machine Learning Tool for Early Identification of Cytotoxic Compounds in Drug Discovery","authors":"Francisco L. Feitosa, Victoria F. Cabral, Igor H. Sanches, Sabrina Silva-Mendonca, Joyce V. V. B. Borba, Rodolpho C. Braga and Carolina Horta Andrade*","doi":"10.1021/acs.jcim.4c01811","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c01811","url":null,"abstract":"<p >Cytotoxicity is essential in drug discovery, enabling early evaluation of toxic compounds during screenings to minimize toxicological risks. <i>In vitro</i> assays support high-throughput screening, allowing for efficient detection of toxic substances while considerably reducing the need for animal testing. Additionally, AI-based Quantitative Structure–Activity Relationship (AI-QSAR) models enhance early stage predictions by assessing the cytotoxic potential of molecular structures, which helps prioritize low-risk compounds for further validation. We present a freely accessible web application designed for identifying potential cytotoxic compounds utilizing QSAR models. This application utilizes machine learning techniques and is built on a data set of approximately 90,000 compounds, evaluated against two cell lines, 3T3 and HEK 293. Users can interact with the app by inputting a SMILES representation, uploading CSV or SDF files, or sketching molecules. The output includes a binary prediction for each cell line, a confidence percentage, and an explainable AI (XAI) analysis. 
Cyto-Safe web-app version 1.0 is available at http://insightai.labmol.com.br/.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":"64 24","pages":"9056–9062"},"PeriodicalIF":5.6,"publicationDate":"2024-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.acs.org/doi/epdf/10.1021/acs.jcim.4c01811","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142874951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Negin Forouzesh, Fatemeh Ghafouri, Igor S. Tolokh and Alexey V. Onufriev*
{"title":"Optimal Dielectric Boundary for Binding Free Energy Estimates in the Implicit Solvent","authors":"Negin Forouzesh, Fatemeh Ghafouri, Igor S. Tolokh and Alexey V. Onufriev*","doi":"10.1021/acs.jcim.4c01190","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c01190","url":null,"abstract":"<p >Accuracy of binding free energy calculations utilizing implicit solvent models is critically affected by parameters of the underlying dielectric boundary, specifically, the atomic and water probe radii. Here, a multidimensional optimization pipeline is used to find optimal atomic radii, specifically for binding calculations in the implicit solvent. To reduce overfitting, the optimization target includes separate, weighted contributions from both binding and hydration free energies. The resulting five-parameter radii set, OPT_BIND5D, is evaluated against experiment for binding free energies of 20 host–guest (H–G) systems, unrelated to the types of structures used in the training. The resulting accuracy for this H–G test set (root mean square error of 2.03 kcal/mol, mean signed error of −0.13 kcal/mol, mean absolute error of 1.68 kcal/mol, and Pearson’s correlation of <i>r</i> = 0.79 with the experimental values) is on par with what can be expected from the fixed charge explicit solvent models. 
Best agreement with the experiment is achieved when the implicit salt concentration is set equal or close to the experimental conditions.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":"64 24","pages":"9433–9448"},"PeriodicalIF":5.6,"publicationDate":"2024-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.acs.org/doi/epdf/10.1021/acs.jcim.4c01190","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142870151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Extended Warheads toward Developing Cysteine-Targeted Covalent Kinase Inhibitors","authors":"Zheng Zhao* and Philip E. Bourne*","doi":"10.1021/acs.jcim.4c00890","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c00890","url":null,"abstract":"<p >In designing covalent kinase inhibitors (CKIs), the inclusion of electrophiles as attacking warheads demands careful choreography, ensuring not only their presence on the scaffold moiety but also their precise interaction with nucleophiles in the binding sites. Given the limited number of known electrophiles, exploring adjacent chemical space to broaden the palette of available electrophiles capable of covalent inhibition is desirable. Here, we systematically analyze the characteristics of warheads and the corresponding adjacent fragments for use in CKI design. We first collect all the released cysteine-targeted CKIs from multiple databases and create one CKI data set containing 16,961 kinase-inhibitor data points from 12,381 unique CKIs covering 146 kinases with accessible cysteines in their binding pockets. Then, we analyze this data set, focusing on the extended warheads (i.e., warheads + adjacent fragments)─including 30 common warheads and 1344 unique adjacent fragments. In so doing, we provide structural insights and delineate chemical properties and patterns in these extended warheads. Notably, we highlight the popular patterns observed within reversible CKIs for the popular warheads cyanoacrylamide and aldehyde. 
This study provides medicinal chemists with novel insights into extended warheads and a comprehensive source of adjacent fragments, thus guiding the design, synthesis, and optimization of CKIs.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":"64 24","pages":"9517–9527"},"PeriodicalIF":5.6,"publicationDate":"2024-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.acs.org/doi/epdf/10.1021/acs.jcim.4c00890","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142874981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Martina Piga, Zoltan Varga, Adam Feher, Ferenc Papp, Eva Korpos, Kavya C. Bangera, Rok Frlan, Janez Ilaš, Jaka Dernovšek, Tihomir Tomašič and Nace Zidar*
{"title":"Correction to “Identification of a Novel Structural Class of HV1 Inhibitors by Structure-Based Virtual Screening”","authors":"Martina Piga, Zoltan Varga, Adam Feher, Ferenc Papp, Eva Korpos, Kavya C. Bangera, Rok Frlan, Janez Ilaš, Jaka Dernovšek, Tihomir Tomašič and Nace Zidar*","doi":"10.1021/acs.jcim.4c02211","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02211","url":null,"abstract":"","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":"64 24","pages":"9651"},"PeriodicalIF":5.6,"publicationDate":"2024-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.acs.org/doi/epdf/10.1021/acs.jcim.4c02211","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142874987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}