{"title":"In silico design of dehydrophenylalanine containing peptide activators of glucokinase using pharmacophore modelling, molecular dynamics and machine learning: implications in type 2 diabetes","authors":"Siddharth Yadav, Swati Rana, Manish Manish, Sohini Singh, Andrew Lynn, Puniti Mathur","doi":"10.1007/s10822-024-00583-z","DOIUrl":"10.1007/s10822-024-00583-z","url":null,"abstract":"<div><p>Diabetes represents a significant global health challenge associated with substantial healthcare costs and therapeutic complexities. Current diabetes therapies often entail adverse effects, necessitating the exploration of novel agents. Glucokinase (GK), a key enzyme in glucose homeostasis, primarily regulates blood glucose levels in hepatocytes and pancreatic cells. Unlike other hexokinases, GK exhibits unique kinetic properties, such as a high Km and lack of feedback inhibition, allowing it to function as a glucose sensor. Glucokinase activators (GKAs) have emerged as promising candidates for managing type-2 diabetes by allosterically enhancing GK activity. Despite initial promise, existing GKAs face significant safety concerns, driving the need for compounds with improved safety profiles. This study introduces a novel chemical scaffold within the GKA landscape: peptide-based GKAs incorporating non-standard amino acid residues such as α,β-dehydrophenylalanine (ΔPhe/ΔF). A virtual library containing 3,368,000 peptides was constructed and screened using a hybrid pharmacophore, namely DHRR (D: donor; H: hydrogen; R: aromatic ring). Molecular docking and molecular dynamics simulations assisted in identifying three peptides, Pep-11, Pep-15, and Pep-16, which exhibited stable binding at the allosteric site of glucokinase. These peptides were synthesized using a combination of solid and solution phase synthesis methods. In vitro enzymatic activity of glucokinase was increased by at least 1.5 times in the presence of these peptides. 
Several machine learning algorithms were explored as alternatives to conventional in-silico methods for predicting GK activity. Regression and tree-based algorithms outperformed other methods, with the logistic regression and random forest classifiers both achieving an ROC-AUC of 0.98.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":"39 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142906101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ConoDL: a deep learning framework for rapid generation and prediction of conotoxins","authors":"Menghan Guo, Zengpeng Li, Xuejin Deng, Ding Luo, Jingyi Yang, Yingjun Chen, Weiwei Xue","doi":"10.1007/s10822-024-00582-0","DOIUrl":"10.1007/s10822-024-00582-0","url":null,"abstract":"<div><p>Conotoxins, being small disulfide-rich and bioactive peptides, manifest notable pharmacological potential and find extensive applications. However, the exploration of conotoxins’ vast molecular space using traditional methods is severely limited, creating an urgent need for novel approaches. Recently, deep learning (DL)-based methods have advanced to the molecular generation of proteins and peptides. Nevertheless, the limited data and the intricate structure of conotoxins constrain the application of deep learning models in the generation of conotoxins. We propose ConoDL, a framework for the generation and prediction of conotoxins, comprising the end-to-end conotoxin generation model (ConoGen) and the conotoxin prediction model (ConoPred). ConoGen employs transfer learning and a large language model (LLM) to tackle the challenges in conotoxin generation. Meanwhile, ConoPred filters artificial conotoxins generated by ConoGen, narrowing down the scope for subsequent research. A comprehensive evaluation of the peptide properties at both sequence and structure levels indicates that the artificial conotoxins generated by ConoDL exhibit a certain degree of similarity to natural conotoxins. Furthermore, ConoDL has generated artificial conotoxins with novel cysteine scaffolds. 
Therefore, ConoDL may uncover new cysteine scaffolds and conotoxin molecules, facilitating further exploration of the molecular space of conotoxins and the discovery of pharmacologically active variants.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":"39 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142889745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MolGraph: a Python package for the implementation of molecular graphs and graph neural networks with TensorFlow and Keras","authors":"Alexander Kensert, Gert Desmet, Deirdre Cabooter","doi":"10.1007/s10822-024-00578-w","DOIUrl":"10.1007/s10822-024-00578-w","url":null,"abstract":"<div><p>Molecular machine learning (ML) has proven important for tackling various molecular problems, such as predicting molecular properties based on molecular descriptors or fingerprints. Since relatively recently, graph neural network (GNN) algorithms have been implemented for molecular ML, showing comparable or superior performance to descriptor or fingerprint-based approaches. Although various tools and packages exist to apply GNNs in molecular ML, a new GNN package, named MolGraph, was developed in this work with the motivation to create GNN model pipelines highly compatible with the TensorFlow and Keras application programming interface (API). MolGraph also implements a module to accommodate the generation of small molecular graphs, which can be passed to a GNN algorithm to solve a molecular ML problem. To validate the GNNs, benchmarking was conducted using the datasets from MoleculeNet, as well as three chromatographic retention time datasets. The benchmarking results demonstrate that the GNNs performed in line with expectations. Additionally, the GNNs proved useful for molecular identification and improved interpretability of chromatographic retention time data. MolGraph is available at https://github.com/akensert/molgraph. 
Installation, tutorials and implementation details can be found at https://molgraph.readthedocs.io/en/latest/.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":"39 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142778406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combining crystallographic and binding affinity data towards a novel dataset of small molecule overlays","authors":"Sophia M. N. Hönig, Torben Gutermuth, Christiane Ehrt, Christian Lemmen, Matthias Rarey","doi":"10.1007/s10822-024-00581-1","DOIUrl":"10.1007/s10822-024-00581-1","url":null,"abstract":"<p>Although small molecule superposition is a standard technique in drug discovery, a rigorous performance assessment of the corresponding methods is currently challenging. Datasets in this field are sparse, small, tailored to specific applications, unavailable, or outdated. The newly developed LOBSTER set described herein offers a publicly available and method-independent dataset for benchmarking and method optimization. LOBSTER stands for “Ligand Overlays from Binding SiTe Ensemble Representatives”. All ligands were derived from the PDB in a fully automated workflow, including a ligand efficiency filter. So-called ligand ensembles were assembled by aligning identical binding sites. Thus, the ligands within the ensembles are superimposed according to their experimentally determined binding orientation and conformation. Overall, 671 representative ligand ensembles comprise 3583 ligands from 3521 proteins. Altogether, 72,734 ligand pairs based on the ensembles were grouped into ten distinct subsets based on their volume overlap, for the benefit of introducing different degrees of difficulty for evaluating superposition methods. Statistics on the physicochemical properties of the compounds indicate that the dataset represents drug-like compounds. Consensus Diversity Plots show predominantly high Bemis–Murcko scaffold diversity and low median MACCS fingerprint similarity for each ensemble. An analysis of the underlying protein classes further demonstrates the heterogeneity within our dataset. 
The LOBSTER set offers a variety of applications like benchmarking multiple as well as pairwise alignments, generating training and test sets, for example based on time splits, or empirical software performance evaluation studies. The LOBSTER set is publicly available at https://doi.org/10.5281/zenodo.12658320, representing a stable and versioned data resource. The Python scripts are available at https://github.com/rareylab/LOBSTER, open-source, and allow for updating or recreating superposition sets with different data sources. </p><p>Simplified illustration of the LOBSTER dataset generation.</p>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":"39 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10822-024-00581-1.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142764979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Promoter recognition specificity of Corynebacterium glutamicum stress response sigma factors σD and σH deciphered using computer modeling and point mutagenesis","authors":"J. Blumenstein, H. Dostálová, L. Rucká, V. Štěpánek, T. Busche, J. Kalinowski, M. Pátek, I. Barvík","doi":"10.1007/s10822-024-00577-x","DOIUrl":"10.1007/s10822-024-00577-x","url":null,"abstract":"<div><p>This study aimed to reveal interactions of the stress response sigma subunits (factors) σ<sup>D</sup> and σ<sup>H</sup> of RNA polymerase and promoters in Gram-positive bacterium <i>Corynebacterium glutamicum</i> by combining wet-lab obtained data and in silico modeling. Computer modeling-guided point mutagenesis of <i>C. glutamicum</i> σ<sup>H</sup> subunit led to the creation of a panel of σ<sup>H</sup> variants. Their ability to initiate transcription from naturally occurring hybrid σ<sup>D</sup>/σ<sup>H</sup>-dependent promoter P<i>cg0441</i> and two control canonical promoters (σ<sup>D</sup>-dependent P<i>rsdA</i> and σ<sup>H</sup>-dependent P<i>uvrD3</i>) was measured and interpreted using molecular dynamics simulations of homology models of all complexes. The results led us to design the artificial hybrid promoter P<i>D</i><sub><i>35</i></sub><i>H</i><sub><i>10</i></sub> combining the −10 element of the P<i>uvrD3</i> promoter and the −35 element of the P<i>rsdA</i> promoter. 
This artificial hybrid promoter P<i>D</i><sub><i>35-rsdA</i></sub><i>H</i><sub><i>10-uvrD3</i></sub> showed almost optimal properties needed for the bio-orthogonal transcription (not interfering with the native biological processes).</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":"39 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10822-024-00577-x.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142708720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding the relationship between preferential interactions of peptides in water-acetonitrile mixtures with protein-solvent contact surface area","authors":"Monika Phougat, Narinder Singh Sahni, Devapriya Choudhury","doi":"10.1007/s10822-024-00579-9","DOIUrl":"10.1007/s10822-024-00579-9","url":null,"abstract":"<div><p>The influence of polar, water-miscible organic solvents (POS) on protein structure, stability, and functional activity is a subject of significant interest and complexity. This study examines the effects of acetonitrile (ACN), a semipolar, aprotic solvent, on the solvation properties of blocked Ace-Gly-X-Gly-Nme tripeptides (where Ace and Nme stand for acetyl and N-methyl amide groups, respectively, and X is any amino acid) through extensive molecular dynamics simulations. Individual simulations were conducted for each peptide, encompassing five different ACN concentrations within the range of <i>χ</i><sub>ACN</sub> = 0.1–0.9. The preferential solvation parameter (Γ) calculated using the Kirkwood-Buff integral method was used for the assessment of peptide interactions with water/ACN. Additionally, weighted Voronoi tessellation was applied to obtain a three-way data set containing four time-averaged contact surface area types between peptide atoms and water/ACN atoms. A mathematical technique known as <i>N</i>-way Partial Least Squares (NPLS) was utilized to predict the preferential interactions between peptides and water/ACN from the contact surface areas. Furthermore, the temperature dependency of peptide-solvent interactions was investigated using a subset of 10 amino acids representing a range of hydrophobicities. MD simulations were conducted at five temperatures, spanning from 283 to 343 K, with subsequent analysis of data focusing on both preferential solvation and peptide-solvent contact surface areas. 
The results demonstrate the efficacy of utilizing contact surface areas between the peptide and solvent constituents for successfully predicting preferential interactions in water/ACN mixtures across various ACN concentrations and temperatures.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":"38 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142611743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification of novel inhibitors targeting PI3Kα via ensemble-based virtual screening method, biological evaluation and molecular dynamics simulation","authors":"Hui Zhang, Hua-Zhao Qi, Ya-Juan Li, Xiu-Yun Shi, Mei-Ling Hu, Xiang-Long Chen, Yuan Li","doi":"10.1007/s10822-024-00580-2","DOIUrl":"10.1007/s10822-024-00580-2","url":null,"abstract":"<div><p>The PIK3CA gene, which encodes PI3K p110α, is one of the most frequently mutated and overexpressed genes in the majority of human cancers. Developing potent and selective novel inhibitors targeting PI3Kα is considered one of the most promising approaches for cancer treatment. In this investigation, a virtual screening platform for PI3Kα inhibitors was established by employing machine learning methods, pharmacophore modeling, and molecular docking approaches. Twenty-eight potential PI3Kα inhibitors with different scaffolds were selected from databases containing 295,024 compounds. Among the 28 hits, hit15 exhibited the best inhibitory effect against PI3Kα, with an IC<sub>50</sub> value of less than 1.0 µM. The molecular dynamics simulation indicated that hit15 could stably bind to the active site of PI3Kα, interact with some residues through hydrophobic, electrostatic and hydrogen bonding interactions, and ultimately induce substantial conformational changes in the PI3Kα active pocket. Stable H-bond interactions were formed between hit15 and residues Lys776, Asp810 and Asp933. The binding free energy of PI3Kα-hit15 was −65.3 kJ/mol. The free energy decomposition indicated that the key residues Asp805, Ile848 and Ile932 contributed strongly to the binding free energy. 
The above results indicated that hit15, with its novel scaffold, is a potent PI3Kα inhibitor and a promising candidate for further drug development to treat various cancers in which PI3Kα is overactivated.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":"38 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparative assessment of physics-based in silico methods to calculate relative solubilities","authors":"Adiran Garaizar Suarez, Andreas H. Göller, Michael E. Beck, Sadra Kashef Ol Gheta, Katharina Meier","doi":"10.1007/s10822-024-00576-y","DOIUrl":"10.1007/s10822-024-00576-y","url":null,"abstract":"<div><p>Relative solubility, i.e. whether a given molecule is more soluble in one solvent than in others, is a critical parameter for pharmaceutical and agricultural formulation development, chemical synthesis, material science, and environmental chemistry. In silico predictions of this crucial variable can help reduce the number of experiments and solvent waste, and aid synthesis optimization. In this study, we evaluate the performance of different physics-based methods for predicting relative solubilities. Our assessment involves quantum mechanics-based COSMO-RS and molecular dynamics-based free energy methods using OPLS4, the open-source OpenFF Sage, and GAFF force fields, spanning over 200 solvent–solute combinations. Our investigation highlights the important role of compound multimerization, an effect which must be accounted for to obtain accurate relative solubility predictions. 
The performance landscape of these methods is varied, with significant differences in precision depending on both the method used and the solute considered, thereby offering an improved understanding of the predictive power of physics-based methods in chemical research.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":"38 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142524487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computational Identification and Illustrative Standard for Representation of Unimolecular G-Quadruplex Secondary Structures (CIIS-GQ)","authors":"Tugay Direk, Osman Doluca","doi":"10.1007/s10822-024-00573-1","DOIUrl":"10.1007/s10822-024-00573-1","url":null,"abstract":"<div><p>G-quadruplexes refer to a large group of nucleic acid–based structures. In recent years, they have been attracting attention due to their biological roles in the telomeres and promoter regions. These structures show wide diversity in topology; however, the development of methods for structural classification of G-quadruplexes has long been neglected. There has been a limited number of studies aiming to bring forth a secondary structure classification method. The situation has proven even more complex than imagined since the discovery of bulged and mismatched G-quadruplexes, and most of the available tools fail to distinguish these non-canonical G-quadruplex motifs. Moreover, the interpretation of their analysis output still requires expert knowledge. In this study, we propose a new method for identification of unimolecular G-quadruplexes and classification by secondary structures based on three-dimensional structural data. Briefly, coordinates of guanines are processed to identify tetrads, loops and bulges. Then, we present the secondary structure in the form of a depiction which shows the loop types, bulges, and guanines that participate in each tetrad. Moreover, CIIS-GQ identifies non-guanine nucleotides that join the G-tetrads and form multiplets. Finally, the results of our study are compared with DSSR and ElTetrado classification methods, and the advantages of the proposed depiction method for representing secondary structures are discussed. 
The source code of the method can be accessed via https://github.com/TugayDirek/CIIS-GQ.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":"38 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142524410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Steered molecular dynamics simulation as a post-process to optimize the iBRAB-designed Fab model","authors":"Phuc-Chau Do, Vy T. T. Le","doi":"10.1007/s10822-024-00575-z","DOIUrl":"10.1007/s10822-024-00575-z","url":null,"abstract":"<div><p>Therapeutic monoclonal antibodies are an effective method of treating acute infectious diseases. However, knowing which of the produced antibodies in the vast number of human antibodies can cure the disease requires a long time and advanced technology. The previously introduced <i>i</i>BRAB method relies on studied antibodies to design a broad-spectrum antibody capable of neutralizing antigens of many different Influenza A viral strains. To evaluate the antigen-binding fragment as an applicable drug, the therapeutic antibody profiles, which provide guidelines collected from clinical-stage therapeutic antibodies, were used to assess different measurements. Although the evaluated values were within an accepted range, modification of the amino acid sequence is required for better properties. Thus, using steered molecular dynamics (SMD) simulation to determine the binding capacity of amino acids in the functional region, the profile of Fab amino acids interacting with the antigen was established as a reference for modification. As a result, the model was modified by eliminating amino acids at positions 96–97 in the heavy chain and 26–27, 91, 96–97, and 102–103 in the light chain, which yielded better Therapeutic Antibody Profiler evaluations than the original design. 
Thus, SMD simulation is a promising computational approach for post-modification in rational drug design.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":"38 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142492596","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}