Molecular Informatics. Pub Date: 2024-10-01. Epub Date: 2024-07-24. DOI: 10.1002/minf.202400046
Guillaume Patient, Corentin Bedart, Naim A Khan, Nicolas Renault, Amaury Farce
{"title":"Distinct binding hotspots for natural and synthetic agonists of FFA4 from in silico approaches.","authors":"Guillaume Patient, Corentin Bedart, Naim A Khan, Nicolas Renault, Amaury Farce","doi":"10.1002/minf.202400046","DOIUrl":"10.1002/minf.202400046","url":null,"abstract":"<p><p>FFA4 has gained interest in recent years since its deorphanization in 2005 and the characterization of the Free Fatty Acids receptors family for their therapeutic potential in metabolic disorders. The expression of FFA4 (also known as GPR120) in numerous organs throughout the human body makes this receptor a highly potent target, particularly in fat sensing and diet preference. This offers an attractive approach to tackle obesity and related metabolic diseases. Recent cryo-EM structures of the receptor have provided valuable information for a potential active state, although previous studies of FFA4 presented diverging information. We performed molecular docking and molecular dynamics simulations of four agonist ligands, TUG-891, Linoleic acid, α-Linolenic acid, and Oleic acid, based on a homology model. Our simulations, totaling 2 μs, highlighted two binding hotspots at Arg99<sup>2.64</sup> and Lys293 (ECL3). The results indicate that the residues are located in separate areas of the binding pocket and interact with various types of ligands, implying different potential active states of FFA4 and a highly adaptable intra-receptor binding pocket. 
This article proposes additional structural characteristics and mechanisms for agonist binding that complement the experimental structures.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":2.8,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141752164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Molecular Informatics. Pub Date: 2024-10-01. Epub Date: 2024-08-07. DOI: 10.1002/minf.202400008
Shivam Kumar Vyas, Avik Das, Upadhyayula Suryanarayana Murty, Vaibhav A Dixit
{"title":"Sulfotransferase-mediated phase II drug metabolism prediction of substrates and sites using accessibility and reactivity-based algorithms.","authors":"Shivam Kumar Vyas, Avik Das, Upadhyayula Suryanarayana Murty, Vaibhav A Dixit","doi":"10.1002/minf.202400008","DOIUrl":"10.1002/minf.202400008","url":null,"abstract":"<p><p>Sulphotransferases (SULTs) are a major phase II metabolic enzyme class contributing ~20 % to the Phase II metabolism of FDA-approved drugs. Ignoring the potential for SULT-mediated metabolism leaves a strong potential for drug-drug interactions, often causing late-stage drug discovery failures or black-boxed warnings on FDA labels. The existing models use only accessibility descriptors and machine learning (ML) methods for class and site of sulfonation (SOS) predictions for SULT. In this study, a variety of accessibility, reactivity, and hybrid models and algorithms have been developed to make accurate substrate and SOS predictions. Unlike the literature models, reactivity parameters for the aliphatic or aromatic hydroxyl groups (R/Ar-O-H), the Bond Dissociation Energy (BDE) gave accurate models with a True Positive Rate (TPR)=0.84 for SOS predictions. We offer mechanistic insights to explain these novel findings that are not recognized in the literature. The accessibility parameters like the ratio of Chemgauss4 Score (CGS) and Molecular Weight (MW) CGS/MW and distance from cofactor (Dis) were essential for class predictions and showed TPR=0.72. Substrates consistently had lower BDE, Dis, and CGS/MW than non-substrates. Hybrid models also performed acceptably for SOS predictions. Using the best models, Algorithms gave an acceptable performance in class prediction: TPR=0.62, False Positive Rate (FPR)=0.24, Balanced accuracy (BA)=0.69, and SOS prediction: TPR=0.98, FPR=0.60, and BA=0.69. A rule-based method was added to improve the predictive performance, which improved the algorithm TPR, FPR, and BA. 
Validation using an external dataset of drug-like compounds gave class prediction: TPR=0.67, FPR=0.00, and SOS prediction: TPR=0.80 and FPR=0.44 for the best Algorithm. Comparisons with standard ML models also show that our algorithm shows higher predictive performance for classification on external datasets. Overall, these models and algorithms (SOS predictor) give accurate substrate class and site (SOS) predictions for SULT-mediated Phase II metabolism and will be valuable to the drug discovery community in academia and industry. The SOS predictor is freely available for academic/non-profit research via the GitHub link.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":2.8,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141897838","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
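The balanced accuracy values quoted in this abstract can be cross-checked from the reported TPR and FPR. The sketch below assumes the standard definition BA = (TPR + TNR) / 2 with TNR = 1 - FPR. The abstract does not state which formula the authors used, and the function name is illustrative only.

```python
def balanced_accuracy(tpr: float, fpr: float) -> float:
    """Balanced accuracy from true and false positive rates.

    Assumes the standard definition BA = (TPR + TNR) / 2, where
    TNR = 1 - FPR. This is an assumption, not taken from the abstract.
    """
    tnr = 1.0 - fpr  # true negative rate (specificity)
    return (tpr + tnr) / 2.0


# Values reported in the abstract for the best algorithms.
class_ba = balanced_accuracy(tpr=0.62, fpr=0.24)  # class prediction
sos_ba = balanced_accuracy(tpr=0.98, fpr=0.60)    # site-of-sulfonation prediction

print(round(class_ba, 2), round(sos_ba, 2))
```

Under this definition both reported cases round to 0.69, matching the BA values in the abstract. The SOS result pairs a very high TPR with a high FPR, so its balanced accuracy is no better than that of the class prediction.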
Molecular Informatics. Pub Date: 2024-09-01. Epub Date: 2024-06-12. DOI: 10.1002/minf.202300335
Colin Bournez, José-Manuel Gally, Samia Aci-Sèche, Philippe Bernard, Pascal Bonnet
{"title":"Virtual screening of natural products to enhance melanogenosis.","authors":"Colin Bournez, José-Manuel Gally, Samia Aci-Sèche, Philippe Bernard, Pascal Bonnet","doi":"10.1002/minf.202300335","DOIUrl":"10.1002/minf.202300335","url":null,"abstract":"<p><p>Natural products have long been an important source of inspiration for medicinal chemistry and drug discovery. In the cosmetic field, they remain the major elements of the composition and serve as a marketing asset. Recent research has implicated salt-inducible kinases in melanin production in skin via MITF regulation. Finding new potent modulators of such a target could open the way to several cosmetic applications to attenuate visible signs of photoaging and improve the tan without sun. Since virtual screening can be a powerful tool for detecting hit compounds in the early stages of a drug discovery process, we applied this method to salt-inducible kinase 2 to discover potentially interesting compounds. Here, we present the different steps, from the construction of a database of natural products to the validation of a docking protocol and the results of the virtual screening. Hits from the screening were tested in vitro to confirm their efficiency, and the results are discussed.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":2.8,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141306333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Molecular Informatics. Pub Date: 2024-09-01. Epub Date: 2024-07-08. DOI: 10.1002/minf.202300160
Shrilakshmi Sheshagiri Rao, Shankar V Kundapura, Debayan Dey, Chandrasekaran Palaniappan, Kanagaraj Sekar, Ananda Kulal, Udupi A Ramagopal
{"title":"Cumulative phylogenetic, sequence and structural analysis of Insulin superfamily proteins provide unique structure-function insights.","authors":"Shrilakshmi Sheshagiri Rao, Shankar V Kundapura, Debayan Dey, Chandrasekaran Palaniappan, Kanagaraj Sekar, Ananda Kulal, Udupi A Ramagopal","doi":"10.1002/minf.202300160","DOIUrl":"10.1002/minf.202300160","url":null,"abstract":"<p><p>The insulin superfamily proteins (ISPs), in particular insulin, IGFs and relaxin proteins, are key modulators of animal physiology. They are known to have evolved from the same ancestral gene and have diverged into proteins with varied sequences and distinct functions, but maintain a similar structural architecture stabilized by highly conserved disulphide bridges. The recent surge of sequence data and the structures of these proteins prompted a need for a comprehensive analysis that examines the evolution of these sequences (427 sequences) in the light of available functional and structural information, including representative complex structures of ISPs with their cognate receptors. This study (a) reveals unusually high sequence conservation of IGFs (>90 % conservation in 184 sequences) and provides a possible structure-based rationale for it; (b) provides an updated definition of the receptor-binding signature motif of the functionally diverse relaxin family members; and (c) identifies a probable non-canonical C-peptide cleavage site in a few insulin sequences. The high conservation of IGFs appears to represent a classic case of resistance to sequence diversity exerted by physiologically important interactions with multiple partners. We also propose a probable mechanism for C-peptide cleavage in a few distinct insulin sequences and redefine the receptor-binding signature motif of the relaxin family. 
Lastly, we provide a basis for minimally modified insulin mutants with potential therapeutic application, inspired by concomitant changes observed in other insulin superfamily protein members supported by molecular dynamics simulation.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":2.8,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141555195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prediction of blood-brain barrier permeability using machine learning approaches based on various molecular representation.","authors":"Li Liang, Zhiwen Liu, Xinyi Yang, Yanmin Zhang, Haichun Liu, Yadong Chen","doi":"10.1002/minf.202300327","DOIUrl":"10.1002/minf.202300327","url":null,"abstract":"<p><p>The assessment of compound blood-brain barrier (BBB) permeability poses a significant challenge in the discovery of drugs targeting the central nervous system. Conventional experimental approaches to measure BBB permeability are labor-intensive, cost-ineffective, and time-consuming. In this study, we constructed six machine learning classification models by combining various machine learning algorithms and molecular representations. The model based on ExtraTree algorithm and random partitioning strategy obtains the best prediction result, with AUC value of 0.932±0.004 and balanced accuracy (BA) of 0.837±0.010 for the test set. We employed the SHAP method to identify important features associated with BBB permeability. In addition, matched molecular pair (MMP) analysis and representative substructure derivation method were utilized to uncover the transformation rules and distinctive structural features of BBB permeable compounds. 
The machine learning models proposed in this work can serve as an effective tool for assessing BBB permeability in the drug discovery for central nervous system disease.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":2.8,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141306332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mykola V Protopopov, Valentyna V Tararina, Fanny Bonachera, Igor M Dzyuba, Anna Kapeliukha, Serhii Hlotov, Oleksii Chuk, Gilles Marcou, Olga Klimchuk, Dragos Horvath, Erik Yeghyan, Olena Savych, Olga O Tarkhanova, Alexandre Varnek, Yurii S Moroz
{"title":"The freedom space - a new set of commercially available molecules for hit discovery.","authors":"Mykola V Protopopov, Valentyna V Tararina, Fanny Bonachera, Igor M Dzyuba, Anna Kapeliukha, Serhii Hlotov, Oleksii Chuk, Gilles Marcou, Olga Klimchuk, Dragos Horvath, Erik Yeghyan, Olena Savych, Olga O Tarkhanova, Alexandre Varnek, Yurii S Moroz","doi":"10.1002/minf.202400114","DOIUrl":"https://doi.org/10.1002/minf.202400114","url":null,"abstract":"<p><p>The advent of high-performance virtual screening techniques nowadays allows drug designers to explore ultra-large sets of candidate compounds in search of molecules predicted to have desired properties. However, the success of such an endeavor heavily relies on the pertinence (drug-likeness and, foremost, chemical feasibility) of these candidates, or otherwise, virtual screening will return valueless \"hits\", by the garbage in/garbage out principle. The huge popularity of the judiciously enumerated Enamine REAL Space is clear proof of the strength of this Big Data trend in drug discovery. Here we describe a new dataset of make-on-demand compounds called the Freedom space. It follows the principles of Enamine REAL Space and contains highly feasible molecules (synthesis success rate over 75 percent). However, the scaffold and chemography analysis revealed significant differences to both the REAL and biologically annotated compounds from the ChEMBL database. 
The Freedom Space is a significant extension of the REAL Space and can be utilized for a more comprehensive exploration of the synthetically feasible chemical space in hit finding and hit-to-lead campaigns.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":2.8,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142018020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Philippe Gantzer, Ruben Staub, Yu Harabuchi, Satoshi Maeda, Alexandre Varnek
{"title":"Chemography-guided analysis of a reaction path network for ethylene hydrogenation with a model Wilkinson's catalyst.","authors":"Philippe Gantzer, Ruben Staub, Yu Harabuchi, Satoshi Maeda, Alexandre Varnek","doi":"10.1002/minf.202400063","DOIUrl":"https://doi.org/10.1002/minf.202400063","url":null,"abstract":"<p><p>Visualization and analysis of large chemical reaction networks become rather challenging when conventional graph-based approaches are used. As an alternative, we propose to use the chemical cartography (\"chemography\") approach, describing the data distribution on a 2-dimensional map. Here, the Generative Topographic Mapping (GTM) algorithm - an advanced chemography approach - has been applied to visualize the reaction path network of a simplified Wilkinson's catalyst-catalyzed hydrogenation containing some 10<sup>5</sup> structures generated with the help of the Artificial Force Induced Reaction (AFIR) method using either Density Functional Theory or Neural Network Potential (NNP) for potential energy surface calculations. 
Using new atoms permutation invariant 3D descriptors for structure encoding, we've demonstrated that GTM possesses the abilities to cluster structures that share the same 2D representation, to visualize potential energy surface, to provide an insight on the reaction path exploration as a function of time and to compare reaction path networks obtained with different methods of energy assessment.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":2.8,"publicationDate":"2024-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141910023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
I M Kashafutdinova, A Poyezzhayeva, T Gimadiev, T Madzhidov
{"title":"Active learning approaches in molecule pKi prediction.","authors":"I M Kashafutdinova, A Poyezzhayeva, T Gimadiev, T Madzhidov","doi":"10.1002/minf.202400154","DOIUrl":"https://doi.org/10.1002/minf.202400154","url":null,"abstract":"<p><p>During the early stages of drug design, identifying compounds with suitable bioactivities is crucial. Given the vast array of potential drug databases, it's feasible to assay only a limited subset of candidates. The optimal method for selecting the candidates, aiming to minimize the overall number of assays, involves an active learning (AL) approach. In this work, we benchmarked a range of AL strategies with two main objectives: (1) to identify a strategy that ensures high model performance and (2) to select molecules with desired properties using minimal assays. To evaluate the different AL strategies, we employed the simulated AL workflow based on \"virtual\" experiments. These experiments leveraged ChEMBL datasets, which come with known biological activity values for the molecules. Furthermore, for classification tasks, we proposed the hybrid selection strategy that unified both exploration and exploitation AL strategies into a single acquisition function, defined by parameters n and c. We have also shown that popular minimal margin and maximal variance selection approaches for exploration selection correspond to minimization of the hybrid acquisition function with n=1 and 2 respectively. The balance between the exploration and exploitation strategies can be adjusted using a coefficient (c), making the optimal strategy selection straightforward. The primary strength of the hybrid selection method lies in its adaptability; it offers the flexibility to adjust the criteria for molecule selection based on the specific task by modifying the value of the contribution coefficient. 
Our analysis revealed that, in regression tasks, AL strategies did not succeed in ensuring high model performance; however, they were successful in selecting molecules with desired properties using a minimal number of tests. In analogous experiments on classification tasks, the exploration strategy and the hybrid selection function with a constant c<1 (for n=1) and c≤0.2 (for n=2) were effective in achieving the goal of constructing a high-performance predictive model using minimal data. When searching for molecules with desired properties, the exploitation strategy and the hybrid function with c≥1 (n=1) and c≥0.7 (n=2) identified such molecules in fewer iterations than random selection. Notably, when the hybrid function was set to an intermediate coefficient value (c=0.7), it successfully addressed both tasks simultaneously.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":2.8,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141893849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
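The abstract describes a hybrid acquisition function with parameters n and c but does not give its formula. The sketch below is a hypothetical form chosen only for illustration, `|p - 0.5|**n - c * p`, where `p` is a binary classifier's predicted probability of the active class. It is not the authors' function. With c=0 it reduces to minimal-margin selection for n=1 and, for n=2, to an ordering equivalent to maximal variance p(1-p). Increasing c shifts selection toward high-probability candidates (exploitation). The names `hybrid_score` and `select` are likewise assumptions.

```python
def hybrid_score(p: float, n: int = 1, c: float = 0.0) -> float:
    """Hypothetical hybrid acquisition score (lower is better).

    |p - 0.5|**n is an uncertainty (exploration) term and c * p rewards
    predicted actives (exploitation). This functional form is an
    illustrative assumption; the abstract does not state the formula.
    """
    return abs(p - 0.5) ** n - c * p


def select(probs: list[float], k: int, n: int = 1, c: float = 0.0) -> list[int]:
    """Return indices of the k candidates with the lowest hybrid score."""
    ranked = sorted(range(len(probs)), key=lambda i: hybrid_score(probs[i], n, c))
    return ranked[:k]


# Predicted probabilities of activity for four unlabeled molecules (toy values).
probs = [0.05, 0.48, 0.70, 0.95]

print(select(probs, k=1, n=1, c=0.0))  # pure exploration: most uncertain candidate
print(select(probs, k=1, n=1, c=1.5))  # exploitation-dominated: most likely active
```

In this toy form, a small c selects the molecule nearest the decision boundary, and a large c selects the molecule predicted most likely to be active, which mirrors the qualitative behaviour the abstract attributes to the coefficient c.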