Molecular Informatics; Pub Date: 2024-07-01; Epub Date: 2024-06-12; DOI: 10.1002/minf.202300259
Milo Roucairol, Tristan Cazenave
{"title":"Comparing search algorithms on the retrosynthesis problem.","authors":"Milo Roucairol, Tristan Cazenave","doi":"10.1002/minf.202300259","DOIUrl":"10.1002/minf.202300259","url":null,"abstract":"In this article we evaluate two search algorithms, Nested Monte Carlo Search and Greedy Best First Search, on AiZynthFinder, AstraZeneca's open source retrosynthesis tool. We compare these algorithms to AiZynthFinder's base Monte Carlo Tree Search on a benchmark selected from the PubChem database and by Bayer's chemists. We show that both Nested Monte Carlo Search and Greedy Best First Search outperform AstraZeneca's Monte Carlo Tree Search, with a slight advantage for Nested Monte Carlo Search while experimenting on a playout heuristic. We also show that the search algorithms are bounded by the quality of the policy network; the next step toward better results is therefore to improve the policy network.","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":" ","pages":"e202300259"},"PeriodicalIF":2.8,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141306331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robert Fraczkiewicz, Huy Quoc Nguyen, Newton Wu, Nina Kausch‐Busies, Sergio Grimbs, Kai Sommer, Antonius ter Laak, Judith Günther, Björn Wagner, Michael Reutlinger
{"title":"Best of both worlds: An expansion of the state of the art pKa model with data from three industrial partners","authors":"Robert Fraczkiewicz, Huy Quoc Nguyen, Newton Wu, Nina Kausch‐Busies, Sergio Grimbs, Kai Sommer, Antonius ter Laak, Judith Günther, Björn Wagner, Michael Reutlinger","doi":"10.1002/minf.202400088","DOIUrl":"https://doi.org/10.1002/minf.202400088","url":null,"abstract":"In a unique collaboration between Simulations Plus and several industrial partners, we developed a new version 11.0 of the previously published in silico pKa model, S+pKa, with considerably improved prediction accuracy. The model's training set was vastly expanded by large amounts of experimental data obtained from F. Hoffmann‐La Roche AG, Genentech Inc., and the Crop Science division of Bayer AG. The previous v7.0 of S+pKa was trained on data from public sources and the Pharmaceutical division of Bayer AG. The model has shown dramatic improvements in predictive accuracy when externally validated on three new contributor compound sets. Less expected was v11.0’s improvement in prediction on new compounds developed at Bayer Pharma after v7.0 was released (2013–2023), even without contributing additional data to v11.0. 
We illustrate chemical space coverage by chemistries encountered in the five domains, public and industrial, outline model construction, and discuss factors contributing to the model's success.","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"86 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141504470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring drug repositioning possibilities of kinase inhibitors via molecular simulation","authors":"Qing‐Xin Wang, Jiao Cai, Zi‐Jun Chen, Jia‐Chuan Liu, Jing‐Jing Wang, Hai Zhou, Qing‐Qing Li, Zi‐Xuan Wang, Yi‐Bo Wang, Zhen‐Jiang Tong, Jin Yang, Tian‐Hua Wei, Meng‐Yuan Zhang, Yun Zhou, Wei‐Chen Dai, Ning Ding, Xue‐Jiao Leng, Xiao‐Ying Yin, Shan‐Liang Sun, Yan‐Cheng Yu, Nian‐Guang Li, Zhi‐Hao Shi","doi":"10.1002/minf.202300336","DOIUrl":"https://doi.org/10.1002/minf.202300336","url":null,"abstract":"Kinases, a class of enzymes that control the phosphorylation of various substrates, are pivotal in both physiological and pathological processes. Although their conserved ATP binding pockets pose challenges for achieving selectivity, this feature offers opportunities for drug repositioning of kinase inhibitors (KIs). This study presents a cost‐effective in silico prediction of KI drug repositioning by analyzing cross‐docking results. We established a KI database (278 unique KIs, 1834 bioactivity data points) and a kinase database (357 kinase structures categorized by the DFG motif) for carrying out cross‐docking. Comparative analysis of the docking scores and reported experimental bioactivity revealed that the Atypical, TK, and TKL superfamilies are suitable for drug repositioning. Among these kinase superfamilies, Olverematinib, Lapatinib, and Abemaciclib displayed enzymatic activity in our focused AKT‐PI3K‐mTOR pathway with IC50 values of 3.3, 3.2 and 5.8 μM. Further cell assays showed IC50 values of 0.2, 1.2 and 0.6 μM in tumor cells. 
The agreement between prediction and validation demonstrated that repositioning KIs via in silico methods is feasible.","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"27 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141504468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alejandro Gómez‐García, Ann‐Kathrin Prinz, Daniel A. Acuña Jiménez, William J. Zamora, Haruna L. Barazorda‐Ccahuana, Miguel Á. Chávez‐Fumagalli, Marilia Valli, Adriano D. Andricopulo, Vanderlan da S. Bolzani, Dionisio A. Olmedo, Pablo N. Solís, Marvin J. Núñez, Johny R. Rodríguez Pérez, Hoover A. Valencia Sánchez, Héctor F. Cortés Hernández, Oscar M. Mosquera Martinez, Oliver Koch, José L. Medina‐Franco
{"title":"Updating and profiling the natural product‐likeness of Latin American compound libraries","authors":"Alejandro Gómez‐García, Ann‐Kathrin Prinz, Daniel A. Acuña Jiménez, William J. Zamora, Haruna L. Barazorda‐Ccahuana, Miguel Á. Chávez‐Fumagalli, Marilia Valli, Adriano D. Andricopulo, Vanderlan da S. Bolzani, Dionisio A. Olmedo, Pablo N. Solís, Marvin J. Núñez, Johny R. Rodríguez Pérez, Hoover A. Valencia Sánchez, Héctor F. Cortés Hernández, Oscar M. Mosquera Martinez, Oliver Koch, José L. Medina‐Franco","doi":"10.1002/minf.202400052","DOIUrl":"https://doi.org/10.1002/minf.202400052","url":null,"abstract":"Compound databases of natural products play a crucial role in drug discovery and development projects and have implications in other areas, such as food chemical research, ecology and metabolomics. Recently, we put together the first version of the Latin American Natural Product database (LANaPDB) as a collective effort of researchers from six countries to assemble a public and representative library of natural products in a geographical region with a large biodiversity. The present work aims to conduct a comparative and extensive profiling of the natural product‐likeness of an updated version of LANaPDB and the ten individual compound databases that form part of LANaPDB. The natural product‐likeness profile of the Latin American compound databases is contrasted with the profile of other major natural product databases in the public domain and a set of small‐molecule drugs approved for clinical use. As part of the extensive characterization, we employed several chemoinformatics metrics of natural product likeness. 
The results of this study will capture the attention of the global community engaged in natural product databases, not only in Latin America but across the world.","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"12 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141504469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Molecular Informatics; Pub Date: 2024-06-01; Epub Date: 2024-06-08; DOI: 10.1002/minf.202400021
Bryan Dafniet, Olivier Taboureau
{"title":"Prediction of adverse drug reactions due to genetic predisposition using deep neural networks.","authors":"Bryan Dafniet, Olivier Taboureau","doi":"10.1002/minf.202400021","DOIUrl":"10.1002/minf.202400021","url":null,"abstract":"Drug development is a long and costly process, often limited by the toxicity and adverse drug reactions (ADRs) caused by drug candidates. Even on the market, some drugs can cause strong ADRs that can vary depending on individual polymorphisms. The development of genome-wide association studies (GWAS) allowed the discovery of genetic variants of interest that may cause these effects. In this study, the objective was to investigate a deep learning approach to predict genetic variations potentially related to ADRs. We used single nucleotide polymorphism (SNP) information from dbSNP to create a network based on ADR-drug-target-mutations and extracted interaction matrices to build deep neural network (DNN) models. Considering only information about mutations known to impact drug efficacy and drug safety from PharmGKB and drug adverse reactions based on the MedDRA System Organ Classes (SOCs), these DNN models reached a balanced accuracy of 0.61 on average. Including molecular fingerprints representing structural features of the drugs did not improve the performance of the models. To our knowledge, this is the first model that exploits DNN to predict ADR-drug-target-mutations. 
Although some improvements are suggested, these models can be of interest for analyzing multiple compounds across all accessible gene and polymorphism information, and thus pave the way toward precision medicine.","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":" ","pages":"e202400021"},"PeriodicalIF":3.6,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141293574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Molecular Informatics; Pub Date: 2024-06-01; Epub Date: 2024-06-08; DOI: 10.1002/minf.202300167
Abdulsalam Y Bande, Sefer Baday
{"title":"Accelerating Molecular Docking using Machine Learning Methods.","authors":"Abdulsalam Y Bande, Sefer Baday","doi":"10.1002/minf.202300167","DOIUrl":"10.1002/minf.202300167","url":null,"abstract":"Virtual screening (VS) is one of the well-established approaches in drug discovery; it speeds up the search for a bioactive molecule and reduces the costs and effort associated with experiments. VS helps to narrow down the chemical search space and allows selecting fewer and more probable candidate compounds for experimental testing. Docking calculations are one of the commonly used and highly appreciated structure-based drug discovery methods. Databases for chemical structures of small molecules have been growing rapidly. However, at the moment virtual screening of large libraries via docking is not very common. In this work, we aim to accelerate docking studies by predicting docking scores without explicitly performing docking calculations. We experimented with an attention-based long short-term memory (LSTM) neural network for an efficient prediction of docking scores, as well as other machine learning models such as XGBoost. By using docking scores of a small number of ligands we trained our models and predicted docking scores of a few million molecules. Specifically, we tested our approaches on 11 datasets that were produced from in-house drug discovery studies. On average, by training models using only 7000 molecules we predicted docking scores of approximately 3.8 million molecules with R² (coefficient of determination) of 0.77 and Spearman rank correlation coefficient of 0.85. We designed the system with ease of use in mind. 
All the user needs to provide is a CSV file containing SMILES and their respective docking scores; the system then outputs a model that the user can use to predict the docking score of a new molecule.","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":" ","pages":"e202300167"},"PeriodicalIF":3.6,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141293572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Molecular Informatics; Pub Date: 2024-06-01; Epub Date: 2024-06-08; DOI: 10.1002/minf.202300312
Myeonghyeon Jeong, Sunyong Yoo
{"title":"FetoML: Interpretable predictions of the fetotoxicity of drugs based on machine learning approaches.","authors":"Myeonghyeon Jeong, Sunyong Yoo","doi":"10.1002/minf.202300312","DOIUrl":"10.1002/minf.202300312","url":null,"abstract":"Pregnant females may use medications to manage health problems that develop during pregnancy or that they had prior to pregnancy. However, using medications during pregnancy poses a potential risk to the fetus. Assessing the fetotoxicity of drugs is essential to ensure safe treatments, but the current process is challenged by ethical issues, time, and cost. Therefore, the need for in silico models to efficiently assess the fetotoxicity of drugs has recently emerged. Previous studies have proposed successful machine learning models for fetotoxicity prediction and have even suggested molecular substructures that are possibly associated with fetotoxicity risks or protective effects. However, the interpretation of the decisions of the models on fetotoxicity prediction for each drug is still insufficient. This study constructed machine learning-based models that can predict the fetotoxicity of drugs while providing explanations for the decisions. For this, permutation feature importance was used to identify the general features that the model treated as significant in predicting the fetotoxicity of drugs. In addition, features associated with fetotoxicity for each drug were analyzed using the attention mechanism. The predictive performance of all the constructed models was high (AUROC: 0.854-0.974, AUPR: 0.890-0.975). Furthermore, we conducted literature reviews on the predicted important features and found that they were highly associated with fetotoxicity. 
We expect that our model will benefit fetotoxicity research by providing an evaluation of fetotoxicity risks for drugs or drug candidates, along with an interpretation of that prediction.","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":" ","pages":"e202300312"},"PeriodicalIF":3.6,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141293573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Molecular Informatics; Pub Date: 2024-06-01; Epub Date: 2024-06-08; DOI: 10.1002/minf.202300250
Xiyu Chen, Sigrid Leyendecker
{"title":"Kinematic analysis of kinases and their oncogenic mutations - Kinases and their mutation kinematic analysis.","authors":"Xiyu Chen, Sigrid Leyendecker","doi":"10.1002/minf.202300250","DOIUrl":"10.1002/minf.202300250","url":null,"abstract":"Protein kinases are crucial cellular enzymes that facilitate the transfer of phosphates from adenosine triphosphate (ATP) to their substrates, thereby regulating numerous cellular activities. Dysfunctional kinase activity often leads to oncogenic conditions. Using structural similarity to 5UG9, we selected 79 crystal structures from the PDB and, based on the position of the phenylalanine side chain in the DFG motif, classified them into 5 clusters. Our approach applies our kinematic flexibility analysis (KFA) to explore the flexibility of kinases in various activity states and examine the impact of the activation loop on kinase structure. KFA enables the rapid decomposition of macromolecules into different flexibility regions, allowing comprehensive analysis of conformational structures. The results reveal that the activation loop of kinases acts as a \"lock\" that stabilizes the active conformation of kinases by rigidifying the adjacent α-helices. Furthermore, we investigate specific kinase mutations, such as the L858R mutation commonly associated with non-small cell lung cancer, which induces increased flexibility in active-state kinases. In addition, through analyzing the hydrogen bond pattern, we examine the substructure of kinases in different states. Notably, active-state kinases exhibit a higher occurrence of α-helices compared to inactive-state kinases. 
This study contributes to the understanding of biomolecular conformation at a level relevant to drug development.","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":" ","pages":"e202300250"},"PeriodicalIF":3.6,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141288332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Molecular Informatics; Pub Date: 2024-05-01; Epub Date: 2023-04-07; DOI: 10.1002/minf.202200181
M Y Lobanov, M V Slizen, N V Dovidchenko, A V Panfilov, A A Surin, I V Likhachev, O V Galzitskaya
{"title":"Comparison of deep learning models with simple method to assess the problem of antimicrobial peptides prediction.","authors":"M Y Lobanov, M V Slizen, N V Dovidchenko, A V Panfilov, A A Surin, I V Likhachev, O V Galzitskaya","doi":"10.1002/minf.202200181","DOIUrl":"10.1002/minf.202200181","url":null,"abstract":"Antibiotic-resistant strains are an emerging threat to public health. The use of antimicrobial peptides (AMPs) is one of the promising approaches to solving this problem. The development of new AMPs requires reliable prediction methods. Recently, deep learning approaches have been used to predict AMPs. In this paper, we compare simple and complex methods for this purpose. We used the BERT transformer to create sequence embeddings and the multilayer perceptron (MLP) and light attention (LA) approaches for classification. One of them reached about 80% accuracy and specificity in benchmark testing, which is on par with the best available methods. For comparison, we proposed a simple method using only the amino acid composition of proteins or peptides. This method has shown good results, at the level of the best methods. We have prepared a dedicated server for predicting antimicrobial activity of peptides from their amino acid composition: http://bioproteom.protres.ru/antimicrob/.","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":" ","pages":"e202200181"},"PeriodicalIF":3.6,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9262665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}