Andrew K. Gao , Trevor B. Chen , Valentina L. Kouznetsova , Igor F. Tsigelny
{"title":"Machine-learning-based virtual screening and ligand docking identify potent HIV-1 protease inhibitors","authors":"Andrew K. Gao , Trevor B. Chen , Valentina L. Kouznetsova , Igor F. Tsigelny","doi":"10.1016/j.aichem.2023.100014","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100014","url":null,"abstract":"<div><p>The human immunodeficiency virus type 1 (HIV-1) is a retrovirus that can cause acquired immunodeficiency syndrome (AIDS), severely weakening the immune system. The United Nations estimates that there are 37.7 million people with HIV worldwide. HIV-1 protease (PR) cleaves polyproteins to create the individual proteins that comprise an HIV virion. Inhibiting PR prevents the creation of new virions, rendering PR an attractive antiviral target. In the present study, a machine-learning regression model was constructed to predict pIC<sub>50</sub> bioactivity concentrations using data from 2547 experimentally characterized PR inhibitors. The model achieved a Pearson correlation coefficient of 0.88, an R-squared of 0.78, and an RMSE of 0.717 in pIC<sub>50</sub> units on unseen data using 199 high-variance PubChem substructure fingerprints. The SWEETLEAD database of approximately 4300 traditional medicine compounds and drugs from around the world was screened using the model. Fifty molecules were identified as highly potent, with pIC<sub>50</sub> of at least 7.301 (IC<sub>50</sub> <= 50 nM). Nine of these molecules, such as lopinavir and ritonavir, are known antiviral drugs. The highly potent molecules were ligand-docked to the 3D structure of HIV protease at the active site. Dihydroergotamine mesylate (daechu alkaloids) had a very strong binding affinity of −13.2, outperforming all known antiviral drugs that were tested. It was also predicted by the model to have an IC<sub>50</sub> of 9.16 nM, which is considered very low and desirable. 
Overall, this study demonstrates the use of machine-learning regression models for virtual screening and highlights several drugs with significant promise for repurposing against HIV-1. Future steps include testing dihydroergotamine mesylate and other candidates in vitro.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100014"},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49744902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Orders-of-coupling representation achieved with a single neural network with optimal neuron activation functions and without nonlinear parameter optimization","authors":"Sergei Manzhos, Manabu Ihara","doi":"10.1016/j.aichem.2023.100013","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100013","url":null,"abstract":"<div><p>Orders-of-coupling representations (representations of multivariate functions with low-dimensional functions that depend on subsets of original coordinates corresponding to different orders of coupling) are useful in many applications, for example in computational chemistry, especially where integration is needed. Examples include N-mode approximations and many-body expansions. Such representations can be conveniently built with machine learning methods, and methods that build the lower-dimensional terms of such representations with neural networks [e.g. Comput. Phys. Commun. 180 (2009) 2002] and Gaussian process regressions [e.g. Mach. Learn. Sci. Technol. 3 (2022) 01LT02] were proposed previously. Here, we show that neural network models of orders-of-coupling representations can be easily built by using a recently proposed neural network with optimal neuron activation functions computed with a first-order additive Gaussian process regression [arXiv:2301.05567] and avoiding non-linear parameter optimization. 
Examples are given of representations of molecular potential energy surfaces.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100013"},"PeriodicalIF":0.0,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49744706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chemical space navigation by machine learning models for discovering selective MAO-B enzyme inhibitors for Parkinson’s disease","authors":"P. Catherene Tomy, C. Gopi Mohan","doi":"10.1016/j.aichem.2023.100012","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100012","url":null,"abstract":"<div><p>Monoamine Oxidase-B (MAO-B) is a key neuroprotective target that breaks down neurotransmitters such as dopamine and releases highly reactive free radicals as a by-product. Its over-expression in the brain, observed due to ageing and neurodegenerative diseases, contributes to worsening neuronal degeneration. Because MAO-B is the primary enzyme for dopamine metabolism in <em>the substantia nigra</em> of the brain, and because efficient drug candidates are lacking, selective, reversible MAO-B inhibition is a hot topic of research in Parkinson’s disease (PD). This study developed machine learning (ML) models that predict the activity of experimentally tested indole and indazole derivatives against MAO-B using linear genetic function approximation (GFA) and two non-linear techniques, support vector machine (SVM) and artificial neural network (ANN). The ANN model, with an R<sup>2</sup> of 0.9704 for the training dataset, <span><math><mrow><msup><mrow><mi>q</mi></mrow><mrow><mn>2</mn></mrow></msup><mspace></mspace></mrow></math></span> of 0.9436 for cross-validation and <span><math><mrow><msup><mrow><mi>r</mi></mrow><mrow><mn>2</mn></mrow></msup><mspace></mspace></mrow></math></span> of 0.9025 for the test dataset, was identified as the best-performing ML model, with the seven significant molecular descriptors CATS2D_04_DA, CATS2D_05_DA, CATS3D_06_LL, Mor04u, Mor25m, P_VSA_v_2 and nO. The robust ML model was then employed to design novel MAO-B inhibitors with similar core scaffolds and to predict their biological activity. The ANN model was further employed in the virtual screening of 4356 molecules from the ChEMBL database. 
Applicability domain analysis and pharmacokinetic and toxicity profiles predicted three newly designed molecules (22 N, 23 N and 24 N) and two virtually screened best ChEMBL molecules as potential drug candidates using the ANN ML model. Molecular docking studies of the best-identified compounds were performed to understand the molecular mechanism of interactions having high binding energy and selectivity with the MAO-B enzyme. The current study shortlisted 5 potential lead compounds as potent and selective MAO-B inhibitors, which could further be carried forward for in vitro and in vivo studies to discover small molecules against neurodegenerative disease.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100012"},"PeriodicalIF":0.0,"publicationDate":"2023-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49744704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of artificial intelligence and machine learning in early detection of adverse drug reactions (ADRs) and drug-induced toxicity","authors":"Siyun Yang, Supratik Kar","doi":"10.1016/j.aichem.2023.100011","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100011","url":null,"abstract":"<div><p>Adverse drug reactions (ADRs) and drug-induced toxicity are major challenges in drug discovery, threatening patient safety and dramatically increasing healthcare expenditures. Since ADRs and toxicity are not as visible as infectious diseases, the potential consequences are considerable. Early detection of ADRs and drug-induced toxicity is an essential indicator of a drug's viability and safety profile. The introduction of artificial intelligence (AI) and machine learning (ML) approaches has resulted in a paradigm shift in the field of early ADR and toxicity detection. The application of these modern computational methods allows for the rapid, thorough, and precise prediction of probable ADRs and toxicity even before the drug’s practical synthesis as well as preclinical and clinical trials, resulting in more efficient and safer medications with a lower chance of drug withdrawal. The present review offers an in-depth examination of the role of AI and ML in the early detection of ADRs and toxicity, incorporating a wide range of methodologies ranging from data mining to deep learning, followed by a list of important databases, modeling algorithms, and software that could be used in modeling and predicting a series of ADRs and toxicity. This review also provides a complete reference to what has been performed and what might be accomplished in the field of AI and ML-based early identification of ADRs and drug-induced toxicity. 
By shedding light on the capabilities of these technologies, it highlights their enormous potential for revolutionizing drug discovery and improving patient safety.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100011"},"PeriodicalIF":0.0,"publicationDate":"2023-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49757852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An interpretable graph representation learning model for accurate predictions of drugs aqueous solubility","authors":"Qiufen Chen , Yuewei Zhang , Peng Gao , Jun Zhang","doi":"10.1016/j.aichem.2023.100010","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100010","url":null,"abstract":"<div><p>As increasingly more data science-driven approaches have been applied for compound properties predictions in the domain of drug discovery, such kinds of methods have displayed considerable accuracy compared to conventional ones. In this work, we proposed an interpretable graph learning representation model, SolubNet, for drug aqueous solubility prediction. The comprehensive evaluation demonstrated that SolubNet can successfully capture the quantitative structure-property relationship and can be interpreted with layer-wise relevance propagation (LRP) algorithm regarding how prediction values are generated from original input structures. The key advantage of SolubNet lies in the fact that it includes 3 layers of Topology Adaptive Graph Convolutional Networks which can efficiently perceive chemical local environments. SolubNet showed high performance in several tasks for drugs’ aqueous solubility prediction. LRP revealed that SolubNet can identify high and low polar regions of a given molecule, assigning them reasonable weights to predict the final solubility, in a way highly compatible with chemists’ intuition. 
We are confident that such a flexible yet interpretable and accurate tool will largely enhance the efficiency of drug discovery, and will even contribute to the methodology development of computational pharmaceutics.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100010"},"PeriodicalIF":0.0,"publicationDate":"2023-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49764057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discovery of novel CaMK-II inhibitor for the possible mitigation of arrhythmia through pharmacophore modelling, virtual screening, molecular docking, and toxicity prediction","authors":"Niyati Parekh , Sarthak Lakhani , Ayushi Patel , Dhyanesh Oza , Bhumika Patel , Ruchi Yadav , Udit Chaube","doi":"10.1016/j.aichem.2023.100009","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100009","url":null,"abstract":"<div><p>In the present research, a few well-known artificial intelligence tools were explored for efficient hit selection which could be further utilized for the discovery of CaMK-II inhibitors for the treatment of arrhythmia. To achieve the desired goals, pharmacophore modelling, database retrieval, molecular docking studies, and toxicity prediction were performed. Pharmacophore modelling was performed with the Pharmit open-source database which gave the features viz. Hydrogen Bond Donor, Hydrogen Bond Acceptor, and Hydrophobic. This pharmacophore was generated with the aid of the protein of CaMK-II (PDB ID: 2WEL) and co-crystallized ligand K88. Further, this generated pharmacophore was screened through the various Pharmit databases which include CHEMBL30, ChemDiv, ChemSpace, MCULE, MolPort, NCI Open Chemical Repository, Lab Network, and ZINC. Further, the top two hits from each database that have maximum similarity with the pharmacophore were selected for the molecular docking and ADMET studies. Among all the hits, CHEMBL 1952032 showed good binding interactions with CaMK-II. Also, it was found to be non-toxic upon evaluation through the OSIRIS property explorer. 
In the future, it can be explored against the CaMK-II for the development of novel CaMK-II inhibitors which can be used for the mitigation of arrhythmia.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100009"},"PeriodicalIF":0.0,"publicationDate":"2023-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49764056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Orders of coupling representations as a versatile framework for machine learning from sparse data in high-dimensional spaces","authors":"Sergei Manzhos , Tucker Carrington , Manabu Ihara","doi":"10.1016/j.aichem.2023.100008","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100008","url":null,"abstract":"<div><p>Machine learning (ML) techniques are already widely and increasingly used in diverse applications in science and technology, including computational chemistry. Specifically in computational chemistry, neural networks (NN) and kernel methods such as Gaussian process regressions (GPR) have been increasingly used for the construction of potential functions and functionals for density functional theory. While ML techniques have a number of advantages vs intuition-based models, notably their generality and black-box nature, they are still challenged when faced with high dimensionality of the feature space or low and uneven data density – in part because of their general nature. We review recent works using methods such as NNs and GPR as building blocks of composite methods in the framework of an expansion over orders of coupling. We introduce models using NN or GPR-based components as part of HDMR (high-dimensional model representations)-based structures. HDMR is a formalization of orders-of-coupling representations that include the many-body and N-mode representations well known in computational chemistry and allows, in particular, building all terms from one dataset of arbitrarily distributed data. 
The resulting HDMR-NN and HDMR-GPR combinations and NN with HDMR-GPR derived neuron activation functions not requiring non-linear optimization enhance machine learning capabilities in high-dimensional spaces and/or with sparse data.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100008"},"PeriodicalIF":0.0,"publicationDate":"2023-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49764054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How do centrality measures help to predict similarity patterns in molecular chemical structural graphs?","authors":"Nirmala Parisutham","doi":"10.1016/j.aichem.2023.100007","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100007","url":null,"abstract":"<div><p>The proposed work uses a heuristic method based on centrality measures to improve the efficiency of the solution for the similarity search problem in molecular chemical graphs by effectively identifying central candidate or representative candidate nodes, which simplify the complex processes involved in detecting a large-sized maximal common connected edge subgraph. After analyzing the structure of the two input molecular chemical graphs, a Tensor Product graph is created. This newly built graph is further analyzed to get the similarity pattern of the input graphs. It is an open problem to decide which centrality measure selects the best central candidate node in Tensor Product graphs to get a large maximal common connected edge graph, since each centrality measure analyses the given graph uniquely, based on its own specific aspects. The proposed work offers directions on using various centrality measures to determine a large-sized maximal common connected subgraph for two molecular chemical input graphs. It also analyses seven centrality measures to select the best candidate node in the Tensor Product graph of two input chemical molecular graphs. 
Based on the obtained results, the betweenness centrality and degree centrality measures exclusively help to get large-sized similarity patterns.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100007"},"PeriodicalIF":0.0,"publicationDate":"2023-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49764053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jonathan D. Hirst , Samuel Boobier , Jennifer Coughlan , Jessica Streets , Philippa L. Jacob , Oska Pugh , Ender Özcan , Simon Woodward
{"title":"ML meets MLn: Machine learning in ligand promoted homogeneous catalysis","authors":"Jonathan D. Hirst , Samuel Boobier , Jennifer Coughlan , Jessica Streets , Philippa L. Jacob , Oska Pugh , Ender Özcan , Simon Woodward","doi":"10.1016/j.aichem.2023.100006","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100006","url":null,"abstract":"<div><p>The benefits of using machine learning approaches in the design, optimisation and understanding of homogeneous catalytic processes are being increasingly realised. We focus on the understanding and implementation of key concepts, which serve as conduits to more advanced chemical machine learning literature, much of which is (presently) outside the area of homogeneous catalysis. Potential pitfalls in the ‘workflow’ procedures needed in the machine learning process are identified and all the examples provided are in a chemical sciences context, including several from ‘real world’ catalyst systems. Finally, potential areas of expansion and impact for machine learning in homogeneous catalysis in the future are considered.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100006"},"PeriodicalIF":0.0,"publicationDate":"2023-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49764052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ya Ju Fan , Jonathan E. Allen , Kevin S. McLoughlin , Da Shi , Brian J. Bennion , Xiaohua Zhang , Felice C. Lightstone
{"title":"Evaluating point-prediction uncertainties in neural networks for protein-ligand binding prediction","authors":"Ya Ju Fan , Jonathan E. Allen , Kevin S. McLoughlin , Da Shi , Brian J. Bennion , Xiaohua Zhang , Felice C. Lightstone","doi":"10.1016/j.aichem.2023.100004","DOIUrl":"10.1016/j.aichem.2023.100004","url":null,"abstract":"<div><p>Neural Network (NN) models provide potential to speed up the drug discovery process and reduce its failure rates. The success of NN models requires uncertainty quantification (UQ) as drug discovery explores chemical space beyond the training data distribution. Standard NN models do not provide uncertainty information. Some methods require changing the NN architecture or training procedure, limiting the selection of NN models. Moreover, predictive uncertainty can come from different sources. It is important to have the ability to separately model different types of predictive uncertainty, as the model can take assorted actions depending on the source of uncertainty. In this paper, we examine UQ methods that estimate different sources of predictive uncertainty for NN models aiming at protein-ligand binding prediction. We use our prior knowledge on chemical compounds to design the experiments. By utilizing a visualization method we create non-overlapping and chemically diverse partitions from a collection of chemical compounds. These partitions are used as training and test set splits to explore NN model uncertainty. 
We demonstrate how the uncertainties estimated by the selected methods describe different sources of uncertainty under different partitions and featurization schemes and the relationship to prediction error.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 1","pages":"Article 100004"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/9f/25/nihms-1912151.PMC10426331.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10019861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}