{"title":"Identification of bile salt export pump inhibitors using machine learning: Predictive safety from an industry perspective","authors":"Raquel Rodríguez-Pérez, Grégori Gerebtzoff","doi":"10.1016/j.ailsci.2021.100027","DOIUrl":"10.1016/j.ailsci.2021.100027","url":null,"abstract":"<div><p>Bile salt export pump (BSEP) is a transporter that moves bile salts from hepatocytes into bile canaliculi. BSEP inhibition can result in the toxic accumulation of bile salts in the liver, which has been identified as a risk factor of drug-induced liver injury (DILI). Since DILI is a frequent cause of drug withdrawals from the market or failings in drug development, <em>in vitro</em> BSEP activity is measured with the [<sup>3</sup>H]taurocholate uptake assay and a half-maximal inhibitory concentration (IC<sub>50</sub>) higher than 30 µM is advised. Herein, a machine learning classification model was developed to accurately detect BSEP inhibitors and help in the prioritization of <em>in vitro</em> testing. Regression models for the numerical prediction of IC<sub>50</sub> values were also generated. Classification and regression models for BSEP inhibition have been evaluated on realistic settings, which is critical prior to ML-based decision making in drug discovery programs. 
This work illustrates how predictive safety can help in early toxicity risk assessment and compound prioritization by leveraging Novartis historical experimental data.</p></div>","PeriodicalId":72304,"journal":{"name":"Artificial intelligence in the life sciences","volume":"1 ","pages":"Article 100027"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2667318521000271/pdfft?md5=015967de1c7a203aefebbda4387e6f24&pid=1-s2.0-S2667318521000271-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43336869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computational prediction of frequent hitters in target-based and cell-based assays","authors":"Conrad Stork , Neann Mathai , Johannes Kirchmair","doi":"10.1016/j.ailsci.2021.100007","DOIUrl":"10.1016/j.ailsci.2021.100007","url":null,"abstract":"<div><p>Compounds interfering with high-throughput screening (HTS) assay technologies (also known as “badly behaving compounds”, “bad actors”, “nuisance compounds” or “PAINS”) pose a major challenge to early-stage drug discovery. Many of these problematic compounds are “frequent hitters”, and we have recently published a set of machine learning models (“Hit Dexter 2.0”) for flagging such compounds.</p><p>Here we present a new generation of machine learning models which are derived from a large, manually curated and annotated data set. For the first time, these models cover, in addition to target-based assays, also cell-based assays. Our experiments show that cell-based assays behave indeed differently from target-based assays, with respect to hit rates and frequent hitters, and that dedicated models are required to produce meaningful predictions. In addition to these extensions and refinements, we explored a variety of additional setups for modeling, including the combination of four machine learning classifiers (i.e. k-nearest neighbors (KNN), extra trees, random forest and multilayer perceptron) with four sets of descriptors (Morgan2 fingerprints, Morgan3 fingerprints, MACCS keys and 2D physicochemical property descriptors).</p><p>Testing on holdout data as well as data sets of “dark chemical matter” (i.e. compounds that have been extensively tested in biological assays but have never shown activity) and known bad actors show that the multilayer perceptron classifiers in combination with Morgan2 fingerprints outperform other setups in most cases. The best multilayer perceptron classifiers obtained Matthews correlation coefficients of up to 0.648 on holdout data. 
These models are available via a free web service.</p></div>","PeriodicalId":72304,"journal":{"name":"Artificial intelligence in the life sciences","volume":"1 ","pages":"Article 100007"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/j.ailsci.2021.100007","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113386911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Introducing artificial intelligence in the life sciences","authors":"Mingyue Zheng , Carolina Horta Andrade , Jürgen Bajorath","doi":"10.1016/j.ailsci.2021.100001","DOIUrl":"https://doi.org/10.1016/j.ailsci.2021.100001","url":null,"abstract":"","PeriodicalId":72304,"journal":{"name":"Artificial intelligence in the life sciences","volume":"1 ","pages":"Article 100001"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/j.ailsci.2021.100001","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136694523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An in silico pipeline for the discovery of multitarget ligands: A case study for epi-polypharmacology based on DNMT1/HDAC2 inhibition","authors":"Fernando D. Prieto-Martínez , Eli Fernández-de Gortari , José L. Medina-Franco , L. Michel Espinoza-Fonseca","doi":"10.1016/j.ailsci.2021.100008","DOIUrl":"10.1016/j.ailsci.2021.100008","url":null,"abstract":"<div><p>The search for novel therapeutic compounds remains an overwhelming task owing to the time-consuming and expensive nature of the drug development process and low success rates. Traditional methodologies that rely on the one drug-one target paradigm have proven insufficient for the treatment of multifactorial diseases, leading to a shift to multitarget approaches. In this emerging paradigm, molecules with off-target and promiscuous interactions may result in preferred therapies. In this study, we developed a general pipeline combining machine learning algorithms and a deep generator network to train a dual inhibitor classifier capable of identifying putative pharmacophoric traits. As a case study, we focused on dual inhibitors targeting DNA methyltransferase 1 (DNMT) and histone deacetylase 2 (HDAC2), two enzymes that play a central role in epigenetic regulation. We used this approach to identify dual inhibitors from a novel large natural product database in the public domain. We used docking and atomistic simulations as complementary approaches to establish the ligand-interaction profiles between the best hits and DNMT1/HDAC2. By using the combined ligand- and structure-based approaches, we discovered two promising novel scaffolds that can be used to simultaneously target both DNMT1 and HDAC2. 
We conclude that the flexibility and adaptability of the proposed pipeline has predictive capabilities of similar or derivative methods and is readily applicable to the discovery of small molecules targeting many other therapeutically relevant proteins.</p></div>","PeriodicalId":72304,"journal":{"name":"Artificial intelligence in the life sciences","volume":"1 ","pages":"Article 100008"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/j.ailsci.2021.100008","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9530984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chemistry-centric explanation of machine learning models","authors":"Raquel Rodríguez-Pérez , Jürgen Bajorath","doi":"10.1016/j.ailsci.2021.100009","DOIUrl":"10.1016/j.ailsci.2021.100009","url":null,"abstract":"","PeriodicalId":72304,"journal":{"name":"Artificial intelligence in the life sciences","volume":"1 ","pages":"Article 100009"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S266731852100009X/pdfft?md5=6bf9c6213d02c78ea314eab068194508&pid=1-s2.0-S266731852100009X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48664977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Novel computational models offer alternatives to animal testing for assessing eye irritation and corrosion potential of chemicals","authors":"Arthur C. Silva , Joyce V.V.B. Borba , Vinicius M. Alves , Steven U.S. Hall , Nicholas Furnham , Nicole Kleinstreuer , Eugene Muratov , Alexander Tropsha , Carolina Horta Andrade","doi":"10.1016/j.ailsci.2021.100028","DOIUrl":"10.1016/j.ailsci.2021.100028","url":null,"abstract":"<div><p>Eye irritation and corrosion are fundamental considerations in developing chemicals to be used in or near the eye, from cleaning products to ophthalmic solutions. Unfortunately, animal testing is currently the standard method to identify compounds that cause eye irritation or corrosion. Yet, there is growing pressure on the part of regulatory agencies both in the USA and abroad to develop New Approach Methodologies (NAMs) that help reduce the need for animal testing and address unmet need to modernize safety evaluation of chemical hazards. In furthering the development and applications of computational NAMs in chemical safety assessment, in this study we have collected the largest expertly curated dataset of compounds tested for eye irritation and corrosion, and employed this data to build and validate binary and multi-classification Quantitative Structure-Activity Relationships (QSAR) models that can reliably assess eye irritation/corrosion potential of novel untested compounds. QSAR models were generated with Random Forest (RF) and Multi-Descriptor Read Across (MuDRA) machine learning (ML) methods, and validated using a 5-fold external cross-validation protocol. These models demonstrated high balanced accuracy (CCR of 0.68–0.88), sensitivity (SE of 0.61–0.84), positive predictive value (PPV of 0.65–0.90), specificity (SP of 0.56–0.91), and negative predictive value (NPV of 0.68–0.85). 
Overall, MuDRA models outperformed RF models and were applied to predict compounds’ irritation/corrosion potential from the Inactive Ingredient Database, which contains components present in FDA-approved drug products, and from the Cosmetic Ingredient Database, the European Commission source of information on cosmetic substances. All models built and validated in this study are publicly available at the STopTox web portal (<span>https://stoptox.mml.unc.edu/</span>). These models can be employed as reliable tools for identifying potential eye irritant/corrosive compounds.</p></div>","PeriodicalId":72304,"journal":{"name":"Artificial intelligence in the life sciences","volume":"1 ","pages":"Article 100028"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9355119/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40588277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Current status of active learning for drug discovery","authors":"Jie Yu , Xutong Li , Mingyue Zheng","doi":"10.1016/j.ailsci.2021.100023","DOIUrl":"10.1016/j.ailsci.2021.100023","url":null,"abstract":"<div><p>Active learning has been widely used in drug discovery and design in recent years. In this viewpoint, we will briefly summarize applications of AL for drug discovery and propose two potential limitations of research in this field.</p></div>","PeriodicalId":72304,"journal":{"name":"Artificial intelligence in the life sciences","volume":"1 ","pages":"Article 100023"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2667318521000234/pdfft?md5=4b66ffe5aa91d2b4ff6b1d0f8fc4a84c&pid=1-s2.0-S2667318521000234-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46279614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine learning in agriculture domain: A state-of-art survey","authors":"Vishal Meshram , Kailas Patil , Vidula Meshram , Dinesh Hanchate , S.D. Ramkteke","doi":"10.1016/j.ailsci.2021.100010","DOIUrl":"10.1016/j.ailsci.2021.100010","url":null,"abstract":"<div><p>Food is considered as a basic need of human being which can be satisfied through farming. Agriculture not only fulfills humans’ basic needs, but also considered as source of employment worldwide. Agriculture is considered as a backbone of economy and source of employment in the developing countries like India. Agriculture contributes 15.4% in the GDP of India. Agriculture activities are broadly categorized into three major areas: pre-harvesting, harvesting and post harvesting. Advancement in area of machine learning has helped improving gains in agriculture. Machine learning is the current technology which is benefiting farmers to minimize the losses in the farming by providing rich recommendations and insights about the crops. This paper presents an extensive survey of latest machine learning application in agriculture to alleviate the problems in the three areas of pre-harvesting, harvesting and post-harvesting. 
Application of machine learning in agriculture allows more efficient and precise farming with less human manpower with high quality production.</p></div>","PeriodicalId":72304,"journal":{"name":"Artificial intelligence in the life sciences","volume":"1 ","pages":"Article 100010"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2667318521000106/pdfft?md5=d2887b03e3cdff4a52c5bc0462338732&pid=1-s2.0-S2667318521000106-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46325215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BeeToxAI: An artificial intelligence-based web app to assess acute toxicity of chemicals to honey bees","authors":"José T. Moreira-Filho , Rodolpho C. Braga , Jade Milhomem Lemos , Vinicius M. Alves , Joyce V.V.B. Borba , Wesley S. Costa , Nicole Kleinstreuer , Eugene N. Muratov , Carolina Horta Andrade , Bruno J. Neves","doi":"10.1016/j.ailsci.2021.100013","DOIUrl":"10.1016/j.ailsci.2021.100013","url":null,"abstract":"<div><p>Chemically induced toxicity is the leading cause of recent extinction of honey bees. In this regard, we developed an innovative artificial intelligence-based web app (BeeToxAI) for assessing the acute toxicity of chemicals to <em>Apis mellifera</em>. Initially, we developed and externally validated QSAR models for classification (external set accuracy ∼91%) through the combination of Random Forest and molecular fingerprints to predict the potential for chemicals to cause acute contact toxicity and acute oral toxicity to honey bees. Then, we developed and externally validated regression QSAR models (<span><math><msup><mi>R</mi><mn>2</mn></msup></math></span> = 0.75) using Feedforward Neural Networks (FNNs). Afterward, the best models were implemented in the publicly available BeeToxAI web app (<span>http://beetoxai.labmol.com.br/</span>). The outputs of BeeToxAI are: toxicity predictions with estimated confidence, applicability domain estimation, and color-coded maps of relative structure fragment contributions to toxicity. As an additional assessment of BeeToxAI performance, we collected an external set of pesticides with known bee toxicity that were not included in our modeling dataset. BeeToxAI classification models were able to predict four out of five pesticides correctly. The acute contact toxicity model correctly predicted all of the eight pesticides. 
Here we demonstrate that BeeToxAI can be used as a rapid new approach methodology for predicting acute toxicity of chemicals in honey bees.</p></div>","PeriodicalId":72304,"journal":{"name":"Artificial intelligence in the life sciences","volume":"1 ","pages":"Article 100013"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2667318521000131/pdfft?md5=f4b6e96a7da27f813679c0aab8f1014d&pid=1-s2.0-S2667318521000131-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48100929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantifying sources of uncertainty in drug discovery predictions with probabilistic models","authors":"Stanley E. Lazic , Dominic P. Williams","doi":"10.1016/j.ailsci.2021.100004","DOIUrl":"10.1016/j.ailsci.2021.100004","url":null,"abstract":"<div><p>Knowing the uncertainty in a prediction is critical when making expensive investment decisions and when patient safety is paramount, but machine learning (ML) models in drug discovery typically only provide a single best estimate and ignore all sources of uncertainty. Predictions from these models may therefore be over-confident, which can put patients at risk and waste resources when compounds that are destined to fail are further developed. Probabilistic predictive models (PPMs) can incorporate all sources of uncertainty and they return a distribution of predicted values that represents the uncertainty in the prediction. We describe seven sources of uncertainty in PPMs: data, distribution function, mean function, variance function, link function(s), parameters, and hyperparameters. We use toxicity prediction as a running example, but the same principles apply for all prediction models. The consequences of ignoring uncertainty and how PPMs account for uncertainty are also described. We aim to make the discussion accessible to a broad non-mathematical audience. 
Equations are provided to make ideas concrete for mathematical readers (but can be skipped without loss of understanding) and code is available for computational researchers (<span>https://github.com/stanlazic/ML_uncertainty_quantification</span>).</p></div>","PeriodicalId":72304,"journal":{"name":"Artificial intelligence in the life sciences","volume":"1 ","pages":"Article 100004"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/j.ailsci.2021.100004","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90695567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}