Molecular Informatics | Pub Date: 2023-10-01 | Epub Date: 2023-08-21 | DOI: 10.1002/minf.202200275
Dmitry Zankov, Timur Madzhidov, Igor Baskin, Alexandre Varnek
{"title":"Conjugated quantitative structure-property relationship models: Prediction of kinetic characteristics linked by the Arrhenius equation.","authors":"Dmitry Zankov, Timur Madzhidov, Igor Baskin, Alexandre Varnek","doi":"10.1002/minf.202200275","DOIUrl":"10.1002/minf.202200275","url":null,"abstract":"<p><p>Conjugated QSPR models for reactions integrate fundamental chemical laws expressed by mathematical equations with machine learning algorithms. Herein we present a methodology for building conjugated QSPR models integrated with the Arrhenius equation. Conjugated QSPR models were used to predict kinetic characteristics of cycloaddition reactions related by the Arrhenius equation: rate constant log k, pre-exponential factor log A, and activation energy E<sub>a</sub>. They were benchmarked against single-task (individual and equation-based models) and multi-task models. In individual models, all characteristics were modeled separately, while in multi-task models log k, log A, and E<sub>a</sub> were treated cooperatively. 
An equation-based model assessed log k using the Arrhenius equation and the log A and E<sub>a</sub> values predicted by individual models. The conjugated QSPR models were shown to accurately predict reaction rate constants at extreme temperatures, at which rate constants can hardly be measured experimentally. Also, in the case of small training sets, conjugated models are more robust than related single-task approaches.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":3.6,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10029968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
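The equation-based baseline in the abstract above combines separately predicted log A and E_a values through the Arrhenius equation, log k = log A - E_a / (ln(10) * R * T). The following is a minimal sketch of that arithmetic only. The function name, the choice of kJ/mol for E_a, and the base-10 logarithm are assumptions made for illustration; the abstract does not state the units or conventions used in the paper.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)


def log_k_arrhenius(log_a: float, ea_kj_mol: float, temp_k: float) -> float:
    """Return log10(k) from log10(A), Ea (assumed kJ/mol) and T (kelvin)."""
    if temp_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    # Convert Ea to J/mol, then divide by ln(10)*R*T to stay in log10 units.
    return log_a - (ea_kj_mol * 1000.0) / (math.log(10) * R * temp_k)
```

Because temperature enters only through this closed-form term, predictions of log A and E_a can in principle be evaluated at any temperature, including ones outside the measured range.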
Molecular Informatics | Pub Date: 2023-10-01 | Epub Date: 2023-08-31 | DOI: 10.1002/minf.202300055
Sandipan Chakraborty, Chiranjeet Saha
{"title":"A multi-tier computational screening framework to effectively search the mutational space of SARS-CoV-2 receptor binding motif to identify mutants with enhanced ACE2 binding abilities.","authors":"Sandipan Chakraborty, Chiranjeet Saha","doi":"10.1002/minf.202300055","DOIUrl":"10.1002/minf.202300055","url":null,"abstract":"<p><p>SARS-CoV-2 gained crucial mutations at the receptor binding domain (RBD) that often changed the course of the pandemic, leading to new waves with increased case fatality. Variants are observed with enhanced transmission and immune evasion abilities. Thus, predicting future variants with enhanced transmission ability is a problem of utmost research interest. Here, we have developed a multi-tier exhaustive SARS-CoV-2 mutation screening platform combining MM/GBSA, extensive molecular dynamics simulations, and steered molecular dynamics to identify RBD mutants with enhanced ACE2 binding capability. We have identified four RBM mutations (F490K, S494K, G504F, and P499L) with significantly higher ACE2 binding abilities than wild-type RBD. Compared to wild-type RBD, they all form stable complexes with more hydrogen bonds and salt-bridge interactions with ACE2. Our simulation data suggest that these mutations allosterically alter the packing of the RBM interface of the RBD-ACE2 complex. 
As a result, the rupture force required to break the RBD-ACE2 contacts is significantly higher for these mutants.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":3.6,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10491472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Molecular Informatics | Pub Date: 2023-08-01 | Epub Date: 2023-06-16 | DOI: 10.1002/minf.202300061
Takuya Ehiro
{"title":"Feature importance-based interpretation of UMAP-visualized polymer space.","authors":"Takuya Ehiro","doi":"10.1002/minf.202300061","DOIUrl":"10.1002/minf.202300061","url":null,"abstract":"<p><p>Dimensionality reduction (DR) techniques are used for various purposes such as exploratory data analysis. Principal component analysis (PCA) is one of the most popular linear DR techniques. Owing to its linear nature, PCA enables the determination of axes in a low-dimensional space and the calculation of corresponding loading vectors. However, PCA cannot necessarily extract important features of non-linearly distributed data. This study presents a technique aimed at aiding the interpretation of data reduced through non-linear DR methods. In the proposed method, data reduced by a non-linear DR method were clustered via a density-based clustering method. Thereafter, the obtained cluster labels were classified by random forest (RF) classifiers. Further, the feature importance (FI) of the RF classifiers and Spearman's rank correlation coefficients between the predicted probabilities of the obtained clusters and the original feature values were utilized for characterizing the visualized dimensionally reduced data. The results revealed that the proposed method can provide interpretable FI-based images of the handwritten digits dataset. Moreover, the proposed method was also applied to the polymer dataset. The study found that incorporating signed FI was advantageous in achieving a meaningful interpretation. Furthermore, Gaussian process regression was utilized to produce intuitive FI-based heatmaps on a 2-dimensional space for greater ease of understanding. Additionally, to enhance the interpretability of the obtained clusters, a feature selection technique called Boruta was applied. The Boruta feature selection method worked effectively to interpret the obtained clusters with limited and commonly important features. 
Additionally, the study suggested that computing FI solely from substructure-based descriptors could further enhance the interpretability of the results. Finally, the automation of the proposed method was investigated, and through maximizing the target score based on the quality of both the DR and clustering, indicative results were automatically obtained for both the handwritten digits and polymer datasets.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":3.6,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10257338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
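The abstract above pairs random forest feature importance with Spearman's rank correlation to obtain a signed importance. One plausible reading, sketched below in pure Python, is to attach the sign of the correlation between a feature and the predicted cluster probability to the unsigned importance. The helper names and this exact sign rule are assumptions for illustration; the abstract does not specify how the paper combines the two quantities.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant input carries no rank information
    return cov / (var_x * var_y) ** 0.5


def signed_importance(importance, feature_values, cluster_probabilities):
    """Assumed rule: unsigned importance times the sign of Spearman's rho."""
    rho = spearman_rho(feature_values, cluster_probabilities)
    sign = (rho > 0) - (rho < 0)
    return sign * importance
```

Under this rule, a negative value marks a feature whose high values make membership of the cluster less likely, which an unsigned importance cannot express.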
Molecular Informatics | Pub Date: 2023-08-01 | Epub Date: 2023-08-09 | DOI: 10.1002/minf.202300006
Alex Nyporko, Olga Tsymbalyuk, Ivan Voiteshenko, Sergiy Starosyla, Mykola Protopopov, Volodymyr Bdzhola
{"title":"Computer-aided design of muscarinic acetylcholine receptor M3 inhibitors: Promising compounds among trifluoromethyl containing hexahydropyrimidinones/thiones.","authors":"Alex Nyporko, Olga Tsymbalyuk, Ivan Voiteshenko, Sergiy Starosyla, Mykola Protopopov, Volodymyr Bdzhola","doi":"10.1002/minf.202300006","DOIUrl":"10.1002/minf.202300006","url":null,"abstract":"<p><p>New highly selective mAChR M3 inhibitors with IC<sub>50</sub> values in the nanomolar range, which can serve as prototypes for effective COPD and asthma treatment drugs, were discovered with computational approaches among trifluoromethyl containing hexahydropyrimidinones/thiones. Compounds [6-(4-ethoxy-3-methoxy-phenyl)-4-hydroxy-2-thioxo-4-(trifluoromethyl)hexahydropyrimidin-5-yl]-phenyl-methanone (THPT-1) and 5-benzoyl-6-(3,4-dimethoxyphenyl)-4-hydroxy-4-(trifluoromethyl)hexahydropyrimidin-2-one (THPO-4) were shown to be highly effective (with IC<sub>50</sub> values of 1.62 ⋅ 10<sup>-7</sup> M and 3.09 ⋅ 10<sup>-9</sup> M, respectively) and, at the same concentrations, to significantly and competitively inhibit signal conduction through mAChR3 in comparison with ipratropium bromide, without significant effect on mAChR2, nicotinic cholinergic, and adrenergic receptors.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":3.6,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10227176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Molecular Informatics | Pub Date: 2023-08-01 | Epub Date: 2023-07-19 | DOI: 10.1002/minf.202300026
Mariia Radaeva, Helene Morin, Mohit Pandey, Fuqiang Ban, Maria Guo, Eric LeBlanc, Nada Lallous, Artem Cherkasov
{"title":"Novel Inhibitors of androgen receptor's DNA binding domain identified using an ultra-large virtual screening.","authors":"Mariia Radaeva, Helene Morin, Mohit Pandey, Fuqiang Ban, Maria Guo, Eric LeBlanc, Nada Lallous, Artem Cherkasov","doi":"10.1002/minf.202300026","DOIUrl":"10.1002/minf.202300026","url":null,"abstract":"<p><p>Androgen receptor (AR) inhibition remains the primary strategy to combat the progression of prostate cancer (PC). However, all clinically used AR inhibitors target the ligand-binding domain (LBD), which is highly susceptible to truncations through splicing or mutations that confer drug resistance. Thus, there exists an urgent need for AR inhibitors with novel modes of action. We therefore launched a virtual screening of an ultra-large chemical library to find novel inhibitors of the AR DNA-binding domain (DBD) at two sites: the protein-DNA interface (P-box) and the dimerization site (D-box). The compounds selected through rigorous computational filtering were then experimentally validated. We identified several novel chemotypes that effectively suppress the transcriptional activity of AR and its splice variant V7. The identified compounds represent previously unexplored chemical scaffolds with a mechanism of action that evades the conventional drug resistance manifested through LBD mutations. 
Additionally, we describe the binding features required to inhibit AR DBD at both P-box and D-box target sites.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":3.6,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10218099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Molecular Informatics | Pub Date: 2023-08-01 | Epub Date: 2023-08-21 | DOI: 10.1002/minf.202300019
Dimitra-Danai Varsou, Haralambos Sarimveis
{"title":"Deimos: A novel automated methodology for optimal grouping. Application to nanoinformatics case studies.","authors":"Dimitra-Danai Varsou, Haralambos Sarimveis","doi":"10.1002/minf.202300019","DOIUrl":"10.1002/minf.202300019","url":null,"abstract":"<p><p>In this study we present deimos, a computational methodology for optimal grouping, applied to the read-across prediction of engineered nanomaterials' (ENMs) toxicity-related properties. The method is based on the formulation and solution of a mixed-integer linear programming (MILP) problem that automatically and simultaneously performs feature selection, defines the grouping boundaries according to the response variable, and develops linear regression models in each group. For each group/region, a characteristic centroid is defined in order to allocate untested ENMs to the groups. The deimos MILP problem is integrated in a broader optimization workflow that selects the best-performing methodology among standard multiple linear regression (MLR), least absolute shrinkage and selection operator (LASSO) models, and the proposed deimos multiple-region model. The performance of the suggested methodology is demonstrated through application to benchmark ENM datasets and comparison with other predictive modelling approaches. 
However, the proposed method can be applied to property prediction for chemical entities other than ENMs and is not limited to ENM toxicity prediction.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":3.6,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10575617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
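The deimos abstract above describes allocating untested materials to groups via a characteristic centroid and then applying that group's linear regression model. The sketch below illustrates only this prediction step, not the MILP that finds the groups. Euclidean distance to the centroid, the data layout, and all names and numbers are assumptions for illustration; the abstract does not state the allocation metric.

```python
import math


def nearest_group(x, centroids):
    """Return the key of the centroid closest to x (Euclidean distance, assumed)."""
    return min(centroids, key=lambda g: math.dist(x, centroids[g]))


def predict(x, centroids, models):
    """Allocate x to a group, then apply that group's linear model.

    models maps group -> (intercept, coefficients); values are hypothetical.
    """
    group = nearest_group(x, centroids)
    intercept, coefs = models[group]
    return group, intercept + sum(c * v for c, v in zip(coefs, x))


# Hypothetical two-group example with two descriptors per material.
centroids = {"low": (0.0, 0.0), "high": (10.0, 10.0)}
models = {"low": (1.0, (0.5, 0.0)), "high": (2.0, (0.0, 1.0))}
```

For instance, `predict((1.0, 2.0), centroids, models)` allocates the point to the hypothetical `"low"` group and evaluates that group's model only.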
{"title":"De novo drug design based on patient gene expression profiles via deep learning.","authors":"Chikashige Yamanaka, Shunya Uki, Kazuma Kaitoh, Michio Iwata, Yoshihiro Yamanishi","doi":"10.1002/minf.202300064","DOIUrl":"10.1002/minf.202300064","url":null,"abstract":"<p><p>Computational de novo drug design is a challenging issue in medicine, and it is desirable to consider all of the relevant information of the biological systems in a disease state. Here, we propose a novel computational method to generate drug candidate molecular structures from patient gene expression profiles via deep learning, which we call DRAGONET. Our model can generate new molecules that are likely to counteract disease-specific gene expression patterns in patients, which is made possible by exploring the latent space constructed by a transformer-based variational autoencoder and integrating the substructures of disease-correlated molecules. We applied DRAGONET to generate drug candidate molecules for gastric cancer, atopic dermatitis, and Alzheimer's disease, and demonstrated that the newly generated molecules were chemically similar to registered drugs for each disease. This approach is applicable to diseases with unknown therapeutic target proteins and will make a significant contribution to the field of precision medicine.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":3.6,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10576146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ashok Kumar, Ravi Singh, Powsali Ghosh, Ankit Ganeshpurkar, *. Asha, Rayala Swetha, Ravi Singh, Dileep Kumar, Sudheer Kumar Singh
{"title":"Natural‐Language Processing (NLP) based feature extraction technique in Deep‐Learning model to predict the Blood‐Brain‐Barrier permeability of molecules","authors":"Ashok Kumar, Ravi Singh, Powsali Ghosh, Ankit Ganeshpurkar, *. Asha, Rayala Swetha, Ravi Singh, Dileep Kumar, Sudheer Kumar Singh","doi":"10.1002/minf.202200271","DOIUrl":"https://doi.org/10.1002/minf.202200271","url":null,"abstract":"Blood‐Brain‐Barrier (BBB) permeability is one of the critical factors in the success and failure of CNS drug development. The most accurate method of measuring BBB permeability involves clinical experiments, which are labour‐intensive and time‐consuming. Thus, numerous efforts were made to use artificial intelligence (AI) to predict molecules′ BBB permeability. Most of the previous models are based on calculated descriptors and molecular fingerprints. In the present work, we have developed an NLP‐based feature extraction technique in Deep‐Learning models to predict BBB permeability. We have used the B3DB database and generated SELFIES to extract features from the molecules. We have employed word level and N‐gram tokenization to represent words as numeric vectors. The extracted features were fed into several Artificial Neural Network (ANN) and Bi‐directional Long Short‐Term Memory (LSTM) models. The model ANN‐10, built using ANN and 6‐gram tokenization, performed best on the independent test set. The accuracy, precision, recall, F1, specificity and AUC of ROC scores were found to be 0.89, 0.91, 0.91, 0.91, 0.85 and 0.90, respectively. 
Thus, the developed model can be used for the early screening of CNS drugs.","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":3.6,"publicationDate":"2023-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41857571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
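The abstract above mentions N-gram tokenization of SELFIES strings to turn molecules into numeric vectors. The sketch below shows one way this could work: split a SELFIES string into its bracketed symbols, slide a window of n symbols over it, and map each n-gram to an integer index. Treating bracketed symbols as the n-gram unit, reserving index 0 for unseen n-grams, and all function names are assumptions for illustration; the abstract does not describe the paper's tokenizer in this detail.

```python
import re
from collections import Counter


def selfies_symbols(selfies):
    """Split a SELFIES string such as '[C][C][O]' into bracketed symbols."""
    return re.findall(r"\[[^\]]*\]", selfies)


def ngrams(symbols, n):
    """Return all contiguous runs of n symbols, each joined into one token."""
    return ["".join(symbols[i:i + n]) for i in range(len(symbols) - n + 1)]


def build_vocab(corpus, n):
    """Map each n-gram seen in the corpus to an index >= 1 (0 means unknown)."""
    counts = Counter(g for s in corpus for g in ngrams(selfies_symbols(s), n))
    return {g: i + 1 for i, (g, _) in enumerate(counts.most_common())}


def encode(selfies, vocab, n):
    """Encode a SELFIES string as a list of n-gram indices."""
    return [vocab.get(g, 0) for g in ngrams(selfies_symbols(selfies), n)]
```

With n = 2 and a vocabulary built from `"[C][C][O]"` alone, `encode("[N][C][O]", vocab, 2)` yields 0 for the unseen `[N][C]` followed by the index of `[C][O]`. The resulting integer sequences would still need padding to a fixed length before being fed to a network.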
{"title":"The VEGA web service: multipurpose online tools for molecular modelling and docking analyses.","authors":"Alessandro Pedretti, Serena Vittorio, Emanuela Sabato, Giulio Vistoli, Angelica Mazzolari","doi":"10.1002/minf.202300018","DOIUrl":"https://doi.org/10.1002/minf.202300018","url":null,"abstract":"<p><p>The paper presents the VEGA Online web service, which includes a set of freely available tools derived from the development of the VEGA suite of programs. In detail, the paper focuses on two tools: the VEGA Web Edition (WE) and the Score tool. The former is a versatile file format converter including relevant features for 2D/3D conversion, for surface mapping and for editing/preparing input files. The Score application allows rescoring docking poses and in particular includes the MLP Interactions Scores (MLPInS) for describing hydrophobic interactions. To the best of our knowledge, this web service is the only available resource by which one can calculate both the virtual log P of a given input molecule according to the MLP approach and the corresponding MLP surface.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":3.6,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9790706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alicja Gawalska, Natalia Czub, Michał Sapa, Marcin Kołaczkowski, Adam Bucki, Aleksander Mendyk
{"title":"Application of automated machine learning in the identification of multi-target-directed ligands blocking PDE4B, PDE8A, and TRPA1 with potential use in the treatment of asthma and COPD.","authors":"Alicja Gawalska, Natalia Czub, Michał Sapa, Marcin Kołaczkowski, Adam Bucki, Aleksander Mendyk","doi":"10.1002/minf.202200214","DOIUrl":"https://doi.org/10.1002/minf.202200214","url":null,"abstract":"<p><p>Asthma and COPD are characterized by complex pathophysiology associated with chronic inflammation, bronchoconstriction, and bronchial hyperresponsiveness resulting in airway remodeling. A possible comprehensive solution that could fully counteract the pathological processes of both diseases is the use of rationally designed multi-target-directed ligands (MTDLs), combining PDE4B and PDE8A inhibition with TRPA1 blockade. The aim of the study was to develop AutoML models to search for novel MTDL chemotypes blocking PDE4B, PDE8A, and TRPA1. Regression models were developed for each of the biological targets using \"mljar-supervised\". On their basis, virtual screenings of commercially available compounds derived from the ZINC15 database were performed. A common group of compounds placed within the top results was selected as potential novel chemotypes of multifunctional ligands. This study represents the first attempt to discover potential MTDLs inhibiting these three biological targets. 
The obtained results demonstrate the usefulness of AutoML methodology for identifying hits in large compound databases.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":null,"pages":null},"PeriodicalIF":3.6,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9796310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
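The abstract above selects "a common group of compounds placed within the top results" of three per-target virtual screenings. A minimal sketch of such a consensus step follows: rank compounds per target by predicted activity, keep the top n for each target, and intersect the sets. The assumption that a higher predicted value means a more active compound, the cutoff n, and the compound identifiers and scores are all hypothetical; the abstract does not give the actual selection rule.

```python
def top_n_ids(predictions, n):
    """Return the ids of the n compounds with the highest predicted value."""
    ranked = sorted(predictions.items(), key=lambda kv: kv[1], reverse=True)
    return {compound_id for compound_id, _ in ranked[:n]}


def consensus_hits(per_target_predictions, n):
    """Compounds that appear in the top n for every target."""
    top_sets = [top_n_ids(p, n) for p in per_target_predictions.values()]
    return set.intersection(*top_sets) if top_sets else set()


# Hypothetical predicted activities (higher = more active) for four compounds.
screen = {
    "PDE4B": {"cmpd_a": 7.9, "cmpd_b": 6.1, "cmpd_c": 7.2, "cmpd_d": 5.0},
    "PDE8A": {"cmpd_a": 6.8, "cmpd_b": 7.5, "cmpd_c": 7.0, "cmpd_d": 4.2},
    "TRPA1": {"cmpd_a": 7.1, "cmpd_b": 5.5, "cmpd_c": 6.9, "cmpd_d": 6.0},
}
```

Calling `consensus_hits(screen, 2)` on this toy data keeps only the compound that ranks in the top two for all three targets.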