Linke He , Yulong Fu , Shaoyi Hou , Guoqiang Wang , Jiabao Zhao , Yipeng Xing , Shuhua Li , Jing Ma
{"title":"Reaction condition- and functional group-specific knowledge discovery: Data- and computation-based analysis on transition-metal-free transformation of organoborons","authors":"Linke He , Yulong Fu , Shaoyi Hou , Guoqiang Wang , Jiabao Zhao , Yipeng Xing , Shuhua Li , Jing Ma","doi":"10.1016/j.aichem.2023.100034","DOIUrl":"10.1016/j.aichem.2023.100034","url":null,"abstract":"<div><p>Gaining insights into overarching trends in chemical reaction systems is crucial for refining reaction conditions and developing novel reactions. Such knowledge includes preferences for certain reagents, solvents, and functional group tolerance rules. Traditionally, synthetic chemists have relied on extensive literature searching to acquire this knowledge, a process that is both time-consuming and laborious. To streamline this process, we construct a standardized dataset and knowledge graph on an emerging domain, transition-metal-free transformations with organoborons. The dataset, compiled from organic reaction literature, includes comprehensive details of reaction scopes and conditions. The subsequent construction of a knowledge graph offers a visual representation of the reactions and their interrelationships. Through knowledge graph-based hierarchical analysis and density functional theory (DFT) calculations, we revealed the currently most frequently used reactants, synthetic conditions, and functional group rules in this field. We anticipate this knowledge graph-based approach will accelerate the acquisition and transfer of chemical reaction knowledge, catalyzing the discovery of new reactions. 
This work provides an automatic and adaptive framework for extracting key insights from reaction datasets to inform the design of novel reactions.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"2 1","pages":"Article 100034"},"PeriodicalIF":0.0,"publicationDate":"2023-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000349/pdfft?md5=c4bedd7068acf7555c4e457d139943df&pid=1-s2.0-S2949747723000349-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138619235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Eric Paquet , Farzan Soleymani , Gabriel St-Pierre-Lemieux , Herna Lydia Viktor , Wojtek Michalowski
{"title":"QuantumBound – Interactive protein generation with one-shot learning and hybrid quantum neural networks","authors":"Eric Paquet , Farzan Soleymani , Gabriel St-Pierre-Lemieux , Herna Lydia Viktor , Wojtek Michalowski","doi":"10.1016/j.aichem.2023.100030","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100030","url":null,"abstract":"<div><p>This paper presents a new approach for protein generation based on one-shot learning and hybrid quantum neural networks. Given a single protein complex, the system learns how to predict the remaining unknown properties, without resorting to autoregression, from the physicochemical properties of the receptor and a prior on the physicochemical properties of the ligand. In contrast with other approaches, QuantumBound learns from a single instance, not from a large dataset, as is common in deep learning. By knowing half of the properties of the ligand, the system can predict the remaining half with an average relative error of 1.43% for a dataset consisting of one hundred and twenty Covid-19 spikes complexes. To the best of our knowledge, this is the first time that one-shot learning and hybrid quantum computing have been applied to protein generation.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"2 1","pages":"Article 100030"},"PeriodicalIF":0.0,"publicationDate":"2023-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000301/pdfft?md5=7d7af911816c956f9e7248de8a335e1a&pid=1-s2.0-S2949747723000301-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138489573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A machine learning approach for predicting the reactivity power of hypervalent iodine compounds","authors":"Vaneet Saini , Ramesh Kataria, Shruti Rajput","doi":"10.1016/j.aichem.2023.100032","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100032","url":null,"abstract":"<div><p>The knowledge of chemical reactivity of substrates is a prerequisite to accurately design a chemical reaction; however, it has been a challenging task due to the slow trial-and-error experimental approaches and the high computational cost associated with in silico investigations. Artificial intelligence techniques could serve as an alternative to efficiently determine the relative reactivity of chemical entities. In the context of this research, we propose an artificial neural network model to predict the bond dissociation energies of hypervalent iodine reagents. An open-source cheminformatics package, namely, Mordred, was employed for calculating various 1D, 2D and topological descriptors. The approach utilizes a dataset of more than 1000 hypervalent iodine reagents, and the bond dissociation energies can be predicted with a remarkable accuracy, as suggested by an R<sup>2</sup> score of 0.97 and a mean absolute error of 1.96 kcal/mol. Owing to the low cost and high efficiency, this machine learning approach can provide an alternative to the theoretical/experimental approaches to rationally design a chemical reaction and without having to go through the hassle of high-throughput experimentation to reach the desired reaction outcome. In an effort to make the model interpretable, a feature importance algorithm was applied, which identified descriptors contributing most to the development of the model. 
Features describing electronegativity and polarizability are some of the important contributors to the model’s training.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"2 1","pages":"Article 100032"},"PeriodicalIF":0.0,"publicationDate":"2023-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000325/pdfft?md5=a1dd6d2ca6039f146d3c2a643cbb05b8&pid=1-s2.0-S2949747723000325-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138489574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shijie Tao , Yi Feng , Wenmin Wang , Tiantian Han , Pieter E.S. Smith , Jun Jiang
{"title":"A machine learning protocol for geometric information retrieval from molecular spectra","authors":"Shijie Tao , Yi Feng , Wenmin Wang , Tiantian Han , Pieter E.S. Smith , Jun Jiang","doi":"10.1016/j.aichem.2023.100031","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100031","url":null,"abstract":"<div><p>Geometric information of molecules is closely related to their properties, and vibrational spectroscopy, as a common and powerful analytical tool for determining molecular structure, can assist in gaining precise geometric information. Traditional methods used to delineate spectrum-structure correlations are often expensive, time-consuming, and require extensive professional expertise. In this work, we used a machine learning protocol to construct a map from spectra to molecular geometric structures, and employed Grad-CAM, a convolutional network interpretation technology, to analyze which kinds of chemical information are important for determining our model’s results. The results obtained for six small molecules of differing structures demonstrate that the model is capable of (1) extracting the crucial spectral features that are vital to downstream tasks without necessitating any manual preprocessing, and (2) enabling retrieval of molecular structural information with high precision.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"2 1","pages":"Article 100031"},"PeriodicalIF":0.0,"publicationDate":"2023-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000313/pdfft?md5=8aed6656166ef3e340a5e81d46b42a1c&pid=1-s2.0-S2949747723000313-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138474987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Beihong Ji, Yuhui Wu, Elena N. Thomas, Jocelyn N. Edwards, Xibing He, Junmei Wang
{"title":"Predicting anti-SARS-CoV-2 activities of chemical compounds using machine learning models","authors":"Beihong Ji, Yuhui Wu, Elena N. Thomas, Jocelyn N. Edwards, Xibing He, Junmei Wang","doi":"10.1016/j.aichem.2023.100029","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100029","url":null,"abstract":"<div><p>To accelerate the discovery of novel drug candidates for Coronavirus Disease 2019 (COVID-19) therapeutics, we report a series of machine learning (ML)-based models to accurately predict the anti-SARS-CoV-2 activities of screening compounds. We explored 6 popular ML algorithms in combination with 15 molecular descriptors for molecular structures from 9 screening assays in the COVID-19 OpenData Portal hosted by NCATS. As a result, the models constructed by k-nearest neighbors (KNN) using the molecular descriptor GAFF+RDKit achieved the best overall performance with the highest average accuracy of 0.68 and relatively high average area under the receiver operating characteristic curve of 0.74, better than other ML algorithms. Meanwhile, the KNN models for all assays using the GAFF+RDKit descriptor outperformed those using other descriptors. The overall performance of our developed models was better than REDIAL-2020 (<strong>R</strong>). A web server (<span>https://clickff.org/amberweb/covid-19-cp</span>) was developed to enable users to predict anti-SARS-CoV-2 activities of arbitrary compounds using the COVID-19-CP (<strong>P</strong>) models. Besides the descriptor-based machine learning models, we also developed graph-based Attentive FP (<strong>A</strong>) models for the 9 assays. We found that the Attentive FP models achieved a comparable performance to that of COVID-19-CP and outperformed the REDIAL-2020 models. 
The consensus prediction utilizing both COVID-19-CP and Attentive FP can significantly boost the prediction accuracy, as assessed by comparing its performance with the three individual models (<strong>R</strong>, <strong>P</strong>, <strong>A</strong>) using the Wilcoxon signed-rank test, and thus can ultimately improve the success rate of COVID-19 drug discovery.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100029"},"PeriodicalIF":0.0,"publicationDate":"2023-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000295/pdfft?md5=6026439e3da02cfb256ffaa4b8f13538&pid=1-s2.0-S2949747723000295-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138436572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Oyawale Adetunji Moses , Mukhtar Lawan Adam , Zijian Chen , Collins Izuchukwu Ezeh , Hao Huang , Zhuo Wang , Zixuan Wang , Boyuan Wang , Wentao Li , Chensu Wang , Zongyou Yin , Yang Lu , Xue-Feng Yu , Haitao Zhao
{"title":"Machine learning and robot-assisted synthesis of diverse gold nanorods via seedless approach","authors":"Oyawale Adetunji Moses , Mukhtar Lawan Adam , Zijian Chen , Collins Izuchukwu Ezeh , Hao Huang , Zhuo Wang , Zixuan Wang , Boyuan Wang , Wentao Li , Chensu Wang , Zongyou Yin , Yang Lu , Xue-Feng Yu , Haitao Zhao","doi":"10.1016/j.aichem.2023.100028","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100028","url":null,"abstract":"<div><p>The challenge of data-driven synthesis of advanced nanomaterials can be minimized by using machine learning algorithms to optimize synthesis parameters and expedite the innovation process. In this study, a high-throughput robotic platform was employed to synthesize over 1356 gold nanorods with varying aspect ratios via a seedless approach. The developed models guided us in synthesizing gold nanorods with customized morphology, resulting in highly repeatable morphological yield with quantifiable structure-modulating precursor adjustments. The study provides insight into the dynamic relationships between key structure-modulating precursors and the structural morphology of gold nanorods based on the expected aspect ratio. The high-throughput robotic platform-fabricated gold nanorods demonstrated precise aspect ratio control when spectrophotometrically investigated and further validated with the transmission electron microscopy characterization. 
These findings demonstrate the potential of high-throughput robot-assisted synthesis and machine learning for optimizing the synthesis of gold nanorods, and they aided the development of models that can guide the synthesis of gold nanorods with the desired morphology.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100028"},"PeriodicalIF":0.0,"publicationDate":"2023-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000283/pdfft?md5=8511642b616c7b56dec42d00c89c3ede&pid=1-s2.0-S2949747723000283-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138448082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Si-Min Qi , Tao Bo , Lei Zhang , Zhi-Fang Chai , Wei-Qun Shi
{"title":"Machine-learning-driven simulations on microstructure, thermodynamic properties, and transport properties of LiCl-KCl-LiF molten salt","authors":"Si-Min Qi , Tao Bo , Lei Zhang , Zhi-Fang Chai , Wei-Qun Shi","doi":"10.1016/j.aichem.2023.100027","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100027","url":null,"abstract":"<div><p>The thermodynamic and transport properties of high-temperature chloride molten salt systems are of great significance for spent fuel reprocessing in the field of nuclear energy engineering. Here, by using the machine-learning-based deep potential (DP) method, we train a high-precision force field model for the LiCl-KCl-LiF system. During force field training, adding new data through multiple iterations improves the accuracy of the force field model and its applicability to more configurations. The comparison of density functional theory (DFT) and DP results for the test dataset indicates that our trained DP model has the same accuracy as DFT. Then, we comprehensively investigate the local structure, thermophysical properties, and transport properties of the LiCl-KCl and LiCl-KCl-LiF molten salt systems using the trained DP model. The effects of temperature and LiF concentration on the above properties are analyzed. 
This work provides guidance for the training of machine learning force fields in molten salt systems and the study of basic physical properties of high-temperature chloride molten salt systems.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"2 1","pages":"Article 100027"},"PeriodicalIF":0.0,"publicationDate":"2023-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000271/pdfft?md5=036ccca1e342d34c04c5cc6fb6e73f01&pid=1-s2.0-S2949747723000271-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138484545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chen Qu , Paul L. Houston , Qi Yu , Priyanka Pandey , Riccardo Conte , Apurba Nandi , Joel M. Bowman
{"title":"Machine learning software to learn negligible elements of the Hamiltonian matrix","authors":"Chen Qu , Paul L. Houston , Qi Yu , Priyanka Pandey , Riccardo Conte , Apurba Nandi , Joel M. Bowman","doi":"10.1016/j.aichem.2023.100025","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100025","url":null,"abstract":"<div><p>As a follow-up to our recent Communication in the Journal of Chemical Physics [J. Chem. Phys. 159 071101 (2023)], we report and make available the Jupyter Notebook software here. This software performs binary machine learning classification (MLC) with the goal of learning negligible Hamiltonian matrix elements for vibrational dynamics. We illustrate its usefulness for a Hamiltonian matrix for H<sub>2</sub>O by using three MLC algorithms: Random Forest, Support Vector Machine, and Multi-layer Perceptron.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100025"},"PeriodicalIF":0.0,"publicationDate":"2023-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000258/pdfft?md5=aae23141726aebcb5969aecabfb1ff8f&pid=1-s2.0-S2949747723000258-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138430215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A machine learning classification model for cholesterol-lowering peptides","authors":"Jose Isagani B. Janairo","doi":"10.1016/j.aichem.2023.100026","DOIUrl":"10.1016/j.aichem.2023.100026","url":null,"abstract":"<div><p>Cholesterol-lowering peptides (CLPs) are bioactive biomolecules often derived from food proteins. These short peptides bind with bile acids, leading to decreased intestinal absorption of cholesterol. CLPs are promising bioceuticals that can possibly be used to support interventions for the management of high cholesterol. Integrating machine learning (ML) in the screening and discovery workflow for CLPs can reduce trial and error, thereby accelerating the overall process and increasing its efficiency. In this study, a support vector machine model that can distinguish CLPs from non-CLPs is presented. The model was built on a diverse dataset of 1840 peptides, with sequence lengths ranging from 4 to 7. The ML model only needs 8 features (VHSE scores), and the most important features were found to be related to peptide polarity and hydrophobicity based on feature importance analysis utilizing Shapley and permutation-based methods. The formulated ML classifier is reliable, as demonstrated by AUC >0.7 for a diverse test dataset and AUC >0.9 for a conservative validation dataset composed mainly of the top and bottom CLPs. 
Overall, the presented ML model presents incremental yet meaningful advances to the application of ML for understanding the nature of CLPs, and their discovery and development.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100026"},"PeriodicalIF":0.0,"publicationDate":"2023-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S294974772300026X/pdfft?md5=0835f2ca55b7c8185903061e3f9f59c0&pid=1-s2.0-S294974772300026X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135764267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Development and application of in silico models to design new antibacterial 5-amino-4-cyano-1,3-oxazoles against colistin-resistant E. coli strains","authors":"Ivan Semenyuta, Diana Hodyna, Vasyl Kovalishyn, Bohdan Demydchuk, Maryna Kachaeva, Stepan Pilyo, Volodymyr Brovarets, Larysa Metelytsia","doi":"10.1016/j.aichem.2023.100024","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100024","url":null,"abstract":"<div><p>Here we describe the results of QSAR analysis based on artificial neural networks, synthesis, activity evaluation and molecular docking of a number of 1,3-oxazole derivatives as anti-E. coli antibacterials. All developed QSAR models showed excellent statistics on training (with determination coefficient q<sup>2</sup> as 0.76 ± 0.01) and test samples (with q<sup>2</sup> as 0.78 ± 0.01). The models were successfully used to identify nine novel 5-amino-4-cyano-1,3-oxazoles with potential anti-E. coli activity. All nine 1,3-oxazoles with predicted high antibacterial potential showed different levels of anti-E. coli in vitro activity. 5-amino-4-cyano-1,3-oxazoles <strong>1</strong> and <strong>3</strong> showed the highest antibacterial activity on average from 17 to 27 mm against MDR, hemolytic MDR and ATCC 25922 <em>E. coli</em> colistin-resistant strains, respectively. The comparative docking analysis demonstrated a possible mechanism of the antibacterial action of the studied 1,3-oxazoles <strong>1</strong> and <strong>3</strong> through inhibition of <em>E. coli</em> enoyl-ACP reductase (ENR) involved in the biosynthesis of bacterial fatty acids. The localization type is shown of 5-amino-4-cyano-1,3-oxazoles <strong>1</strong> and <strong>3</strong> into the <em>E. coli</em> ENR active site with estimated binding energy from −10.1 to −9.5 kcal/mol and hydrogen bonds formation with key amino acids similar to Triclosan. 
These facts confirm the validity of the hypothesis put forward about the potential antibacterial mechanism of 5-amino-4-cyano-1,3-oxazoles.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100024"},"PeriodicalIF":0.0,"publicationDate":"2023-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000246/pdfft?md5=c9085bc34142109bacab7efa22188c7f&pid=1-s2.0-S2949747723000246-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91987332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}