{"title":"Prediction of 19F NMR chemical shift by machine learning","authors":"Yao Li , Wen-Shuo Huang , Li Zhang , Dan Su , Haoran Xu , Xiao-Song Xue","doi":"10.1016/j.aichem.2024.100043","DOIUrl":"https://doi.org/10.1016/j.aichem.2024.100043","url":null,"abstract":"<div><p>Fluorine-19 (<sup>19</sup>F) is a nucleus of great importance in the field of Nuclear Magnetic Resonance (NMR) spectroscopy due to its high receptivity and wide chemical shift dispersion. <sup>19</sup>F NMR plays crucial roles in both organic synthesis and biomedicine. Herein, a machine learning-based comprehensive <sup>19</sup>F NMR chemical shift prediction model was established based on the experimental <sup>19</sup>F NMR dataset from the book by Dolbier and the open NMR database nmrshiftdb2. Fluorine radical SMILES (Fr-SMILES), which reflects fluorine chemical equivalence, was designed as the representation of fluorine in the molecule. A model trained with the graph convolution network (GCN) algorithm gave a low mean absolute error (MAE) of 3.636 ppm on the testing set. This model exhibits broad applicability and can effectively predict <sup>19</sup>F NMR shifts for a wide range of organic fluorine molecules. We believe that the current work will provide a powerful tool for not only predicting <sup>19</sup>F NMR shifts but also aiding in the analysis and identification of these shifts in diverse organic fluorine compounds. 
An online prediction platform was constructed based on the current model, which can be found at <span>https://fluobase.cstspace.cn/fnmr</span>.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"2 1","pages":"Article 100043"},"PeriodicalIF":0.0,"publicationDate":"2024-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747724000010/pdfft?md5=264d1a1fb39301258e870d87dfd75fce&pid=1-s2.0-S2949747724000010-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139107524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unveiling the impact of axial ligands on Fe-N-C complexes through DFT simulation and machine learning analysis","authors":"Hong-Yi Wang, Jirui Jin, Mingjie Liu","doi":"10.1016/j.aichem.2023.100041","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100041","url":null,"abstract":"<div><p>Single-atom catalysts (SACs), featuring isolated metal atoms embedded in graphitic carbon materials, have attracted considerable research interest due to their cost-effectiveness, high catalytic activity, and customizable functionality across various catalytic reactions. Among SACs, the Fe-N<sub>4</sub>-C class has garnered significant attention. Tailoring the properties of Fe-N<sub>4</sub> sites through localized chemical modifications stands as a key strategy for catalyst engineering. Recent experimental and computational investigations have underscored the distinct influence of axial ligands on Fe in modulating the oxygen reduction reaction (ORR) activity. However, the precise quantitative structure-property relationship between ligands and the catalytic properties of the Fe center remains elusive. In this study, we combined the density functional theory (DFT) simulations and machine learning (ML) models to unravel the relationship between the ligand properties and the oxo binding energy. This energy pertains to the binding of an oxygen atom to the Fe center, a fundamental step in ORR. Through the design of 33 ligands and 5 molecular complexes that accommodate the Fe-N<sub>4</sub> moiety, we screened a total of 278 oxo binding energies across an array of ligands and host complexes. Harnessing the power of ML models, we achieved an accurate prediction of these oxo binding energies using features collected from DFT simulations. Notably, the predominant features contributing to the oxo binding energy prediction primarily derived from complexes with attached ligands, rather than isolated ligand properties. 
We formulated an approach that leverages these critical features and identified the isolated ligand properties capable of effectively predicting these features. This methodology can potentially be applied to investigate other ORR intermediates and a comprehensive understanding of the ligand effect for the ORR activity in SACs can be achieved.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"2 1","pages":"Article 100041"},"PeriodicalIF":0.0,"publicationDate":"2024-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000416/pdfft?md5=a1922204ca0ef56a175357a9c2778026&pid=1-s2.0-S2949747723000416-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139107525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Building a DFT+U machine learning interatomic potential for uranium dioxide","authors":"Elizabeth Stippell , Lorena Alzate-Vargas , Kashi N. Subedi , Roxanne M. Tutchton , Michael W.D. Cooper , Sergei Tretiak , Tammie Gibson , Richard A. Messerly","doi":"10.1016/j.aichem.2023.100042","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100042","url":null,"abstract":"<div><p>Despite uranium dioxide (UO<sub>2</sub>) being a widely used nuclear fuel, fuel performance models rely extensively on empirical correlations of material behavior, leveraging the historical operating experience of UO<sub>2</sub>. Mechanistic models that consider an atomistic understanding of the processes governing fuel performance (such as fission gas release and creep) will enable a better description of fuel behavior under non-prototypical conditions such as in new reactor concepts or for modified UO<sub>2</sub> fuel compositions. To this end, molecular dynamics simulation is a powerful tool for rapidly predicting physical properties of proposed fuel candidates. However, the reliability of these simulations depends largely on the accuracy of the atomic forces. Traditionally, these forces are computed using either a classical force field (FF) or density functional theory (DFT). While DFT is relatively accurate, the computational cost is burdensome, especially for <em>f</em>-electron elements, such as actinides. By contrast, classical FFs are computationally efficient but are less accurate. For these reasons, we report a new accurate machine learning interatomic potential (MLIP) for UO<sub>2</sub> that provides high-fidelity reproduction of DFT forces at a similar low cost to classical FFs. We employ an active learning approach that autonomously augments the DFT training data set to iteratively refine the MLIP. To further improve the quality of our predictions, we utilize transfer learning to retrain our MLIP to higher-accuracy DFT+U data. 
We validate our MLIPs by comparing predicted physical properties (e.g., thermal expansion and elastic properties) with those from existing classical FFs and DFT/DFT+U calculations, as well as with experimental data when available.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"2 1","pages":"Article 100042"},"PeriodicalIF":0.0,"publicationDate":"2023-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000428/pdfft?md5=b4a181e648f961d53d4c25f1bedd0f01&pid=1-s2.0-S2949747723000428-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139100192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine learning models to predict ligand binding affinity for the orexin 1 receptor","authors":"Vanessa Y. Zhang , Shayna L. O’Connor , William J. Welsh , Morgan H. James","doi":"10.1016/j.aichem.2023.100040","DOIUrl":"10.1016/j.aichem.2023.100040","url":null,"abstract":"<div><p>The orexin 1 receptor (OX1R) is a G-protein coupled receptor that regulates a variety of physiological processes through interactions with the neuropeptides orexin A and B. Selective OX1R antagonists exhibit therapeutic effects in preclinical models of several behavioral disorders, including drug seeking and overeating. However, currently there are no selective OX1R antagonists approved for clinical use, fueling demand for novel compounds that act at this target. In this study, we meticulously curated a dataset comprising over 1300 OX1R ligands using a stringent filter and criteria cascade. Subsequently, we developed highly predictive quantitative structure-activity relationship (QSAR) models employing the optimized hyper-parameters for the random forest machine learning algorithm and twelve 2D molecular descriptors selected by recursive feature elimination with a 5-fold cross-validation process. The predictive capacity of the QSAR model was further assessed using an external test set and enrichment study, confirming its high predictivity. The practical applicability of our final QSAR model was demonstrated through virtual screening of the DrugBank database. This revealed two FDA-approved drugs (isavuconazole and cabozantinib) as potential OX1R ligands, confirmed by radiolabeled OX1R binding assays. 
To our best knowledge, this study represents the first report of highly predictive QSAR models on a large comprehensive dataset of diverse OX1R ligands, which should prove useful for the discovery and design of new compounds targeting this receptor.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"2 1","pages":"Article 100040"},"PeriodicalIF":0.0,"publicationDate":"2023-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000404/pdfft?md5=43fcbc13b8cafbb292e0ad47efa38a38&pid=1-s2.0-S2949747723000404-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139024429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advances in Artificial Intelligence (AI)-assisted approaches in drug screening","authors":"Samvedna Singh , Himanshi Gupta , Priyanshu Sharma, Shakti Sahi","doi":"10.1016/j.aichem.2023.100039","DOIUrl":"10.1016/j.aichem.2023.100039","url":null,"abstract":"<div><p>Artificial intelligence (AI) is revolutionizing the current process of drug design and development, addressing the challenges encountered in its various stages. By utilizing AI, the efficiency of the process is significantly improved through enhanced precision, reduced time and cost, high-performance algorithms and AI-enabled computer-aided drug design (CADD). Effective drug screening techniques are crucial for identifying potential hit compounds from large volumes of data in compound repositories. The inclusion of AI in drug discovery, including the screening of hit compounds and lead molecules, has proven to be more effective than traditional in vitro screening assays. This article reviews the advancements in drug screening methods achieved through AI-enhanced applications, machine learning (ML), and deep learning (DL) algorithms. It specifically focuses on AI applications in the drug discovery phase, exploring screening strategies and lead optimization techniques such as Quantitative structure-activity relationship (QSAR) modeling, pharmacophore modeling, de novo drug designing, and high-throughput virtual screening. 
Valuable insights into different aspects of the drug screening process are discussed, highlighting the role of AI-based tools, pipelines, and case studies in simplifying the complexities associated with drug discovery.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"2 1","pages":"Article 100039"},"PeriodicalIF":0.0,"publicationDate":"2023-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000398/pdfft?md5=559cf38dcee28e753d9d412a799e3406&pid=1-s2.0-S2949747723000398-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139017387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AI's role in pharmaceuticals: Assisting drug design from protein interactions to drug development","authors":"Solene Bechelli , Jerome Delhommelle","doi":"10.1016/j.aichem.2023.100038","DOIUrl":"10.1016/j.aichem.2023.100038","url":null,"abstract":"<div><p>Developing new pharmaceutical compounds is a lengthy, costly, and intensive process. In recent years, the development of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) models has drawn considerable interest in drug discovery. In this review, we discuss recent advances in the field and show how these methods can be leveraged to assist each stage of the drug discovery process. After discussing recent technical progress in the encoding of chemical information via fingerprinting and the emergence of graph-based and generative models, we examine all types of interactions, including drug-target interactions, protein-protein interactions, protein-peptide interactions, and nucleic acid-based interactions. Furthermore, we discuss recent advances enabled by DL models for the prediction of ADMET (Absorption, Distribution, Metabolism, Elimination, Toxicity) properties and of solubility. 
We also review applications that have emerged in the past two years with the development of models, for instance, on SARS-CoV-2 inhibitors and highlight outstanding challenges.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"2 1","pages":"Article 100038"},"PeriodicalIF":0.0,"publicationDate":"2023-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000386/pdfft?md5=62dd3ca63edeb03fc522f745a7dce425&pid=1-s2.0-S2949747723000386-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139014049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Empowering research in chemistry and materials science through intelligent algorithms","authors":"Jinglong Lin , Fanyang Mo","doi":"10.1016/j.aichem.2023.100035","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100035","url":null,"abstract":"<div><p>In this review, we explore the integration of intelligent algorithms in chemistry and materials science. We begin by delineating the core principles of Machine Learning, Deep Learning, and optimization algorithms, highlighting their bespoke adaptation to these scientific domains. The focus then shifts to the critical processes of data management, including collection, refinement, and feature engineering, alongside strategies for efficient data mining from targeted databases and the literature. Subsequently, we present a concise overview of the diverse applications of these algorithms, emphasizing their transformative impact in both fields. Finally, this review explores the future prospects and challenges of these emerging algorithms.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"2 1","pages":"Article 100035"},"PeriodicalIF":0.0,"publicationDate":"2023-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000350/pdfft?md5=f73da155cd3c387fc723aa1852c198dc&pid=1-s2.0-S2949747723000350-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138838795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A machine learning-based high-precision density functional method for drug-like molecules","authors":"Jin Xiao , YiXiao Chen , LinFeng Zhang , Han Wang , Tong Zhu","doi":"10.1016/j.aichem.2023.100037","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100037","url":null,"abstract":"<div><p>In computer-aided drug discovery, accurately determining the structure and properties of drug-like molecules is of utmost importance. This necessitates the use of precise and efficient electronic structure methods. Here, we developed two deep learning-based density functional methods, namely DeePHF and DeePKS, specifically tailored for drug-like molecules. Notably, DeePKS incorporates self-consistency into its framework. With a limited dataset labelled at the CCSD(T)/def2-TZVP level, both models have been able to achieve chemical accuracy in calculating molecular energies and have demonstrated excellent transferability. We anticipate that further advancements in this field will lead to the development of high-quality density functional methods designed specifically for drug discovery purposes. 
This research showcases the capabilities of deep learning approaches in simplifying the construction complexity associated with traditional DFT methods.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"2 1","pages":"Article 100037"},"PeriodicalIF":0.0,"publicationDate":"2023-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000374/pdfft?md5=75400cd611ac51291405e572faae390a&pid=1-s2.0-S2949747723000374-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138769494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combining state-of-the-art quantum chemistry and machine learning make gold standard potential energy surfaces accessible for medium-sized molecules","authors":"Apurba Nandi , Péter R. Nagy","doi":"10.1016/j.aichem.2023.100036","DOIUrl":"10.1016/j.aichem.2023.100036","url":null,"abstract":"<div><p>Developing full-dimensional machine-learned potentials with the current “gold-standard” coupled-cluster (CC) level is challenging for medium-sized molecules due to the high computational cost. Consequently, researchers are often bound to use lower-level electronic structure methods such as density functional theory or second-order Møller–Plesset perturbation theory (MP2). Here, we demonstrate on a representative example that gold-standard potentials can now be effectively constructed for molecules of 15 atoms using off-the-shelf hardware. This is achieved by accelerating the CCSD(T) computations via the accurate and cost-effective frozen natural orbital (FNO) approach. The Δ-machine learning (Δ-ML) approach is employed with the use of permutationally invariant polynomials to fit a full-dimensional potential energy surface of the acetylacetone molecule, but any other effective descriptor and ML approach can similarly benefit from the accelerated data generation proposed here. Our benchmarks for the global minima, H-transfer TS, and many high-lying configurations show the excellent agreement of FNO-CCSD(T) results with conventional CCSD(T) while achieving a significant time advantage of about a factor of 30–40. The obtained Δ-ML PES shows high fidelity from multiple perspectives including energetic, structural, and vibrational properties. We obtain the symmetric double well H-transfer barrier of 3.15 kcal/mol in excellent agreement with the direct FNO-CCSD(T) barrier of 3.11 kcal/mol as well as with the benchmark CCSD(F12*)(T+)/CBS value of 3.21 kcal/mol. 
Furthermore, the tunneling splitting due to H-atom transfer is calculated using a 1D double-well potential, providing improved estimates over previous ones obtained using an MP2-based PES. The methodology introduced here represents a significant advancement in the efficient and precise construction of potentials at the CCSD(T) level for molecules above the current limit of 15 atoms.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"2 1","pages":"Article 100036"},"PeriodicalIF":0.0,"publicationDate":"2023-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000362/pdfft?md5=c6666f5fcbc3a2bf27c6aae23a604aaf&pid=1-s2.0-S2949747723000362-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138991782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effects of aleatoric and epistemic errors in reference data on the learnability and quality of NN-based potential energy surfaces","authors":"Sugata Goswami , Silvan Käser , Raymond J. Bemish , Markus Meuwly","doi":"10.1016/j.aichem.2023.100033","DOIUrl":"10.1016/j.aichem.2023.100033","url":null,"abstract":"<div><p>The effect of noise in the input data for learning potential energy surfaces (PESs) based on neural networks for chemical applications is assessed. Noise in energies and forces can result from aleatoric and epistemic errors in the quantum chemical reference calculations. Statistical (aleatoric) noise arises, for example, due to the need to set convergence thresholds in the self-consistent field (SCF) iterations, whereas systematic (epistemic) noise is due to, <em>inter alia</em>, particular choices of basis sets in the calculations. The two molecules considered here as proxies are H<sub>2</sub>CO and HONO, which are examples of single- and multi-reference problems, respectively, for geometries around the minimum energy structure. For H<sub>2</sub>CO it is found that adding noise to energies and forces with magnitudes representative of single-point calculations does not deteriorate the quality of the final PESs whereas increasing the noise level commensurate with electronic structure calculations for more complicated, e.g. metal-containing, systems is expected to have a more notable effect. On the other hand, for HONO which requires a multi-reference treatment, a clear correlation between model quality and the degree of multi-reference character as measured by the <em>T</em><sub>1</sub> amplitude is found. 
It is concluded that for chemically “simple” cases the effect of aleatoric and epistemic errors is manageable without evident deterioration of the trained model, but more care needs to be exercised for situations in which multi-reference effects are present.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"2 1","pages":"Article 100033"},"PeriodicalIF":0.0,"publicationDate":"2023-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000337/pdfft?md5=391098ccf3759b129948054b61d9af08&pid=1-s2.0-S2949747723000337-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138611496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}