{"title":"Robust design strategy using a scaffold based Turing machine model--- Application to PDI based dyes","authors":"Feng Wang , Vladislav Vasilyev","doi":"10.1016/j.aichem.2023.100023","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100023","url":null,"abstract":"<div><p>This study turns the design and screening of new compounds into integer number crunching on control arrays using a scaffold-based Turing machine model. If small organic fragments are stored in a fragment database (FDB) in which each fragment is labelled by an integer in an array, the position and frequency of the integer control how the fragment clicks onto a scaffold (template compound). This method can robustly screen a large number of candidate fragments for solar cells and other applications such as drug design with minimal human assistance. As a proof of concept, we consider terminal imide substituents on the core perylene diimide (PDI) to develop PDI derivatives capable of absorbing UV–vis light for solar cell applications. The time-dependent density functional theory (TD-DFT) method was employed in the calculations. When the imide substituents are electron donors such as azobenzene (PDI-7), they produce a larger bathochromic shift (Δλ<sub>max</sub>) from the core PDI band position. The UV–vis absorption transitions of these PDI derivatives have more charge transfer (CT) character, as the highest occupied molecular orbitals (HOMO) are located on the fragments rather than the core PDI region. 
Our study presents a robust and efficient screening and design strategy for high-performance organic dyes, and further research in PDI-based solar cell design will focus on promoting the HOMO to LUMO transitions of the optical spectra.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100023"},"PeriodicalIF":0.0,"publicationDate":"2023-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000234/pdfft?md5=b6b1b440208372f0df0d3764b52bd55d&pid=1-s2.0-S2949747723000234-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134657401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Metaheuristic optimisation of Gaussian process regression model hyperparameters: Insights from FEREBUS","authors":"Bienfait K. Isamura, Paul L.A. Popelier","doi":"10.1016/j.aichem.2023.100021","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100021","url":null,"abstract":"<div><p>FEREBUS is a Gaussian process regression (GPR) engine embedded in the large machinery of FFLUX, a novel machine learnt force field developed from scratch through several well-documented proof-of-concept studies. This package relies on the exploration and exploitation capabilities of metaheuristic algorithms (MAs) to carry out the global optimisation of GPR model hyperparameters (<span><math><mi>θ</mi></math></span>). However, because MAs employ different search mechanisms to scrutinise the hyperparameter space, their performance on a specific optimisation task can vary a lot from one technique to another. Herein, we report a series of carefully designed experiments aimed at evaluating the ability of ten metaheuristic algorithms to locate the optimal set of <span><math><mi>θ</mi></math></span> values. Selected optimisation techniques belong to four popular families of MAs, namely particle swarm optimisation (4), grey wolf optimisation (2), bat (2) and firefly (2) algorithms. Our calculations suggest that grey wolf optimisers (GWOs) achieve the best results on average. Furthermore, the RMSE(<span><math><mi>θ</mi></math></span>) cost function is confirmed to be an excellent guide for the selection of atomic GPR models. This work also briefly introduces an enhanced grey wolf optimiser called GWO-RUHL (Random Update of the Hierarchy Ladder), which accounts for the (so far omitted) natural desire of non-leader wolves to occupy high-ranked leadership positions in the pack. 
We demonstrate that GWO-RUHL achieves better results than the standard GWO in terms of both convergence speed and quality of solutions.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100021"},"PeriodicalIF":0.0,"publicationDate":"2023-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000210/pdfft?md5=b3d2985c50bf91347418f158a01005cc&pid=1-s2.0-S2949747723000210-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"92061992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine learning approaches for the identification of ligands of the autophagy marker LC3","authors":"Laurent Soulère, Yves Queneau","doi":"10.1016/j.aichem.2023.100022","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100022","url":null,"abstract":"<div><p>The LC3 proteins play a crucial role in autophagy by participating in the formation of the autophagosome. Modulation of autophagy by molecular interference with LC3 proteins could help to understand this complex fundamental biological process and how it is involved in several pathologies. Identifying new LC3 ligands is a useful contribution to this aim. In the present study, we created a PubChem library of 749 compounds having a structure based on the central scaffold of novobiocin, a reported LC3A ligand. A robust, rapid and exhaustive algorithm was used for docking each compound of this database as a ligand within the dihydronovobiocin binding site, providing a docking score. Remarkable reliability and consistency between docking scores and the reported binding efficiencies of known ligands were observed, validating the machine learning protocol used in this study. 
Investigation of the binding modes of the ligands with the best docking scores provides additional insights into the possible modes of action of the identified LC3 ligands.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100022"},"PeriodicalIF":0.0,"publicationDate":"2023-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000222/pdfft?md5=535de2ec95e92e677368af743f018ee2&pid=1-s2.0-S2949747723000222-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91987333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine learning small molecule properties in drug discovery","authors":"Nikolai Schapin , Maciej Majewski , Alejandro Varela-Rial , Carlos Arroniz , Gianni De Fabritiis","doi":"10.1016/j.aichem.2023.100020","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100020","url":null,"abstract":"<div><p>Machine learning (ML) is a promising approach for predicting small molecule properties in drug discovery. Here, we provide a comprehensive overview of various ML methods introduced for this purpose in recent years. We review a wide range of properties, including binding affinities, solubility, and ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity). We discuss existing popular datasets and molecular descriptors and embeddings, such as chemical fingerprints and graph-based neural networks. We also highlight the challenges of predicting and optimizing multiple properties during the hit-to-lead and lead optimization stages of drug discovery, and briefly explore possible multi-objective optimization techniques that can be used to balance diverse properties while optimizing lead candidates. Finally, techniques to provide an understanding of model predictions, especially for critical decision-making in drug discovery, are assessed. Overall, this review provides insights into the landscape of ML models for small molecule property predictions in drug discovery. So far, there are multiple diverse approaches, but their performances are often comparable. Neural networks, while more flexible, do not always outperform simpler models. 
This shows that the availability of high-quality training data remains crucial for training accurate models, and that there is a need for standardized benchmarks, additional performance metrics, and best practices to enable richer comparisons between the many techniques and models.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100020"},"PeriodicalIF":0.0,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000209/pdfft?md5=3bda0f36e8c7232bba9ee7512ab052fa&pid=1-s2.0-S2949747723000209-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91987331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An accurate full-dimensional interaction potential energy surface of CO2+N2 incorporating ∆-machine learning approach via permutation invariant polynomial-neural network","authors":"Jia Li, Jun Li","doi":"10.1016/j.aichem.2023.100019","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100019","url":null,"abstract":"<div><p>The interaction between CO<sub>2</sub> and N<sub>2</sub>, both essential components of the Earth’s atmosphere, plays a crucial role in investigating the greenhouse effect. In this work, we sampled 40,930 data points within the full-dimensional configuration space of CO<sub>2</sub> and N<sub>2</sub> and performed calculations at the explicitly correlated coupled cluster level with single, double, and perturbative triple excitations and the augmented correlation-consistent polarized valence triple-ζ basis set (CCSD(T)-F12a/AVTZ). To ensure computational accuracy while reducing computational costs, we employed the recently proposed Δ-machine learning (Δ-ML) method based on Permutation Invariant Polynomial-Neural Network (PIP-NN) for basis set superposition error (BSSE) correction. By leveraging the limited extrapolation capability of NN, efficient sampling was performed within the existing dataset, enabling the construction of the potential energy surface (PES) incorporating BSSE correction with only a small number of data points for BSSE calculations. A total of approximately 1100 data points were selected from the initial dataset to construct a BSSE correction PES. Utilizing this correction PES, BSSE predictions were carried out for all remaining data points, resulting in the successful development of a high-precision full-dimensional PES with BSSE correction for the CO<sub>2</sub> + N<sub>2</sub> system. 
The PIP-NN based Δ-ML method significantly reduced the required BSSE calculations by approximately 97.2%, resulting in a final PES with a fitting error of merely 0.026 kcal/mol.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100019"},"PeriodicalIF":0.0,"publicationDate":"2023-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000192/pdfft?md5=4f0503b66010517c20f46da9e39da648&pid=1-s2.0-S2949747723000192-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"92061993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Balancing Wigner sampling and geometry interpolation for deep neural networks learning photochemical reactions","authors":"Li Wang, Zhendong Li, Jingbai Li","doi":"10.1016/j.aichem.2023.100018","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100018","url":null,"abstract":"<div><p>Machine learning photodynamics simulations are revolutionary tools to resolve elusive photochemical reaction mechanisms with time-dependent high-fidelity structure information. Despite the recent advances in neural network (NN) potentials, a general rule is still lacking for designing training data for learning photochemical reaction mechanisms with Wigner sampling and geometry interpolation. We present an in-depth investigation of the relationship between the accuracy of multilayer NNs and the combinations of training data based on Wigner sampling and geometry interpolation, using model photochemical reactions of the [3]-ladderdiene systems. The NNs trained with Wigner sampling data show underfitting, where the NN errors increase with the structural complexity and diversity. The NNs trained with composite Wigner sampling and geometry interpolation data show errors reduced by one order of magnitude, suggesting an essential role of geometry interpolation in helping NNs learn the potential energy surfaces. However, increasing the interpolation steps results in overfitting if the Wigner sampled configuration space is narrowed. Correlating the mean absolute errors (MAE) of the NN predicted energies for the sampled and out-of-sample structures shows an optimal combination ratio of 100:10 between the Wigner sampling structures and geometry interpolation steps for 1000 training data, where the MAE of the sampled structures achieves chemical accuracy while the MAE of the out-of-sample structures is minimized. 
The NNs trained with the optimally combined data can detect the out-of-sample structures in adaptive sampling with a positive correlation between the maximum standard deviation and MAE of the predicted energies. Collectively, our findings suggest a general rule for designing the training data for ML photodynamics.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100018"},"PeriodicalIF":0.0,"publicationDate":"2023-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949747723000180/pdfft?md5=2cdb8ecc2616508d396111c8c149852d&pid=1-s2.0-S2949747723000180-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"92047094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An improved artificial neural network fit of the ab initio potential energy surface points for HeH+ + H2 and its ensuing rigid rotors quantum dynamics","authors":"R. Biswas , F.A. Gianturco , K. Giri , L. González-Sánchez , U. Lourderaj , N. Sathyamurthy , E. Yurtsever","doi":"10.1016/j.aichem.2023.100017","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100017","url":null,"abstract":"<div><p>Artificial neural networks (ANN) have been shown for the last several years to be a versatile tool for fitting <em>ab initio</em> potential energy surfaces. We have demonstrated recently how a 60-neuron ANN could successfully fit a four-dimensional <em>ab initio</em> potential energy surface for the rigid rotor HeH<sup>+</sup> - rigid rotor H<sub>2</sub> system with a root-mean-squared deviation (RMSD) of 35 cm<sup>−1</sup>. We show in the present study how a (40, 40) neural network with two hidden layers could achieve a better fit with an RMSD of 5 cm<sup>−1</sup>. Through a follow-up quantum dynamical study of HeH<sup>+</sup>(<span><math><msub><mrow><mi>j</mi></mrow><mrow><mn>1</mn></mrow></msub></math></span>)-H<sub>2</sub>(<span><math><msub><mrow><mi>j</mi></mrow><mrow><mn>2</mn></mrow></msub></math></span>) collisions, it is shown that the two fits lead to slightly different rotational excitation and de-excitation cross sections but are comparable to each other in terms of magnitude and dependence on the relative translational energy of the collision partners. When averaged over relative translational energy, the two sets of results lead to rate coefficients that are nearly indistinguishable at higher temperatures thus demonstrating the reliability of the ANN method for fitting <em>ab initio</em> potential energy surfaces. On the other hand, we also find that the de-excitation rate coefficients obtained using the two different ANN fits differ significantly from each other at low temperatures. 
The consequences of these findings are discussed in our conclusions.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100017"},"PeriodicalIF":0.0,"publicationDate":"2023-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49744254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intelligent vision for the detection of chemistry glassware toward AI robotic chemists","authors":"Xiaogang Cheng , Shiyuan Zhu , Zhaocheng Wang , Chenxin Wang , Xin Chen , Qin Zhu , Linghai Xie","doi":"10.1016/j.aichem.2023.100016","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100016","url":null,"abstract":"<div><p>One of the key steps toward an artificially intelligent (AI) robotic chemist is the introduction of machine vision for guiding experimental operations in the AI-redefined laboratory. A prerequisite for this goal is to develop and implement intelligent vision for the detection of chemistry glassware. Here, we report a computer vision method based on You only look once (YOLO) with a self-developed Chemical Vessel Identification Dataset (CViG) for the improvement of classification and recognition performance. The training dataset includes 4072 images collected in a real chemical laboratory. Three models, YOLOv5s, Slim-YOLOv5s and YOLOv7, were used to recognize seven types of glassware under different scenarios (recognition distance, light and dark, stationary and moving). The improved Slim-YOLOv5s exhibited better recognition ability in various scenes: the recognition accuracy for chemical vessels is improved by 1.51 % compared with YOLOv5s, and the size of the model is reduced from 14.4 MB to 11.0 MB. The mAP of Slim-YOLOv5s is similar to that of YOLOv7, which has the disadvantage of a large model size, suggesting that the improved Slim-YOLOv5s has clear advantages in terms of embedded requirements. 
This vision-assisted system capable of classifying chemical containers accurately in the scenarios of real-time chemical experiments will provide a good vision solution in the frontier fields of automated machine chemistry.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100016"},"PeriodicalIF":0.0,"publicationDate":"2023-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49744252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification of potential antiviral lead inhibitors against SARS-CoV-2 main protease: Structure-guided virtual screening, docking, ADME, and MD Simulation based approach","authors":"Goverdhan Lanka , Revanth Bathula , Balaram Ghosh , Sarita Rajender Potlapally","doi":"10.1016/j.aichem.2023.100015","DOIUrl":"https://doi.org/10.1016/j.aichem.2023.100015","url":null,"abstract":"<div><p>The novel coronavirus disease (COVID-19), caused by the new virus strain SARS-CoV-2, emerged in December 2019 as a deadly pandemic that affected millions of people worldwide. Factors such as lack of effective drugs, vaccine resistance, gene mutations, and cost of repurposed drugs demand new potential inhibitors. The main protease (Mpro) of SARS-CoV-2 has a key role in viral replication and transcription and is considered a drug target for new lead identification. In the present work, a workflow combining structure-based virtual screening, docking, MM/GBSA, AutoDock, ADME, and MD simulation-based optimization was proposed for the identification of new potential inhibitors against Mpro of SARS-CoV-2. The ligand molecules M1, M3, and M6 were identified as potential leads from lead optimization. Induced fit docking was performed for the identification of the best poses of lead molecules. The best docked poses of potential leads M1 and M3 were subjected to 100 ns MD simulations for the evaluation of stability and interaction analysis in the Mpro active site. The structures of the top two leads M1 and M3 were optimized based on MD simulation conformational changes and isostere scanning, yielding the new leads M7 and M8. The MD simulation trajectories (RMSD, RMSF, protein-ligand and ligand-protein interaction plots, and ligand torsion profiles) were analyzed for stability interpretation. The docked complexes of M7 and M8 with Mpro exhibited equilibrated and converged plots in 100 ns simulations. The lead molecules M1, M3, M7, and M8 were identified as potential SARS-CoV-2 inhibitors for COVID-19. 
A comparative docking study was carried out using FDA-approved drugs to support the potential binding affinities of newly identified lead inhibitors.</p></div>","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"1 2","pages":"Article 100015"},"PeriodicalIF":0.0,"publicationDate":"2023-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49744906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification of potential antiviral lead inhibitors against SARS-CoV-2 main protease: Structure-guided virtual screening, docking, ADME, and MD Simulation based approach","authors":"G. Lanka, R. Bathula, B. Ghosh, S. R. Potlapally","doi":"10.2139/ssrn.4457340","DOIUrl":"https://doi.org/10.2139/ssrn.4457340","url":null,"abstract":"","PeriodicalId":72302,"journal":{"name":"Artificial intelligence chemistry","volume":"48 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139344913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}