Journal of Chemical Information and Modeling: Latest Articles

Applying Molecular Dynamics Simulations to Unveil the Anisotropic Growth Mechanism of Gold Nanorods: Advances and Perspectives.
IF 5.6 · Zone 2 · Chemistry
Journal of Chemical Information and Modeling Pub Date : 2025-03-24 Epub Date: 2025-02-28 DOI: 10.1021/acs.jcim.4c02009
José Adriano da Silva, Paulo Augusto Netz, Mario Roberto Meneghetti
Abstract: The unique properties of gold nanorods (AuNRs), combined with their relatively straightforward production, good yields, and satisfactory control over size and shape, have sparked considerable interest in their potential applications. However, the mechanism behind these particles' formation continues to be a subject of significant interest and debate. Many experimental studies have been designed and undertaken to understand how AuNRs can be produced through seed-mediated methods. In recent years, quantum mechanics and molecular dynamics simulations have added to the repertoire of tools for investigating this topic. By comparing simulations with experimental data, essential aspects of the anisotropic growth of AuNRs can be revealed. This review presents an overview of the mechanisms proposed for creating AuNRs through seed-mediated methods, grounded in both experimental and simulation studies, and also highlights some remaining gaps in our understanding of the anisotropic growth process that need further exploration.
Pages: 2730-2740. PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11938275/pdf/
Citations: 0
DeepConf: Leveraging ANI-ML Potentials for Exploring Local Minima with Application to Bioactive Conformations.
IF 5.6 · Zone 2 · Chemistry
Journal of Chemical Information and Modeling Pub Date : 2025-03-24 Epub Date: 2025-03-03 DOI: 10.1021/acs.jcim.4c02053
Omer Tayfuroglu, Irem N Zengin, M Serdar Koca, Abdulkadir Kocak
Abstract: Here, we introduce a low energy conformer generation algorithm using ANI-ML potentials at the DFT accuracy and benchmark in reproducing bioactive conformations. We show that the method is efficient when the initial starting structure is far from equilibrium, when the ML potentials are stuck in nonsmooth regions, and when the quality of the conformers in a less conformer size is demanded. We specifically focus on conformations due to rotations around the single bonds. For the first time, we assess the performance of ANI-ML potentials using our conformer generation algorithm, DeepConf, in addition to previously reported Auto3D (J. Chem. Inf. Model. 2022, 62, 5373-5382) using the same potentials to reproduce bioactive conformations as well as providing a guideline for bioactive conformation evaluation processes. Our results show that the ANI-ML potentials can reproduce the bioactive conformations with mean value of the root-mean-square-deviation (RMSD) less than 0.5 Å, outperforming the limit of conventional methods. The code offers several features including but not limited to geometry optimization, fast conformer generations via single point energies (SPE), different minimization algorithms, different ML-potentials, or high-quality conformers in the smallest amount of ensemble sizes. It is available free of charge (documentation and test files) at https://github.com/otayfuroglu/DeepConf.
Pages: 2818-2833. PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11938341/pdf/
Citations: 0
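Illustration for the DeepConf entry above: the abstract reports a mean RMSD below 0.5 Å against bioactive conformations. A minimal Python sketch of that metric follows. It assumes the two conformers are already superimposed and share the same atom ordering; the function name and the coordinates are hypothetical and are not taken from the DeepConf code.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) tuples, in the input units.

    Assumption: the structures are already aligned; no superposition is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Made-up three-atom coordinates in Å, for illustration only.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
candidate = [(0.0, 0.0, 0.3), (1.5, 0.0, 0.0), (1.5, 1.5, -0.3)]

value = rmsd(reference, candidate)
print(f"RMSD = {value:.3f} Å, below 0.5 Å: {value < 0.5}")
```

In practice an optimal superposition (for example with the Kabsch algorithm) would normally precede this calculation.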
FusionESP: Improved Enzyme-Substrate Pair Prediction by Fusing Protein and Chemical Knowledge.
IF 5.6 · Zone 2 · Chemistry
Journal of Chemical Information and Modeling Pub Date : 2025-03-24 Epub Date: 2025-03-04 DOI: 10.1021/acs.jcim.4c02357
Zhenjiao Du, Weimin Fu, Xiaolong Guo, Doina Caragea, Yonghui Li
Abstract: To reduce the cost of the experimental characterization of the potential substrates for enzymes, machine learning prediction models offer an alternative solution. Pretrained language models, as powerful approaches for protein and molecule representation, have been employed in the development of enzyme-substrate prediction models, achieving promising performance. In addition to continuing improvements in language models, effectively fusing encoders to handle multimodal prediction tasks is critical for further enhancing model performance by using available representation methods. Here, we present FusionESP, a multimodal architecture that integrates protein and chemistry language models with two independent projection heads and a contrastive learning strategy for predicting enzyme-substrate pairs. Our best model achieved state-of-the-art performance with an accuracy of 94.77% on independent test data and exhibited better generalization capacity while requiring fewer computational resources and training data, compared to previous studies of a fine-tuned encoder or employing more encoders. It also confirmed our hypothesis that embeddings of positive pairs are closer to each other in a high-dimension space, while negative pairs exhibit the opposite trend. Our ablation studies showed that the projection heads played a crucial role in performance enhancement, while the contrastive learning strategy further improved the projection heads' capacity in classification tasks. The proposed architecture is expected to be further applied to enhance performance in additional multimodality prediction tasks in biology. A user-friendly web server of FusionESP is established and freely accessible at https://rqkjkgpsyu.us-east-1.awsapprunner.com/.
Pages: 2806-2817.
Citations: 0
Join Persistent Homology (JPH)-Based Machine Learning for Metalloprotein-Ligand Binding Affinity Prediction.
IF 5.6 · Zone 2 · Chemistry
Journal of Chemical Information and Modeling Pub Date : 2025-03-24 Epub Date: 2025-03-13 DOI: 10.1021/acs.jcim.4c02309
Yaxing Wang, Xiang Liu, Yipeng Zhang, Xiangjun Wang, Kelin Xia
Abstract: With the crucial role of metalloproteins in respiration, oxidative stress protection, photosynthesis, and drug metabolism, the design and discovery of drugs that can target metalloproteins are extremely important. Recently, enormous potential has been shown by topological data analysis (TDA) and TDA-based machine learning models in various steps of drug design and discovery. Here, we propose, for the first time, join persistent homology (JPH) and JPH-based machine learning models for metalloprotein-ligand binding affinity prediction. Mathematically, dramatically different from persistent homology and extended persistent homology, our JPH employs a set of filtration functions to generate a multistage filtration for the join of the original simplicial complex and a specially designed test simplicial complex. From the featurization perspective, our JPH-based molecular descriptors can provide a more comprehensive characterization of the intrinsic topological information of the data. Our JPH descriptors are combined with the gradient boosting tree (GBT) model for metalloprotein-ligand binding affinity prediction. The benchmark dataset for metalloprotein-ligand complexes from PDBbind-v2020 is employed for the validation and comparison of our model. It has been found that our JPH-GBT model can outperform all of the existing models, as far as we know. This demonstrates the great potential of our join persistent homology in the characterization of molecular structures and functions.
Pages: 2785-2793.
Citations: 0
Leveraging Transfer Learning for Predicting Protein-Small-Molecule Interaction Predictions.
IF 5.6 · Zone 2 · Chemistry
Journal of Chemical Information and Modeling Pub Date : 2025-03-24 DOI: 10.1021/acs.jcim.4c02256
Jian Wang, Nikolay V Dokholyan
Abstract: A complex web of intermolecular interactions defines and regulates biological processes. Understanding this web has been particularly challenging because of the sheer number of actors in biological systems: ∼10⁴ proteins in a typical human cell offer plausible 10⁸ interactions. This number grows rapidly if we consider metabolites, drugs, nutrients, and other biological molecules. The relative strength of interactions also critically affects these biological processes. However, the small and often incomplete data sets (10³-10⁴ protein-ligand interactions) traditionally used for binding affinity predictions limit the ability to capture the full complexity of these interactions. To overcome this challenge, we developed Yuel 2, a novel neural network-based approach that leverages transfer learning to address the limitations of small data sets. Yuel 2 is pretrained on a large-scale data set to learn intricate structural features and then fine-tuned on specialized data sets like PDBbind to enhance the predictive accuracy and robustness. We show that Yuel 2 predicts multiple binding affinity metrics, Kd, Ki, and IC50, between proteins and small molecules, offering a comprehensive representation of molecular interactions crucial for drug design and development.
Citations: 0
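Illustration for the Yuel 2 entry above: the abstract's estimate that about 10⁴ proteins offer on the order of 10⁸ plausible interactions can be checked with simple counting. The short Python sketch below is only a back-of-the-envelope reading of that sentence; the helper name is hypothetical, and treating every protein pair as a candidate interaction is an assumption.

```python
def pair_count(n):
    """Number of distinct unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


n_proteins = 10 ** 4  # order-of-magnitude figure quoted in the abstract

ordered_pairs = n_proteins ** 2            # counts (A, B) and (B, A) separately
unordered_pairs = pair_count(n_proteins)   # counts each pair once

print(f"ordered pairs:   {ordered_pairs:.1e}")
print(f"unordered pairs: {unordered_pairs:.1e}")
```

Either way of counting lands within a factor of two of 10⁸, which is consistent with the abstract's order-of-magnitude claim.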
FEP-SPell-ABFE: An Open-Source Automated Alchemical Absolute Binding Free-Energy Calculation Workflow for Drug Discovery.
IF 5.6 · Zone 2 · Chemistry
Journal of Chemical Information and Modeling Pub Date : 2025-03-24 Epub Date: 2025-03-03 DOI: 10.1021/acs.jcim.4c01986
Pengfei Li, Tingting Pu, Ye Mei
Abstract: The binding affinity between a drug molecule and its target, measured by the absolute binding free energy (ABFE), is a crucial factor in the lead discovery phase of drug development. Recent research has highlighted the potential of in silico ABFE predictions to directly aid drug development by allowing for the ranking and prioritization of promising candidates. This work introduces an open-source Python workflow called FEP-SPell-ABFE, designed to automate ABFE calculations with minimal user involvement. The workflow requires only three key inputs: a receptor protein structure in PDB format, candidate ligands in SDF format, and a configuration file (config.yaml) that governs both the workflow and molecular dynamics simulation parameters. It produces a ranked list of ligands along with their binding free energies in the comma-separated values (CSV) format. The workflow leverages SLURM (Simple Linux Utility for Resource Management) for automating task execution and resource allocation across the modules. A usage example and several benchmark systems for validation are provided. The FEP-SPell-ABFE workflow, along with a practical example, is publicly accessible on GitHub at https://github.com/freeenergylab/FEP-SPell-ABFE, distributed under the MIT License.
Pages: 2711-2721.
Citations: 0
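Illustration for the FEP-SPell-ABFE entry above: the abstract says the workflow emits a ranked list of ligands with their binding free energies as CSV. The Python sketch below shows what such a ranking step could look like in general. It is not the workflow's own code: the column names, the kcal/mol unit, the ligand labels, and the free-energy values are all invented for illustration.

```python
import csv
import io


def rank_ligands(results):
    """Sort (ligand, dG) pairs so the most negative free energy comes first."""
    return sorted(results, key=lambda item: item[1])


def to_csv(ranked):
    """Render ranked (ligand, dG) pairs as CSV text with a hypothetical header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["rank", "ligand", "dG_bind_kcal_per_mol"])
    for rank, (ligand, dg) in enumerate(ranked, start=1):
        writer.writerow([rank, ligand, f"{dg:.2f}"])
    return buffer.getvalue()


# Made-up values; real numbers would come from the free-energy calculations.
example = [("lig_A", -7.2), ("lig_B", -9.8), ("lig_C", -5.1)]
ranked = rank_ligands(example)
print(to_csv(ranked))
```

Sorting in ascending order puts the strongest predicted binder first, since a more negative binding free energy indicates tighter binding.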
Disentangling Folding from Energetic Traps in Simulations of Disordered Proteins.
IF 5.6 · Zone 2 · Chemistry
Journal of Chemical Information and Modeling Pub Date : 2025-03-24 Epub Date: 2025-03-05 DOI: 10.1021/acs.jcim.4c02005
Jeffrey M Lotthammer, Alex S Holehouse
Abstract: Protein conformational heterogeneity plays an essential role in a myriad of different biological processes. Extensive conformational heterogeneity is especially characteristic of intrinsically disordered proteins and protein regions (collectively IDRs), which lack a well-defined three-dimensional structure and instead rapidly exchange between a diverse ensemble of configurations. An emerging paradigm recognizes that the conformational biases encoded in IDR ensembles can play a central role in their biological function, necessitating understanding these sequence-ensemble relations. All-atom simulations have provided critical insight into our modern understanding of the solution behavior of IDRs. However, effectively exploring the accessible conformational space associated with large, heterogeneous ensembles is challenging. In particular, identifying poorly sampled or energetically trapped regions of disordered proteins in simulations often relies on qualitative assessment based on visual inspection of simulations and/or analysis data. These approaches, while convenient, run the risk of masking poorly sampled simulations. In this work, we present an algorithm for quantifying per-residue local conformational heterogeneity in protein simulations. Our work builds on prior work and compares the similarity between backbone dihedral angle distributions generated from molecular simulations in a limiting polymer model and across independent all-atom simulations. In this regime, the polymer model serves as a statistical reference model for extensive conformational heterogeneity in a real chain. Quantitative comparisons of probability vectors generated from these simulations reveal the extent of conformational sampling in a simulation, enabling us to distinguish between situations in which protein regions are well-sampled, poorly sampled, or folded. To demonstrate the effectiveness of this approach, we apply our algorithm to several toy, synthetic, and biological systems. Accurately assessing local conformational sampling in simulations of IDRs will help better quantify new enhanced sampling methods, ensure force field comparisons are equivalent, and provide confidence that conclusions drawn from simulations are robust.
Pages: 2897-2910.
Citations: 0
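Illustration for the disordered-protein entry above: the abstract describes comparing probability vectors built from backbone dihedral angle distributions. One way such a comparison could be sketched in Python is shown below. The abstract does not name the similarity measure, so the Hellinger distance used here is an assumption, as are the 12-bin grid, the function names, and the sample angles.

```python
import math


def dihedral_histogram(phi_psi, bins=12):
    """Normalized 2D histogram of (phi, psi) pairs in degrees, flattened to a vector."""
    if not phi_psi:
        raise ValueError("need at least one (phi, psi) sample")
    width = 360.0 / bins
    counts = [0] * (bins * bins)
    for phi, psi in phi_psi:
        # Shift angles from [-180, 180) to [0, 360) before binning.
        i = int(((phi + 180.0) % 360.0) // width)
        j = int(((psi + 180.0) % 360.0) // width)
        counts[i * bins + j] += 1
    total = sum(counts)
    return [c / total for c in counts]


def hellinger(p, q):
    """Hellinger distance: 0 for identical vectors, 1 for non-overlapping ones."""
    return math.sqrt(
        sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) / 2.0
    )


# Made-up samples for one residue in two independent simulations.
replica_1 = dihedral_histogram([(-60.0, -45.0)] * 10)
replica_2 = dihedral_histogram([(60.0, 45.0)] * 10)

print(hellinger(replica_1, replica_1))  # identical distributions
print(hellinger(replica_1, replica_2))  # no shared bins
```

A residue whose replicas disagree this strongly with each other would be a candidate for poor sampling, whereas agreement between replicas but disagreement with a reference polymer model would point toward local structure.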
Accelerated Hydration Site Localization and Thermodynamic Profiling.
IF 5.6 · Zone 2 · Chemistry
Journal of Chemical Information and Modeling Pub Date : 2025-03-24 Epub Date: 2025-02-28 DOI: 10.1021/acs.jcim.4c02349
Florian B Hinz, Matthew R Masters, Julia T Nguyen, Amr H Mahmoud, Markus A Lill
Abstract: Water plays a fundamental role in the structure and function of proteins and other biomolecules. The thermodynamic profile of water molecules surrounding a protein is critical for ligand recognition and binding. Therefore, identifying the location and thermodynamic properties of relevant water molecules is important for generating and optimizing lead compounds for affinity and selectivity for a given target. Computational methods have been developed to identify these hydration sites (HS), but are largely limited to simplified models that fail to capture multibody interactions or dynamics-based methods that rely on extensive sampling. Here, we present a method for fast and accurate localization and thermodynamic profiling of HS for protein structures. The method is based on a geometric deep neural network trained on a large, novel data set of explicit water molecular dynamics simulations. We confirm the accuracy and robustness of our model on experimental data and demonstrate its utility on several case studies.
Pages: 2794-2805. PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11938278/pdf/
Citations: 0
Enhancing Blood-Brain Barrier Penetration Prediction by Machine Learning-Based Integration of Novel and Existing, In Silico and Experimental Molecular Parameters from a Standardized Database.
IF 5.6 · Zone 2 · Chemistry
Journal of Chemical Information and Modeling Pub Date : 2025-03-24 Epub Date: 2025-03-04 DOI: 10.1021/acs.jcim.4c02212
Clemens P Spielvogel, Natalie Schindler, Christian Schröder, Sarah Luise Stellnberger, Wolfgang Wadsak, Markus Mitterhauser, Laszlo Papp, Marcus Hacker, Verena Pichler, Chrysoula Vraka
Abstract: Predicting blood-brain barrier (BBB) penetration is crucial for developing central nervous system (CNS) drugs, representing a significant hurdle in successful clinical phase I studies. One of the most valuable properties for this prediction is the polar surface area (PSA). However, molecular structures are missing geometric optimization, which, together with lack of standardization, leads to variations in calculation. Additionally, prediction rules have been established by combining different molecular properties such as the BBB score or CNS multiparameter optimization (CNS MPO). This study aims to create an approach for 3D PSA calculation, to directly apply this value in combination with a set of 23 other parameters in a novel machine learning (ML)-based scoring, and to further evaluate existing prediction models using a standardized database. We developed and analyzed a standardized data set derived from the same laboratory, encompassing 24 calculated and experimentally determined molecular parameters such as PSA from various models, HPLC log P values, and hydrogen bond characteristics for 154 radiolabeled molecules and licensed or well-characterized drugs. These molecules were classified into categories based on BBB penetration, nonpenetration, and interactions with efflux transporters. We supplemented these with a novel in silico 3D calculation of nonclassical PSA. Additionally, we have calculated published prediction rules based on this standardized and transparent database. Using these data, we trained various ML models within a 100-fold Monte Carlo cross-validation framework to derive a novel ML-based prediction score for BBB penetration and validated the three most used existing predictive rules. To interpret the influence of individual molecular parameters and different existing predictive rules, we employed explainable artificial intelligence methods including Shapley additive explanations (SHAP) and surrogate modeling. The ML approach outperformed existing scores for BBB penetration by applying a complex nonlinear integration of molecular properties, with the random forest model achieving the best performance for predicting binary BBB penetration (area under the receiver operating characteristic curve (AUC) 0.88, 95% confidence intervals: 0.87-0.90), and multiclass efflux transporter versus CNS-positive and CNS-negative prediction (AUC 0.82, 95% CI: 0.81-0.82). SHAP analysis revealed the multifactorial nature of the problem, highlighting the advantage of multivariate models over single predictive parameters. The ML model's superior predictive capability was demonstrated in comparison with existing scoring systems, like the CNS MPO (AUC 0.53), the CNS MPO Positron emission tomography (PET) (AUC 0.51), and BBB score (AUC 0.68) while also enabling the identification of efflux transporter substrates and inhibitors. Our integrated ML approach, combining experimental and in silico measurements with novel in silico methods based …
Pages: 2773-2784. PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11938273/pdf/
Citations: 0
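Illustration for the blood-brain barrier entry above: the abstract mentions a 100-fold Monte Carlo cross-validation over 154 molecules. The Python sketch below shows the splitting scheme only, i.e. repeated independent random train/test partitions. The 20% test fraction, the seed, and the function name are assumptions; the abstract does not state the split ratio, and no model is trained here.

```python
import random


def monte_carlo_splits(n_samples, n_repeats=100, test_fraction=0.2, seed=0):
    """Yield (train_indices, test_indices) for repeated random subsampling.

    Unlike k-fold cross-validation, each repeat draws a fresh random split,
    so a sample may appear in the test set of several repeats or of none.
    """
    rng = random.Random(seed)
    n_test = max(1, int(round(n_samples * test_fraction)))
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        yield shuffled[n_test:], shuffled[:n_test]


# 154 molecules and 100 repeats, as quoted in the abstract.
splits = list(monte_carlo_splits(154))
train, test = splits[0]
print(len(splits), len(train), len(test))
```

Fitting a model on each training part and scoring it on the matching test part would give 100 performance estimates, from which a mean and a confidence interval can be derived.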
From NMR to AI: Do We Need 1H NMR Experimental Spectra to Obtain High-Quality logD Prediction Models?
IF 5.6 · Zone 2 · Chemistry
Journal of Chemical Information and Modeling Pub Date : 2025-03-24 Epub Date: 2025-03-05 DOI: 10.1021/acs.jcim.4c02145
Arkadiusz Leniak, Wojciech Pietruś, Aleksandra Świderska, Rafał Kurczab
Abstract: This study presents a novel approach to 1H NMR-based machine learning (ML) models for predicting logD using computer-generated 1H NMR spectra. Building on our previous work, which integrated experimental 1H NMR data, this study addresses key limitations associated with experimental measurements, such as sample stability, solvent variability, and extensive processing, by replacing them with fully computational workflows. Benchmarking across various density functional theory (DFT) functionals and basis sets highlighted their limitations, with DFT-based models showing relatively high RMSE values (average CHI logD of 1.12, lowest at 0.96) and extensive computational demands, limiting their usefulness for large-scale predictions. In contrast, models trained on predicted 1H NMR spectra by NMRshiftDB2 and JEOL JASON achieved RMSE values as low as 0.76, compared to 0.88 for experimental spectra. Further analysis revealed that mixing experimental and predicted spectra did not enhance accuracy, underscoring the advantage of homogeneous datasets. Validation with external datasets confirmed the robustness of our models, showing comparable performance to commercial software like Instant JChem, thus underscoring the reliability of the proposed computational workflow. Additionally, using normalized RMSE (NRMSE) proved essential for consistent model evaluation across datasets with varying data scales. By eliminating the need for experimental input, this workflow offers a widely accessible, computationally efficient pipeline, setting a new standard for ML-driven chemical property predictions without experimental data constraints.
Pages: 2924-2939. PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11938277/pdf/
Citations: 0
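Illustration for the logD entry above: the abstract notes that normalized RMSE (NRMSE) was needed to compare models across datasets with different data scales. The Python sketch below shows one common convention, dividing RMSE by the range of the observed values. The abstract does not say which normalization the authors used, so that choice, the function names, and the numbers are assumptions.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and equal in length")
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def nrmse(y_true, y_pred):
    """RMSE divided by the range of the observed values (one common convention)."""
    spread = max(y_true) - min(y_true)
    if spread == 0:
        raise ValueError("observed values have zero range")
    return rmse(y_true, y_pred) / spread


# Made-up values on two different scales.
observed = [0.0, 1.0, 2.0, 4.0]
predicted = [0.5, 1.0, 1.5, 4.0]
observed_x10 = [10 * v for v in observed]
predicted_x10 = [10 * v for v in predicted]

print(rmse(observed, predicted), rmse(observed_x10, predicted_x10))
print(nrmse(observed, predicted), nrmse(observed_x10, predicted_x10))
```

The raw RMSE grows tenfold when the data are rescaled, while the range-normalized value stays the same, which is the property that makes NRMSE usable across datasets.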