{"title":"Mechanistic Study of Protein Interaction with Natto Inhibitory Peptides Targeting Xanthine Oxidase: Insights from Machine Learning and Molecular Dynamics Simulations.","authors":"Minghao Liu, Kaiyu Wang, Yan Zhang, Xue Zhou, Wannan Li, Weiwei Han","doi":"10.1021/acs.jcim.5c00126","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c00126","url":null,"abstract":"<p><p>Bioactive peptides from food sources offer a safe and biocompatible approach to enzyme inhibition, with potential applications in managing metabolic disorders such as hyperuricemia and gout, conditions linked to excessive xanthine oxidase activity. Using a machine learning-based screening approach inspired by the bioactivity of natto, two peptides, ECFK and FECK, were identified from the <i>Bacillus subtilis</i> proteome and validated as xanthine oxidase inhibitors with IC<sub>50</sub> values of 37.36 and 71.57 mM, respectively. Further experiments confirmed their safety through cytotoxicity assays, and electronic tongue analysis demonstrated their mild sensory properties, supporting their edibility. Molecular dynamics simulations revealed that these peptides stabilize critical enzyme regions, with ECFK showing a higher dissociation energy barrier (52.08 kcal/mol) than FECK (46.39 kcal/mol), indicating strong, stable interactions. 
This study highlights food-derived peptides as safe and natural inhibitors of xanthine oxidase, offering promising therapeutic potential for metabolic disorder management.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143690479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SynTemp: Efficient Extraction of Graph-Based Reaction Rules from Large-Scale Reaction Databases.","authors":"Tieu-Long Phan, Klaus Weinbauer, Marcos E González Laffitte, Yingjie Pan, Daniel Merkle, Jakob L Andersen, Rolf Fagerberg, Christoph Flamm, Peter F Stadler","doi":"10.1021/acs.jcim.4c01795","DOIUrl":"10.1021/acs.jcim.4c01795","url":null,"abstract":"<p><p>Reaction templates are graphs that represent the reaction center as well as the surrounding context in order to specify salient features of chemical reactions. They are subgraphs of <i>imaginary transition states</i>, which are equivalent to double pushout graph rewriting rules and thus can be applied directly to predict reaction outcomes at the structural formula level. We introduce here SynTemp, a framework designed to extract and hierarchically cluster reaction templates from large-scale reaction data repositories. Rule inference is implemented as a robust graph-theoretic approach, which first computes an atom-atom mapping (AAM) as a consensus over partial predictions from multiple state-of-the-art tools and then augments the raw AAM by mechanistically relevant hydrogen atoms and extracts the reaction center extended by relevant context. SynTemp achieves an exceptional accuracy of 99.5% and a success rate of 71.23% in obtaining AAMs on the <i>chemical reaction dataset</i>. 
Hierarchical clustering of the extended reaction centers based on topological features results in a library of 311 transformation rules explaining 86% of the reaction dataset.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":"2882-2896"},"PeriodicalIF":5.6,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11938280/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143522085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nonbonded Parameter Optimization Improving Simulation of Intrinsically Disordered Phosphoproteins.","authors":"Xinyao Zheng, Ge Song, Zhengxin Li, Bowen Duan, Bofan Zhu, Hai-Feng Chen","doi":"10.1021/acs.jcim.4c02248","DOIUrl":"10.1021/acs.jcim.4c02248","url":null,"abstract":"<p><p>Phosphorylated proteins play a crucial role in numerous cellular processes, acting as key regulators in signal transduction networks, cell expansion, and various biochemical reactions. Molecular dynamics (MD) simulations are powerful tools for exploring the dynamic conformations of phosphoproteins. However, conventional force fields often underestimate the radii of gyration (Rg) of phosphoproteins. To address this limitation, we reoptimized the parameters of the vdW radius for the oxygen atom and the charge for the phosphorus atom with a reweighting algorithm and thermodynamic integration, named phosRg. Validation on test systems of seven representative phosphoproteins demonstrates that phosRg has better agreement with experiment for Rg and chemical shift than does phosaa10. Furthermore, phosRg generated more extended conformations with fewer hydrophobic interactions and hydrogen bonds than phosaa10. At the same time, we found that the TIP4P-D solvent model is a suitable choice to be used with phosRg for the simulation of phosphorylated proteins. These results indicate that our dual-objective optimization strategy is powerful and necessary for improving the parameters. 
In summary, phosRg should also be used for simulations of other phosphorylated proteins.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":"2974-2984"},"PeriodicalIF":5.6,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143583828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BB-SAR: An Application for Data-driven Analysis and Rational Design of Medicinal Chemistry Series.","authors":"Florent Chevillard, Sandrine Hell, Elisa Liberatore","doi":"10.1021/acs.jcim.4c02121","DOIUrl":"10.1021/acs.jcim.4c02121","url":null,"abstract":"<p><p>In drug discovery, medicinal chemists face the challenge of generating and analyzing large data sets, often exceeding a thousand molecules and numerous physicochemical and biological properties. To address this, we introduced BB-SAR, an interpolative methodology that tackles both data complexity and interpretability, by breaking down molecules into their constituent building blocks (BBs). Establishing a direct correlation between molecules and their constituent BBs enables the association of these BBs with their respective biological and physicochemical properties. This facilitates more intuitive data analysis and enables the identification of critical trends between molecular features and their associated properties. While individual BBs rarely dictate property behavior, their combinations do. BB-SAR identifies impactful combinations for designing new, improved compounds. Additionally, it simplifies traditional medicinal chemistry analysis strategies and enhances the efficiency of drug discovery by providing a more inherent understanding of complex data sets within a concise framework.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":"2845-2853"},"PeriodicalIF":5.6,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143555326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bidirectional Long Short-Term Memory (BiLSTM) Neural Networks with Conjoint Fingerprints: Application in Predicting Skin-Sensitizing Agents in Natural Compounds.","authors":"Huynh Anh Duy, Tarapong Srisongkram","doi":"10.1021/acs.jcim.5c00032","DOIUrl":"10.1021/acs.jcim.5c00032","url":null,"abstract":"<p><p>Skin sensitization, or allergic contact dermatitis, represents a critical end point in toxicity assessment, with profound implications for drug safety and regulatory decision-making. This study aims to develop a robust deep-learning-based quantitative structure-activity relationship framework for accurately predicting skin sensitization toxicity, particularly in the context of natural-product-derived compounds. To achieve this, we explored advanced recurrent neural network architectures, including long short-term memory (LSTM), bidirectional LSTM (BiLSTM), gated recurrent unit (GRU), and bidirectional GRU, to model the intricate structure-toxicity relationships inherent in molecular compounds. We aim to optimize and improve predictive performance by training a cohort of 55 models with a diverse set of molecular fingerprints. Notably, the BiLSTM model, which integrates SMILES tokens with RDKit fingerprints, achieved superior predictive performance, underscoring its capability to effectively capture key molecular determinants of skin sensitization. An extensive applicability domain analysis coupled with an in-depth evaluation of feature importance provided new insights into the key molecular attributes that influence sensitization propensity. We further evaluated the BiLSTM model using a natural product data set, where it demonstrated exceptional generalization capabilities. The model achieved an accuracy of 86.5%, a Matthews correlation coefficient of 75.2%, a sensitivity of 100%, an area under the curve of 88%, a specificity of 75%, and an F1-score of 88.8%. 
Remarkably, the model effectively categorized natural products by discriminating sensitizing from non-sensitizing agents across various natural product subcategories. These results underscore the potential of BiLSTM-based models as powerful <i>in silico</i> tools for modern drug discovery efforts and regulatory assessments, especially in the field of natural products.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":"3035-3047"},"PeriodicalIF":5.6,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11938345/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143603033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ATP-Pred: Prediction of Protein-ATP Binding Residues via Fusion of Residue-Level Embeddings and Kolmogorov-Arnold Network.","authors":"Lingrong Zhang, Taigang Liu","doi":"10.1021/acs.jcim.5c00016","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c00016","url":null,"abstract":"<p><p>Accurately identifying protein-ATP binding residues is essential for understanding biological processes and designing drugs. However, current sequence-based methods have limitations, such as difficulties in extracting discriminative features and the need for more efficient algorithms. Additionally, methods based on multiple sequence alignments often face challenges in handling large-scale predictions. To address these issues, we developed ATP-Pred, a sequence-based method for predicting ATP-binding residues in proteins. This model applies transfer learning by using two recently developed pretrained protein language models, Ankh and ProstT5, to extract residue-level embeddings that capture protein functionality. ATP-Pred also integrates a CNN-BiLSTM network and a Kolmogorov-Arnold network to build the prediction model. To handle data imbalance, we introduced a weighted focal loss function. Experimental results on three independent test data sets showed that ATP-Pred outperforms most existing methods. Its generalizability was further validated on four protein-mononucleotide binding residue data sets, where it delivered promising results. 
These findings suggest that ATP-Pred is a robust and reliable predictor.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143676578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical Deep Potential with Structure Constraints for Efficient Coarse-Grained Modeling.","authors":"Qi Huang, Yedi Li, Lei Zhu, Wenjie Yu","doi":"10.1021/acs.jcim.4c02042","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02042","url":null,"abstract":"<p><p>Coarse-grained molecular dynamics is a powerful approach for simulating large-scale systems by reducing the number of degrees of freedom. Nonetheless, the development of accurate coarse-grained force fields remains challenging, particularly for complex systems, such as polymers. In this study, we introduce a novel framework, hierarchical deep potential with structure constraints (HDP-SC), designed to construct coarse-grained force fields for polymer materials. Our methodology integrates a prior energy term obtained through direct Boltzmann inversion with a deep neural network potential, which is trained using hierarchical bead environment descriptors. This framework facilitates the reproduction of structural distributions and the potential of mean force, thus enhancing the accuracy and efficiency of the coarse-grained model. We validate our approach using polystyrene systems, demonstrating that the HDP-SC model not only successfully reproduces the structural properties of these systems but also remains applicable at larger scales. 
Our findings underscore the promise of machine learning-based techniques in advancing the development of coarse-grained force fields for polymer materials.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143676601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ChiGNN: Interpretable Algorithm Framework of Molecular Chiral Knowledge-Embedding and Stereosensitive Property Prediction.","authors":"Jiaxin Yan, Haiyuan Wang, Wensheng Yang, Xiaonan Ma, Yajing Sun, Wenping Hu","doi":"10.1021/acs.jcim.4c02259","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02259","url":null,"abstract":"<p><p>Molecular chirality-related tasks have remained a notable challenge in materials machine learning (ML) due to the subtle spatial discrepancy between enantiomers. Designing appropriate steric molecular descriptions and embedding chiral knowledge are of great significance for improving the accuracy and interpretability of ML models. In this work, we propose a state-of-the-art deep learning framework, Chiral Graph Neural Network, which can effectively incorporate chiral physicochemical knowledge via Trinity Graph and stereosensitive Message Aggregation encoding. Combined with the quantile regression technique, the accuracy of the chiral chromatographic retention time prediction model outperformed the existing records. Accounting for the inherent merits of this framework, we have customized the Trinity Mask and Contribution Splitting techniques to enable a multilevel interpretation of the model's decision mechanism at atomic, functional group, and molecular hierarchy levels. This interpretation has both scientific and practical implications for the understanding of chiral chromatographic separation and the selection of chromatographic stationary phases. 
Moreover, the proposed chiral knowledge embedding and interpretable deep learning framework, together with the stereomolecular representation, chiral knowledge embedding method, and multilevel interpretation technique within it, also provide an extensible template and precedent for future chirality-related or stereosensitive ML tasks.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143668479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DeepMFFGO: A Protein Function Prediction Method for Large-Scale Multifeature Fusion.","authors":"Jingfu Wang, Jiaying Chen, Yue Hu, Chaolin Song, Xinhui Li, Yurong Qian, Lei Deng","doi":"10.1021/acs.jcim.5c00062","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c00062","url":null,"abstract":"<p><p>Protein functional studies are crucial in the fields of drug target discovery and drug design. However, the existing methods have significant bottlenecks in utilizing multisource data fusion and Gene Ontology (GO) hierarchy. To this end, this study innovatively proposes the DeepMFFGO model designed for protein function prediction under large-scale multifeature fusion. A fine-tuning strategy using intermediate-level feature selection is proposed to reduce redundancy in protein sequences and mitigate distortion of the top-level features. A hierarchical progressive fusion structure is designed to explore feature connections, optimize complementarity through dynamic weight allocation, and reduce redundant interference. On the CAFA3 data set, the <i>F</i><sub>max</sub> values of the DeepMFFGO model on the MF, BP, and CC ontologies reach 0.702, 0.599, and 0.704, respectively, which are improved by 4.2%, 2.4%, and 0.07%, respectively, compared with state-of-the-art multisource methods.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143672911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrative Computational Analysis of Common EXO5 Haplotypes: Impact on Protein Dynamics, Genome Stability, and Cancer Progression.","authors":"Fabio Mazza, Davide Dalfovo, Alessio Bartocci, Gianluca Lattanzi, Alessandro Romanel","doi":"10.1021/acs.jcim.5c00067","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c00067","url":null,"abstract":"<p><p>Understanding the impact of common germline variants on protein structure, function, and disease progression is crucial in cancer research. This study presents a comprehensive analysis of the <i>EXO5</i> gene, which encodes a DNA exonuclease involved in DNA repair that was previously associated with cancer susceptibility. We employed an integrated approach combining genomic and clinical data analysis, deep learning variant effect prediction, and molecular dynamics (MD) simulations to investigate the effects of common <i>EXO5</i> haplotypes on protein structure, dynamics, and cancer outcomes. We characterized the haplotype structure of <i>EXO5</i> across diverse human populations, identifying five common haplotypes, and studied their impact on the EXO5 protein. Extensive, all-atom MD simulations revealed significant structural and dynamic differences among the EXO5 protein variants, particularly in their catalytic region. The L151P EXO5 protein variant exhibited the most substantial conformational changes, potentially disruptive for EXO5's function and nuclear localization. Analysis of The Cancer Genome Atlas data showed that cancer patients carrying L151P EXO5 had significantly shorter progression-free survival in prostate and pancreatic cancers and exhibited increased genomic instability. 
This study highlights the strength of our methodology in uncovering the effects of common genetic variants on protein function and their implications for disease outcomes.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143668480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}