Anu Nagarajan, Katherine Amberg-Johnson, Evan Paull, Kunling Huang, Phani Ghanakota, Asela Chandrasinghe, Jackson Chief Elk, Jared M Sampson, Lingle Wang, Robert Abel, Steven K Albanese
{"title":"Predicting Resistance to Small Molecule Kinase Inhibitors.","authors":"Anu Nagarajan, Katherine Amberg-Johnson, Evan Paull, Kunling Huang, Phani Ghanakota, Asela Chandrasinghe, Jackson Chief Elk, Jared M Sampson, Lingle Wang, Robert Abel, Steven K Albanese","doi":"10.1021/acs.jcim.4c02313","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02313","url":null,"abstract":"<p><p>Drug resistance is a critical challenge in treating diseases like cancer and infectious disease. This study presents a novel computational workflow for predicting on-target resistance mutations to small molecule inhibitors (SMIs). The approach integrates genetic models with alchemical free energy perturbation (FEP+) calculations to identify likely resistance mutations. Specifically, a genetic model, RECODE, leverages cancer-specific mutation patterns to prioritize probable amino acid changes. Physics-based calculations assess the impact of these mutations on protein stability, endogenous substrate binding, and inhibitor binding. We apply this approach retrospectively to gefitinib and osimertinib, two clinical epidermal growth factor receptor (EGFR) inhibitors used to treat nonsmall cell lung cancer (NSCLC). Among hundreds of possible mutations, the pipeline accurately predicted 4 out of 11 and 7 out of 19 known binding site mutations for gefitinib and osimertinib, respectively, including the clinically relevant T790M and C797S resistance mutations. This study demonstrates the potential of integrating genetic models and physics-based calculations to predict SMI resistance mutations. 
This approach can be applied to other kinases and target classes, potentially enabling the design of next-generation inhibitors with improved durability of response in patients.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143466459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bruno Di Geronimo, Špela Mandl, Santiago Alonso-Gil, Bojan Žagrović, Gilbert Reibnegger, Christoph Nusshold, Pedro A Sánchez-Murcia
{"title":"Digging out the Molecular Connections between the Catalytic Mechanism of Human Lysosomal α-Mannosidase and Its Pathophysiology.","authors":"Bruno Di Geronimo, Špela Mandl, Santiago Alonso-Gil, Bojan Žagrović, Gilbert Reibnegger, Christoph Nusshold, Pedro A Sánchez-Murcia","doi":"10.1021/acs.jcim.4c02229","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02229","url":null,"abstract":"<p><p>Human lysosomal α-mannosidase (hLAMAN) is a paradigmatic example of how a few missense mutations can critically affect normal catabolism in the lysosome and cause the severe condition named α-mannosidosis. Here, using extensive quantum mechanical/molecular mechanical metadynamics calculations, we show how four reported pathological orthosteric and allosteric single-point mutations alter substrate puckering in the Michaelis complex and how the D74E mutation doubles the energy barrier of the rate-limiting step compared to the wild-type enzyme.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143456232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bowen Li, Jin Xiao, Ya Gao, John Z H Zhang, Tong Zhu
{"title":"Transition State Searching Accelerated by Neural Network Potential.","authors":"Bowen Li, Jin Xiao, Ya Gao, John Z H Zhang, Tong Zhu","doi":"10.1021/acs.jcim.4c01714","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c01714","url":null,"abstract":"<p><p>Understanding transition states is pivotal in the design of efficient chemical processes and catalysts. However, identifying transition states is challenging due to the resource-intensive and iterative nature of current computational methods. This study integrates neural network potentials with physical models to enhance transition state prediction. Different neural network potentials and transition-state-locating algorithms are benchmarked. By combining NequIP with the energy-weighted Climbing Image-Nudged Elastic Band (EW-CI-NEB) method, we achieved highly accurate transition state predictions, significantly surpassing semiempirical methods in accuracy and greatly outpacing density functional theory in efficiency. Additionally, the transferability of the model was evaluated using a NequIP model trained on a refined subset of the dataset, and the model's performance was further improved through active learning. This method can directly search for transition states in given reactions or serve as an efficient tool for generating initial guesses of transition state structures, significantly reducing manual effort.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143466464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Elton J F Chaves, João Sartori, Whendel M Santos, Carlos H B Cruz, Emmanuel N Mhrous, Manassés F Nacimento-Filho, Matheus V F Ferraz, Roberto D Lins
{"title":"Estimating Absolute Protein-Protein Binding Free Energies by a Super Learner Model.","authors":"Elton J F Chaves, João Sartori, Whendel M Santos, Carlos H B Cruz, Emmanuel N Mhrous, Manassés F Nacimento-Filho, Matheus V F Ferraz, Roberto D Lins","doi":"10.1021/acs.jcim.4c01641","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c01641","url":null,"abstract":"<p><p>Protein-protein binding is central to most biochemical processes of all living beings. Its importance underlies mechanisms ranging from cell interactions to metabolic control, but also to <i>ex vivo</i> biotechnology, such as the development of therapeutic monoclonal antibodies, the engineering of enzymes for industrial biocatalysis, the development of biosensors for disease detection, and the assembly of artificial protein complexes for drug screening. Therefore, predicting the strength of their association allows for understanding the molecular mechanisms and ultimately controlling them. We devised a machine learning ensemble model that uses Rosetta-based quantities to predict binding free energies of protein-protein complexes with accuracy rivaling both computationally demanding methods and currently available ML/DL tools. 
The method was implemented as a Python pipeline named PBEE (Protein Binding Energy Estimator), which allows rapid calculation of the absolute binding free energies of protein complexes from their PDB coordinates.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143456233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IRPCA: An Interpretable Robust Principal Component Analysis Framework for Inferring miRNA-Drug Associations.","authors":"Yunyin Li, Shudong Wang, Yuanyuan Zhang, Chuanru Ren, Tiyao Liu, Yingye Liu, Shanchen Pang","doi":"10.1021/acs.jcim.4c02385","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02385","url":null,"abstract":"<p><p>Recent evidence indicates that microribonucleic acids (miRNAs) are crucial in modulating drug sensitivity by orchestrating the expression of genes involved in drug metabolism and its pharmacological effects. Existing predictive methods struggle to extract features related to miRNAs and drugs, often overlooking the significance of data noise and the limitations of using a single similarity measure. To address these limitations, we propose an interpretable robust principal component analysis framework (IRPCA). IRPCA enhances the robustness of the model by employing a nonconvex low-rank approximation, thereby offering greater flexibility. Interpretability is ensured by analyzing low-rank matrix decomposition, which clarifies how miRNAs interact with drugs. Gaussian interaction profile kernel (GIPK) similarities are introduced to compute integrated similarities between miRNAs and drugs, addressing the issue of the single similarity feature. IRPCA is subsequently utilized to extract pertinent features, and a fully connected neural network is employed to generate the ultimate prediction scores. To assess the efficacy of IRPCA, we performed 5-fold cross-validation (CV), in which IRPCA outperformed other leading methods, achieving the highest area under the curve (AUC) value of 0.9653. 
Case studies provide further evidence supporting the efficacy of IRPCA.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143466450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Andreas Vitalis, Steffen Winkler, Yang Zhang, Julian Widmer, Amedeo Caflisch
{"title":"A FAIR-Compliant Management Solution for Molecular Simulation Trajectories.","authors":"Andreas Vitalis, Steffen Winkler, Yang Zhang, Julian Widmer, Amedeo Caflisch","doi":"10.1021/acs.jcim.4c01301","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c01301","url":null,"abstract":"<p><p>Simulation studies of molecules primarily produce data that represent the configuration of the system as a function of the progress variable, usually time. Because of the high-dimensional nature of these data, which grow very quickly, compromises are often necessary and achieved by storing only a subset of the system's components, for example, stripping solvent, and by restricting the time resolution to a scale significantly coarser than the basic time step of the simulation. The resultant trajectories thus describe the essentially stochastic evolution of the molecules of interest. Maintaining their interpretability through metadata is of interest not only because they can aid researchers interested in specific systems but also for reproducibility studies and model refinement. Here, we introduce a standard for the storage of data created by molecular simulations that improves compliance with the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. We describe a solution conceived in PostgreSQL, along with reference implementations, that provides stringent links between metadata and raw data, which is a major weakness of the established file formats used for storing these data. A possible structure for the logic of SQL queries is included along with salient performance testing. 
To close, we suggest that a PostgreSQL-based storage of simulation data, in particular when coupled to a visual user interface, can improve the FAIR compliance of molecular simulation data at all levels of visibility, and a prototype solution for accomplishing this is presented.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143466476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-Throughput Optimization of a High-Pressure Catalytic Reaction.","authors":"Yusuke Tanabe, Hiroki Sugisawa, Tomohisa Miyazawa, Kazuhiro Hotta, Kazuya Shiratori, Tadahiro Fujitani","doi":"10.1021/acs.jcim.4c02273","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02273","url":null,"abstract":"<p><p>High-throughput optimization of a hydroformylation reaction using CO<sub>2</sub> instead of CO was performed through Bayesian optimization in combination with a high-throughput screening system. CO<sub>2</sub> and H<sub>2</sub> pressure as well as catalyst composition were efficiently optimized by transferring a surrogate model, constructed through catalyst composition optimization, for the comprehensive optimization of the entire search space. This method increased the aldehyde yield by 1.5 times compared to that reported in the literature, using a combination of small amounts of Rh and Ru catalysts with an ionic liquid containing chloride ions. The optimization was completed within 1-2 months through the combination of AI, robotics, and human expertise, demonstrating the feasibility of rapid catalyst development, even for high-pressure reactions.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143466485","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aoyun Geng, Zhenjie Luo, Aohan Li, Zilong Zhang, Quan Zou, Leyi Wei, Feifei Cui
{"title":"ACP-CLB: An Anticancer Peptide Prediction Model Based on Multichannel Discriminative Processing and Integration of Large Pretrained Protein Language Models.","authors":"Aoyun Geng, Zhenjie Luo, Aohan Li, Zilong Zhang, Quan Zou, Leyi Wei, Feifei Cui","doi":"10.1021/acs.jcim.4c02072","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02072","url":null,"abstract":"<p><strong>Motivation: </strong>Cancer affects millions globally, and as research advances, our understanding and treatment of cancer evolve. Compared to conventional treatments with significant side effects, anticancer peptides (ACPs) have gained considerable attention. Validating ACPs through wet-lab experiments is time-consuming and costly. However, numerous artificial intelligence methods are now used for ACP identification and classification. These methods typically apply a uniform strategy to all feature types, overlooking the potential benefits of more specialized processing for different feature types.</p><p><strong>Innovation: </strong>In this paper, we propose a framework based on multichannel discriminative processing, where different neural networks are applied to process various feature types, optimizing their respective feature vectors. Additionally, we leverage Large Pretrained Protein Language Models to capture deeper sequence features, further enhancing the model's performance.</p><p><strong>Contributions: </strong>To better validate the overall performance and generalization ability of the model, we compared it with state-of-the-art models using four different data sets (AntiCp2Main, AntiCp2 Alternate, ACP740, cACP-DeepGram). The results show significant improvements across most metrics. 
Additionally, our proposed framework better assists researchers in distinguishing and identifying ACPs and further validates the need for distinct processing methods for different feature types.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143447324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cheng-Che Chuang, Yu-Chen Liu, Wei-En Jhang, Sin-Siang Wei, Yu-Yen Ou
{"title":"RAG_MCNNIL6: A Retrieval-Augmented Multi-Window Convolutional Network for Accurate Prediction of IL-6 Inducing Epitopes.","authors":"Cheng-Che Chuang, Yu-Chen Liu, Wei-En Jhang, Sin-Siang Wei, Yu-Yen Ou","doi":"10.1021/acs.jcim.4c02144","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02144","url":null,"abstract":"<p><p>Interleukin-6 (IL-6) is a critical cytokine involved in immune regulation, inflammation, and the pathogenesis of various diseases, including autoimmune disorders, cancer, and the cytokine storm associated with severe COVID-19. Identifying IL-6 inducing epitopes, the short peptide fragments that trigger IL-6 production, is crucial for developing epitope-based vaccines and immunotherapies. However, traditional methods for epitope prediction often lack accuracy and efficiency. This study presents RAG_MCNNIL6, a novel deep learning framework that integrates Retrieval-augmented generation (RAG) with multiwindow convolutional neural networks (MCNNs) for accurate and rapid prediction of IL-6 inducing epitopes. RAG_MCNNIL6 leverages ProtTrans, a state-of-the-art pretrained protein language model, to generate rich embedding representations of peptide sequences. By incorporating a RAG-based similarity retrieval and embedding augmentation strategy, RAG_MCNNIL6 effectively captures both local and global sequence patterns relevant for IL-6 induction, significantly improving prediction performance compared to existing methods. 
We demonstrate the superior performance of RAG_MCNNIL6 on benchmark data sets, highlighting its potential for advancing research and therapeutic development for IL-6-mediated diseases.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143447352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yifan Wang, Lorenz Fleitmann, Lukas Raßpe-Lange, Niklas von der Assen, André Bardow, Kai Leonhard
{"title":"Fine-Tuning a Genetic Algorithm for CAMD: A Screening-Guided Warm Start.","authors":"Yifan Wang, Lorenz Fleitmann, Lukas Raßpe-Lange, Niklas von der Assen, André Bardow, Kai Leonhard","doi":"10.1021/acs.jcim.4c02038","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02038","url":null,"abstract":"<p><p>More sustainable chemical processes require the selection of suitable molecules, which can be supported by computer-aided molecular design (CAMD). CAMD often generates and evaluates molecular structures using genetic algorithms. However, genetic algorithms can suffer from slow convergence, and might yield suboptimal solutions. In response to these challenges, this work presents a method to fine-tune a genetic algorithm for CAMD. The proposed method builds on the COSMO-CAMD framework that utilizes a genetic algorithm for solving optimization-based molecular design problems and COSMO-RS for predicting physical properties of molecules. The key idea of the proposed method is to integrate results from a fast large-scale molecular screening into the molecular design framework through an automated fragmentation procedure. By generating a promising initial population and constructing a tailored fragment library, our method enables a targeted initialization of the genetic algorithm, referred to as warm-start. The proposed method is applied in two case studies to design solvents for extracting γ-valerolactone and phenol, respectively, from aqueous solutions. Compared to the benchmark method, the warm-started COSMO-CAMD framework achieves a 70% faster convergence, discovers 4-fold more top-performing candidate molecules, and identifies seven tailored molecular fragments, culminating in the discovery of two novel solvents specifically for the phenol case. The optimal solvent is found in all computational runs. 
Overall, the warm-started COSMO-CAMD framework significantly improves efficiency, effectiveness, and robustness of molecular design.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143447348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}