{"title":"PoseidonQ: A Free Machine Learning Platform for the Development, Analysis, and Validation of Efficient and Portable QSAR Models for Drug Discovery.","authors":"Muzammil Kabier, Nicola Gambacorta, Fulvio Ciriaco, Fabrizio Mastrolorito, Sunil Kumar, Bijo Mathew, Orazio Nicolotti","doi":"10.1021/acs.jcim.4c02372","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02372","url":null,"abstract":"<p><p>The advent of powerful machine learning algorithms as well as the availability of high volumes of pharmacological data has given new fuel to QSAR, opening unprecedented options for deriving highly predictive models for assisting the rational design of new bioactive compounds, for screening and prioritizing large molecular libraries, and for repurposing new drugs toward new clinical uses. Here, we present PoseidonQ (an acronym for Personal Optimization Software for Efficient Implementation and Derivation of Online QSAR), a user-friendly software solution designed to simplify the derivation of QSAR models for drug design and discovery. PoseidonQ incorporates 22 machine learning algorithms, 17 types of molecular fingerprints, and 208 RDKit molecular descriptors and enables the quick derivation of both regression and classification models along with a calculated and easily interpretable applicability domain. Importantly, the platform is automatically linked to the latest version of the ChEMBL database, thus providing streamlined access to large amounts of curated bioactivity data. The user is also given the option of gathering high-quality experimental data based on customizable filtering settings. Notably, PoseidonQ facilitates the deployment of trained QSAR models as web-based applications through seamless integration with Streamlit Cloud and GitHub, empowering users to share, refine, and integrate models effortlessly. 
Interestingly, the translation of QSAR models into web-based applications makes them freely accessible, portable, and ready for screening large volumes of new data without limits. By unifying data preparation, model generation, and deployment into an intuitive workflow, PoseidonQ makes advanced QSAR modeling for drug design and discovery accessible to a wide audience of researchers irrespective of their skill levels. PoseidonQ bridges the gap between complex machine learning techniques and practical drug discovery applications, enhancing the efficiency, collaboration, and adoption of QSAR approaches in modern drug discovery programs. PoseidonQ is available for Windows and Linux (Ubuntu 22.04 distribution) operating systems and can be downloaded for free at https://github.com/Muzatheking12/PoseidonQ.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143809990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"af3cli: Streamlining AlphaFold3 Input Preparation.","authors":"Philipp Döpner, Stefan Kemnitz, Mark Doerr, Lukas Schulig","doi":"10.1021/acs.jcim.5c00276","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c00276","url":null,"abstract":"<p><p>With the release of AlphaFold3, modeling capabilities have expanded beyond protein structure prediction to embrace the inherent complexity of biomolecular systems, including nucleic acids, ions, small molecules, and their interactions. The increased complexity of these assemblies is reflected in the input file generation process, presenting a significant hurdle for researchers without advanced computational expertise. While AlphaFold Server comes with a user-friendly graphical user interface, it supports only a subset of the features of AlphaFold3. To address this, we present af3cli, an open-source tool designed to facilitate the generation of AlphaFold3 input files, specifically tailored to the standalone version of AlphaFold3 and its unrestricted functionality. Featuring a user-friendly command-line interface and an accompanying Python library, af3cli simplifies the input generation process while maintaining flexibility and customization, which makes af3cli especially useful for fast (automated) generation of a large number of input files since it enables direct incorporation of FASTA files, keeps track of IDs, and validates the JSON file. 
Through practical examples, we demonstrate its capabilities for constructing input data for diverse biological structures, ranging from simple proteins to complex systems, and demonstrate its seamless integration into both manual and automated workflows.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143809981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unified Deep Learning of Molecular and Protein Language Representations with T5ProtChem.","authors":"Thomas Kelly, Song Xia, Jieyu Lu, Yingkai Zhang","doi":"10.1021/acs.jcim.5c00051","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c00051","url":null,"abstract":"<p><p>Deep learning has revolutionized difficult tasks in chemistry and biology, yet existing language models often treat these domains separately, relying on concatenated architectures and independently pretrained weights. These approaches fail to fully exploit the shared atomic foundations of molecular and protein sequences. Here, we introduce T5ProtChem, a unified model based on the T5 architecture, designed to simultaneously process molecular and protein sequences. Using a new pretraining objective, ProtiSMILES, T5ProtChem bridges the molecular and protein domains, enabling efficient, generalizable protein-chemical modeling. The model achieves state-of-the-art performance in tasks such as binding affinity prediction and reaction prediction, while also performing strongly in protein function prediction. Additionally, it supports novel applications, including covalent binder classification and sequence-level adduct prediction. 
These results demonstrate the versatility of unified language models for drug discovery, protein engineering, and other interdisciplinary efforts in computational biology and chemistry.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143802015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Activity Regulation and Conformation Response of Janus Kinase 3 Mediated by Phosphorylation: Exploration from Correlation Network Analysis and Markov Model.","authors":"Jianzhong Chen, Jian Wang, Wanchun Yang, Lu Zhao, Jing Su","doi":"10.1021/acs.jcim.5c00096","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c00096","url":null,"abstract":"<p><p>The activity of the enzyme JAK3 is modulated by tyrosine phosphorylation, yet the underlying molecular details are not yet fully understood. In this study, we employed a GaMD trajectory-based Markov model and correlation network analysis (CNA) to investigate the impact of single phosphorylation (SP) at Y980 (pY980) and double phosphorylation (DP) at Y980/Y981 (pY980/pY981) on the conformational dynamics of JAK3 bound by inhibitors IZA and MI1. The Markov model analysis indicated that both SP and DP result in fewer conformational states and significantly influence the conformational dynamics of the P-loop, αC-helix, and loop1-loop3, while maintaining the hinge region's high rigidity. The CNA findings revealed that phosphorylation alters the communication network among different structural regions of JAK3, providing a rational explanation for how phosphorylation affects the conformational dynamics of the distant P-loop and loop1-loop3. Moreover, the conformational changes mediated by SP and DP further affect the interactions between the inhibitors and the hot spots (L828, V836, E903, Y904, L905, and L956) of JAK3. 
This work offers valuable theoretical insights into the molecular mechanisms that regulate JAK3 activity.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143809979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"QuantumBind-RBFE: Accurate Relative Binding Free Energy Calculations Using Neural Network Potentials.","authors":"Francesc Sabanés Zariquiey, Stephen E Farr, Stefan Doerr, Gianni De Fabritiis","doi":"10.1021/acs.jcim.5c00033","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c00033","url":null,"abstract":"<p><p>Accurate prediction of protein-ligand binding affinities is crucial in drug discovery, particularly during hit-to-lead and lead optimization phases; however, limitations in ligand force fields continue to impact prediction accuracy. In this work, we validate relative binding free energy (RBFE) accuracy using neural network potentials (NNPs) for the ligands. We utilize a novel NNP model, AceFF 1.0, based on the TensorNet architecture for small molecules that broadens the applicability to diverse drug-like compounds, including all important chemical elements and supporting charged molecules. Using established benchmarks, we show overall improved accuracy and correlation in binding affinity predictions compared with GAFF2 for molecular mechanics and ANI2-x for NNPs. We obtain slightly lower accuracy but comparable correlations relative to OPLS4. We also show that we can run the NNP simulations at a 2 fs time step, at least two times larger than previous NNP models, providing significant speed gains. The results show promise for further evolutions of free energy calculations using NNPs while demonstrating its practical use already with the current generation. 
The code and NNP model are publicly available for research use.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143802012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DirectMultiStep: Direct Route Generation for Multistep Retrosynthesis.","authors":"Yu Shee, Anton Morgunov, Haote Li, Victor S Batista","doi":"10.1021/acs.jcim.4c01982","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c01982","url":null,"abstract":"<p><p>Traditional computer-aided synthesis planning (CASP) methods rely on iterative single-step predictions, leading to exponential search space growth that limits efficiency and scalability. We introduce a series of transformer-based models that leverage a mixture of experts approach to directly generate multistep synthetic routes as a single string, conditionally predicting each transformation based on all preceding ones. Our DMS Explorer XL model, which requires only target compounds as input, outperforms state-of-the-art methods on the PaRoutes dataset with 1.9x and 3.1x improvements in Top-1 accuracy on the n<sub>1</sub> and n<sub>5</sub> test sets, respectively. Providing additional information, such as the desired number of steps and starting materials, enables both a reduction in model size and an increase in accuracy, highlighting the benefits of incorporating more constraints into the prediction process. The top-performing DMS-Flex (Duo) model scores 25-50% higher on Top-1 and Top-10 accuracies for both n<sub>1</sub> and n<sub>5</sub> sets. Additionally, our models successfully predict routes for the FDA-approved drugs not included in the training data, demonstrating strong generalization capabilities. 
While the limited diversity of the training set may affect performance on less common reaction types, our multistep-first approach presents a promising direction toward fully automated retrosynthetic planning.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143802011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nonequilibrium Self-Assembly Control by the Stochastic Landscape Method.","authors":"Michael Faran, Gili Bisker","doi":"10.1021/acs.jcim.4c02366","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02366","url":null,"abstract":"<p><p>Self-assembly of building blocks is a fundamental process in nanotechnology, materials science, and biological systems, offering pathways to the formation of complex and functional structures through local interactions. However, the lack of effective error correction mechanisms often limits the efficiency and precision of assembly, particularly in systems with strong binding energies. Inspired by cellular processes and stochastic resetting, we present a closed-loop feedback control method that employs transient modulations in interaction energies, mimicking, for instance, the global effect of pH changes as nonequilibrium drives to optimize assembly outcomes in real time. By leveraging the stochastic landscape method, a framework using energy trend-based segmentation to predict self-assembly behavior, our approach dynamically analyzes the system's state and energy trends to guide control actions. We show that the transient energy modulation during kinetic trapping conditions substantially enhances assembly yields and reduces assembly times across diverse scenarios. 
This strategy provides a broadly applicable, data-driven framework for optimizing nonequilibrium assembly processes, with potential implications for precision manufacturing and responsive materials design, while also advancing our understanding of controlled molecular assembly in biological and synthetic contexts.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143809988","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chemistry Meets Biology: Cross-Disciplinary Evaluation of Drug Discovery Databases.","authors":"Andrii V Kyrsanov, Ihor V Tkachenko, Nataliya A Shtil, Konstantin S Gavrilenko, Olexandr Ye Pashenko, Duncan B Judd, Serhiy V Ryabukhin, Dmytro M Volochnyuk","doi":"10.1021/acs.jcim.5c00021","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c00021","url":null,"abstract":"<p><p>This study evaluates the utility of key biochemical databases in drug discovery through a crowd-reviewed analysis, focusing on their application in medicinal chemistry and drug discovery. The findings reveal substantial deviations in data accessibility and reliability across these databases, with specific challenges in integrating chemical and biological information. The results highlight the urgent need for better-structured and user-friendly databases to support the ever-evolving interdisciplinary demands of drug discovery while emphasizing improvements in database design to enhance predictive modeling and efficient data processing. The authors conducted a structured survey among students and researchers to assess data coverage, search functionality, and information quality and provided recommendations and suggestions for increasing their quality.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143802010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Look What You Made Me Do: Discerning Feature for Classification of Endocrine-Disrupting Chemical Binding to Steroid Hormone Receptors.","authors":"Azam Rashidian, Sini Pitkänen, Vinicius Goncalves Maltarollo, Ulrich Schoppmeier, Ekaterina Shevchenko, Prasanthi Medarametla, Antti Poso, Jenni Küblbeck, Paavo Honkakoski, Thales Kronenberger","doi":"10.1021/acs.jcim.4c02288","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02288","url":null,"abstract":"<p><p>Exposure to metabolism-disrupting chemicals, which are a specific type of endocrine-disrupting chemical (EDC), is linked to metabolic problems such as dyslipidemia, insulin resistance, and hepatic steatosis. Steroid hormone receptors (SHRs) within the nuclear receptor superfamily are well-known targets for EDCs in reproductive tissues and, to a lesser extent, in the liver. In this study, we investigated how five well-established SHR ligands and eight EDCs including pesticides, plasticizers, pharmaceuticals, flame retardants, industrial chemicals, and their metabolites affect estrogen (ERα in reproductive tissues) and glucocorticoid (GR in the liver) receptors. We investigated the utility of structural molecular modeling to classify EDC binding to ERα and GR. To this end, we modeled the binding of a set of EDCs to ER and GR using unbiased all-atom long-time-scale molecular dynamics (MD) simulations and compared them against established SHR agonists and antagonists. We systematically evaluated MD-derived variables such as protein-ligand interactions and binding energy, folding secondary structure elements, distances, and angles as relevant parameters. Our findings suggest that the well-established H12 folding and conformational angles can be discerning features for binding of EDCs to SHRs. Although SHR activation often involves changes in H12 folding and geometry, GR displayed less flexibility in this region, suggesting that protein-ligand interaction and binding energy are more relevant for its classification. 
We show that MD simulations combined with experimental assays can be a useful tool for studying novel EDCs by providing relevant structural features for their classification.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143809984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HEPOM: Using Graph Neural Networks for the Accelerated Predictions of Hydrolysis Free Energies in Different pH Conditions.","authors":"Rishabh D Guha, Santiago Vargas, Evan Walter Clark Spotte-Smith, Alexander Rizzolo Epstein, Maxwell Venetos, Ryan Kingsbury, Mingjian Wen, Samuel M Blau, Kristin A Persson","doi":"10.1021/acs.jcim.4c02443","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02443","url":null,"abstract":"<p><p>Hydrolysis is a fundamental family of chemical reactions where water facilitates the cleavage of bonds. The process is ubiquitous in biological and chemical systems, owing to water's remarkable versatility as a solvent. However, accurately predicting the feasibility of hydrolysis through computational techniques is a difficult task, as subtle changes in reactant structure like heteroatom substitutions or neighboring functional groups can influence the reaction outcome. Furthermore, hydrolysis is sensitive to the pH of the aqueous medium, and the same reaction can have different reaction properties at different pH conditions. In this work, we have combined reaction templates and high-throughput ab initio calculations to construct a diverse data set of hydrolysis free energies. The developed framework automatically identifies reaction centers, generates hydrolysis products, and utilizes a trained graph neural network (GNN) model to predict Δ<i>G</i> values for all potential hydrolysis reactions in a given molecule. 
The long-term goal of the work is to develop a data-driven, computational tool for high-throughput screening of pH-specific hydrolytic stability and the rapid prediction of reaction products, which can then be applied in a wide array of applications including chemical recycling of polymers and ion-conducting membranes for clean energy generation and storage.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143778524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}