Marcin Makowski, Octávio L Franco, Nuno C Santos, Manuel N Melo
{"title":"Lipid Shape as a Membrane Activity Modulator of a Fusogenic Antimicrobial Peptide.","authors":"Marcin Makowski, Octávio L Franco, Nuno C Santos, Manuel N Melo","doi":"10.1021/acs.jcim.4c02020","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02020","url":null,"abstract":"<p><p>An intriguing feature of many bacterial membranes is their prevalence of non-bilayer-forming lipids, such as the cone-shaped phosphatidylethanolamines and cardiolipins. Many membrane-active antimicrobial peptides lower the bilayer-to-hexagonal phase transition energy barrier in membranes containing such types of cone-shaped lipids. Here, we systematically studied how the molecular shape of lipids affects the activity of antimicrobial peptide EcDBS1R4, which is known to be an efficient fusogenic peptide. Using coarse-grained molecular dynamics simulations, we show the ability of EcDBS1R4 to form \"hourglass-shaped\" pores, which is inhibited by cone-shaped lipids. The abundance of cone-shaped lipids further correlates with the propensity of this peptide to oligomerize preferentially in antiparallel dimers. We also observe that EcDBS1R4 promotes the segregation of the anionic lipids. When coupled to dimerization, this charge segregation leads to regions in the bilayer that are devoid of peptides and rich in zwitterionic lipids. 
Our results indicate a protective role of cone-shaped lipids in bacterial membranes against pore-mediated permeabilization by EcDBS1R4.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143661731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced Regioselectivity Prediction of sp<sup>2</sup> C-H Halogenation via Negative Data Augmentation and Multimodel Integration.","authors":"Zhiting Zhang, Jia Qiu, Jiajun Zheng, Zhunzhun Yu, Lebin Su, Qianghua Lin, Chonghuan Zhang, Kuangbiao Liao","doi":"10.1021/acs.jcim.5c00281","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c00281","url":null,"abstract":"<p><p>Efficient molecular editing is pivotal in synthetic chemistry, especially for developing drugs, materials, and high-value chemicals. Electrophilic aromatic substitution (S<sub>E</sub>Ar) reactions, specifically sp<sup>2</sup> C-H halogenation, face significant challenges due to electronic and steric factors, necessitating extensive trial-and-error. This study introduces an innovative machine learning-based model to predict halogenation sites in S<sub>E</sub>Ar reactions, achieving an average accuracy of 93% in 5-fold cross-validation. Employing ensemble techniques, particularly AutoGluon-Tabular (AG), the model demonstrates broad applicability across various aromatic halides, enhancing its utility in drug design, materials science, and more. By reducing experimental uncertainty and optimizing synthetic pathways, this model saves considerable time and resources, thereby accelerating innovation in synthetic chemistry.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143661723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Role of Cre Dynamics in Autoinhibition and Priming.","authors":"Marco A Ramírez-Martínez, Nina Pastor","doi":"10.1021/acs.jcim.4c02405","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02405","url":null,"abstract":"<p><p>Cre recombinase, a powerful tool for genome engineering, associates into an intasome, a tetrameric complex of alternate active and inactive monomers that bring together two <i>loxP</i> sequences, stabilized by key protein-protein and protein-DNA interactions. High-resolution structural information for free Cre is still missing, in contrast to the many structures found for Cre-DNA complexes in the Protein Data Bank, hindering understanding of the initial steps in intasome formation. To approach Cre structure and dynamics, we carried out 100 μs of molecular dynamics simulations of free Cre, starting from five Cre structures from different stages of intasome assembly. In the generated ensemble, the linker connecting the CBD and CAT domains is an intrinsically disordered region (IDR) that promotes different orientations of the two domains. The domains remain folded and interact with each other through short-lived interactions, retaining ∼70% of their surface available for interaction with <i>loxP</i>. The C-terminal Helix N in the CAT domain is also an IDR that interacts with the entire protein, including the active site, transiently forming an autoinhibited complex. The active site can be assembled in the absence of DNA, albeit inefficiently. The CAT domain has a clam-like motion, opening and closing the cavity where helix N docks, establishing protein-protein interactions in the intasome. Helix A in the CBD domain slides over the domain like a windshield wiper, sampling intasome-like conformations, among others. 
The wide range of intramolecular motion sampled by free Cre suggests that it uses conformational selection, using primed DNA-binding surfaces in both domains while assembling into the intasome.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143668483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yi-Qi Chen, Tao Yu, Zheng-Qi Song, Chen-Yu Wang, Jiang-Tao Luo, Yong Xiao, Heng Qiu, Qing-Qing Wang, Hai-Ming Jin
{"title":"Application of Large Language Models in Drug-Induced Osteotoxicity Prediction.","authors":"Yi-Qi Chen, Tao Yu, Zheng-Qi Song, Chen-Yu Wang, Jiang-Tao Luo, Yong Xiao, Heng Qiu, Qing-Qing Wang, Hai-Ming Jin","doi":"10.1021/acs.jcim.5c00275","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c00275","url":null,"abstract":"<p><p>Drug-induced osteotoxicity refers to the harmful effects certain drugs have on the skeletal system, posing significant safety risks. These toxic effects are a key concern in clinical practice, drug development, and environmental management. However, existing toxicity assessment models lack specialized data sets and algorithms for predicting osteotoxicity. In our study, we collected osteotoxic molecules and employed various large language models, including DeepSeek and ChatGPT, alongside traditional machine learning methods to predict their properties. Among these, the DeepSeek R1 and ChatGPT o3 models achieved ACC values of 0.87 and 0.88, respectively. Our results indicate that machine learning methods can assist in evaluating the impact of harmful substances on bone health during drug development, improving safety protocols, mitigating skeletal side effects, and enhancing treatment outcomes and public safety. 
Furthermore, it highlights the potential of large language models in predicting molecular toxicity and their significance in the fields of health and chemical sciences.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143668478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SFM-Net: Selective Fusion of Multiway Protein Feature Network for Predicting Binding Affinity Changes upon Mutations.","authors":"Chunting Liu, Sudong Cai, Tong Pan, Hiroyuki Ogata, Jiangning Song, Tatsuya Akutsu","doi":"10.1021/acs.jcim.5c00130","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c00130","url":null,"abstract":"<p><p>Accurately predicting the effect of mutations on protein-protein interactions (PPIs) is essential for understanding the protein structure and function, as well as providing insights into disease-causing mechanisms. Many recent popular approaches based on the three-dimensional structure of proteins have been proposed to predict the changes in binding affinity caused by mutations, i.e. ΔΔ<i>G</i>. However, how to effectively use the structural information to comprehensively exploit complex interactions within proteins and integrate multisource features remains a significant challenge. In this study, we propose SFM-Net, a powerful deep learning model constructed with GNN-based multiway feature extractors and a new context-aware selective fusion module that jointly leverages the sequence, structural, and evolutionary information. Such design enables SFM-Net to effectively and selectively use features from different sources to facilitate binding affinity change prediction. 
Benchmarking experiments and targeted ablation studies illustrate the effectiveness and robustness of our method for improving the binding affinity change prediction.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143661732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rubens C Souza, Julio C Duarte, Ronaldo R Goldschmidt, Itamar Borges
{"title":"Predicting Fluorescence Emission Wavelengths and Quantum Yields via Machine Learning.","authors":"Rubens C Souza, Julio C Duarte, Ronaldo R Goldschmidt, Itamar Borges","doi":"10.1021/acs.jcim.4c02403","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02403","url":null,"abstract":"<p><p>The search for functional fluorescent organic materials can significantly benefit from the rapid and accurate predictions of photophysical properties. However, screening large numbers of potential fluorophore molecules in different solvents faces limitations of quantum mechanical calculations and experimental measurements. In this work, we develop machine learning (ML) algorithms for predicting the fluorescence of a molecule, focusing on two target properties: emission wavelengths (WLs) and quantum yields (QYs). For this purpose, we employ the Deep4Chem database which contains the optical properties of 20,236 combinations of 7,016 chromophores in 365 different solvents. Several chemical descriptors, or features, were selected as inputs for each model, and each molecule was characterized by its SMILES fingerprint. The Shapley additive explanations (SHAP) technique was used to rationalize the results, showing that the most impactful properties are chromophore-related, as expected from chemical intuition. For the best-performing model, the Random Forest, our results for the test set show a root-mean-square error (RMSE) of 28.8 nm (0.15 eV) for WLs and 0.19 for QYs. The developed ML models were used to predict, thus completing, the missing results for the WL and QY target properties in the original Deep4Chem database, resulting in two new databases: one for each property. 
Testing our ML models for each target property in molecules not included in the original Deep4Chem database gave good results.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143668482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging-Induced Polarization for Drug Discovery: Efficient IC50 Prediction Using Minimal Features.","authors":"Ashraf Mohamed, Bernard R Brooks, Muhamed Amin","doi":"10.1021/acs.jcim.5c00076","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c00076","url":null,"abstract":"<p><p>Here, we use the frequency of the atomic hybridizations (s, sp, sp<sup>2</sup>, and sp<sup>3</sup>) of each atom type (H, C, N, O, S, etc.) within a molecule to predict the IC50s of drug-like molecules, focusing on compounds targeting the Thrombin, Estrogen Receptor alpha, and Phosphodiesterase 5A proteins. The Neural Network and Random Forest models yield high correlation coefficients (<i>R</i><sup>2</sup>) and low mean square error (MSE) using only 19 features. The atomic hybridizations have been used previously to calculate the molecular polarizability using a simple empirical model (Miller et al. <i>JACS</i> <b>1979</b>). We show that the atomic hybridizations may also be used to accurately predict the molecular polarizabilities of these molecules. The results show the importance of the induced polarization in protein-ligand binding. Furthermore, the variation in <i>R</i><sup>2</sup> and MSE for the different target proteins indicates that the contribution of the induced polarization to the binding energies is different for different target proteins.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143668481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised Machine Learning-Based Image Recognition of Raw Infrared Spectra: Toward Chemist-like Chemical Structural Classification and Beyond Numerical Data.","authors":"Kentarou Fuku, Takefumi Yoshida","doi":"10.1021/acs.jcim.4c01644","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c01644","url":null,"abstract":"<p><p>Recent advances in artificial intelligence have significantly improved spectral data analysis. In this study, we used unsupervised machine learning to classify chemical compounds based on infrared (IR) spectral images, without relying on prior chemical knowledge. The potential of machine learning for chemical classification was demonstrated by extracting IR spectral images from the Spectral Database for Organic Compounds and converting them into 208,620-dimensional vector data. Hierarchical clustering of 230 compounds revealed distinct main clusters (<b>A</b>-<b>G</b>), each with specific subclusters exhibiting higher intracluster similarities. Despite the challenges, including sensitivity to spectral deviations and difficulty of distinguishing delicate chemical structures in spectra with low transparency in the fingerprint area, the proposed image recognition approach exhibits good potential. Both principal component analysis and k-means clustering produced similar results. Furthermore, the method demonstrated high robustness to noise. The Tanimoto coefficient was used to evaluate the molecular similarity, providing valuable insights. However, some results deviated from chemists' intuitions. The study also highlighted that the scaling composition formulas and molecular weights did not affect the classification results because high-dimensional features dominated the process. A comparison of the clustering results obtained from molecular fingerprints, using the adjusted Rand index as a metric, indicated that the image data provided better classification performance than numerical data of the same resolution. 
Overall, this study demonstrates the feasibility of using machine learning with IR spectral image data for chemical classification and offers a novel perspective that complements traditional methods, although the classifications may not always align with chemists' intuitions. This approach has broader implications for fields such as drug discovery, materials science, and automated spectral analysis, where handling large, raw spectral data sets is essential.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143661745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mac Kevin E Braza, Özlem Demir, Surl-Hee Ahn, Clare K Morris, Carla Calvó-Tusell, Kelly L McGuire, Bárbara de la Peña Avalos, Michael A Carpenter, Yanjun Chen, Lorenzo Casalino, Hideki Aihara, Mark A Herzik, Reuben S Harris, Rommie E Amaro
{"title":"Regulatory Interactions between APOBEC3B N- and C-Terminal Domains.","authors":"Mac Kevin E Braza, Özlem Demir, Surl-Hee Ahn, Clare K Morris, Carla Calvó-Tusell, Kelly L McGuire, Bárbara de la Peña Avalos, Michael A Carpenter, Yanjun Chen, Lorenzo Casalino, Hideki Aihara, Mark A Herzik, Reuben S Harris, Rommie E Amaro","doi":"10.1021/acs.jcim.4c02272","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02272","url":null,"abstract":"<p><p>APOBEC3B (A3B) is implicated in DNA mutations that facilitate tumor evolution. Although structures of its individual N- and C-terminal domains (NTD and CTD) have been resolved through X-ray crystallography, the full-length A3B (fl-A3B) structure remains elusive, limiting our understanding of its dynamics and mechanisms. In particular, the APOBEC3B C-terminal domain (A3Bctd) is frequently closed in models and structures. In this study, we built several new models of fl-A3B using integrative structural biology methods and selected a top model for further dynamical investigation. We compared the dynamics of the truncated (A3Bctd) to that of the fl-A3B via conventional and Gaussian accelerated molecular dynamics (MD) simulations. Subsequently, we employed weighted ensemble methods to explore the fl-A3B active site opening mechanism, finding that interactions at the NTD-CTD interface enhance the opening frequency of the fl-A3B active site. 
Our findings shed light on the structural dynamics and potential druggability of fl-A3B, including observations regarding both the active and allosteric sites, which may offer new avenues for therapeutic intervention in cancer.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143655484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chengsong Wu, Yuanyuan Ren, Yang Li, Yue Cui, Liyao Zhang, Pan Zhang, Xuejiao Zhang, Shangguang Kan, Chan Zhang, Yuyan Xiong
{"title":"Identification and Experimental Validation of NETosis-Mediated Abdominal Aortic Aneurysm Gene Signature Using Multi-omics, Machine Learning, and Mendelian Randomization.","authors":"Chengsong Wu, Yuanyuan Ren, Yang Li, Yue Cui, Liyao Zhang, Pan Zhang, Xuejiao Zhang, Shangguang Kan, Chan Zhang, Yuyan Xiong","doi":"10.1021/acs.jcim.4c02318","DOIUrl":"https://doi.org/10.1021/acs.jcim.4c02318","url":null,"abstract":"<p><p>Abdominal aortic aneurysm (AAA) is a life-threatening disorder with limited therapeutic options. Neutrophil extracellular traps (NETs) are formed by a process known as \"NETosis\" that has been implicated in AAA pathogenesis, yet the roles and prognostic significance of NET-related genes in AAA remain poorly understood. This study aimed to identify key AAA- and NET-related genes (AAA-NETs-RGs), elucidate their potential mechanisms in contributing to AAA, and explore potential therapeutic compounds for AAA therapy. Through bioinformatics analysis of multiomics and machine learning, we identified six AAA-NETs-RGs: DUSP26, FCN1, MTHFD2, GPRC5C, SEMA4A, and CCR7, which exhibited strong diagnostic potential for predicting AAA progression and were significantly enriched in pathways related to cytokine-cytokine receptor interaction and chemokine signaling. Immune infiltration analysis revealed a causal association between AAA-NETs-RGs and immune cell infiltration. Cell-cell communication analysis indicated that AAA-NETs-RGs predominantly function in smooth muscle cells, B cells, T cells, and NK cells, primarily through cytokine and chemokine signaling. Gene profiling revealed that CCR7 and MTHFD2 exhibited the most significant upregulation in AAA patients compared to non-AAA controls, as well as in <i>in vitro</i> AAA models. Notably, genetic depletion of CCR7 and MTHFD2 strongly inhibited Ang II-induced phenotypic switching, functional impairment, and senescence in vascular smooth muscle cells (VSMCs). 
Based on AAA-NETs-RGs, molecular docking analysis combined with the Connectivity Map (CMap) database identified mirdametinib as a potential therapeutic agent for AAA. Mirdametinib effectively alleviated Ang II-induced phenotypic switching, biological dysfunction, and senescence. These findings provide valuable insights into understanding the pathophysiology of AAA and highlight promising therapeutic strategies targeting AAA-NETs-RGs.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6,"publicationDate":"2025-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143655481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}