Lucas A. Garro, Matias F. Andrada, Esteban G. Vega-Hissi, Sonia Barberis, Juan C. Garro Martinez
{"title":"Development of QSARs for cysteine-containing di- and tripeptides with antioxidant activity:influence of the cysteine position","authors":"Lucas A. Garro, Matias F. Andrada, Esteban G. Vega-Hissi, Sonia Barberis, Juan C. Garro Martinez","doi":"10.1007/s10822-024-00567-z","DOIUrl":"10.1007/s10822-024-00567-z","url":null,"abstract":"<div><p>Antioxidants agents play an essential role in the food industry for improving the oxidative stability of food products. In the last years, the search for new natural antioxidants has increased due to the potential high toxicity of chemical additives. Therefore, the synthesis and evaluation of the antioxidant activity in peptides is a field of current research. In this study, we performed a Quantitative Structure Activity Relationship analysis (QSAR) of cysteine-containing 19 dipeptides and 19 tripeptides. The main objective is to bring information on the relationship between the structure of peptides and their antioxidant activity. For this purpose, 1D and 2D molecular descriptors were calculated using the PaDEL software, which provides information about the structure, shape, size, charge, polarity, solubility and other aspects of the compounds. Different QSAR model for di- and tripeptides were developed. The statistic parameters for di-peptides model (R<sup>2</sup><sub>train</sub> = 0.947 and R<sup>2</sup><sub>test</sub> = 0.804) and for tripeptide models (R<sup>2</sup><sub>train</sub> = 0.923 and R<sup>2</sup><sub>test</sub> = 0.847) indicate that the generated models have high predictive capacity. Then, the influence of the cysteine position was analyzed predicting the antioxidant activity for new di- and tripeptides, and comparing them with glutathione. In dipeptides, excepting SC, TC and VC, the activity increases when cysteine is at the N-terminal position. 
For tripeptides, we observed a notable increase in activity when cysteine is placed in the N-terminal position.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":null,"pages":null},"PeriodicalIF":3.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141873885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Laura Guasch, Niels Maeder, John G. Cumming, Christian Kramer
{"title":"From mundane to surprising nonadditivity: drivers and impact on ML models","authors":"Laura Guasch, Niels Maeder, John G. Cumming, Christian Kramer","doi":"10.1007/s10822-024-00566-0","DOIUrl":"10.1007/s10822-024-00566-0","url":null,"abstract":"<div><p>Nonadditivity (NA) in Structure-Activity and Structure-Property Relationship (SAR) data is a rare but very information rich phenomenon. It can indicate conformational flexibility, structural rearrangements, and errors in assay results and structural assignment. While purely ligand-based conformational causes of NA are rather well understood and mundane, other factors are less so and cause surprising NA that has a huge influence on SAR analysis and ML model performance. We here report a systematic analysis across a wide range of properties (20 on-target biological activities and 4 physicochemical ADME-related properties) to understand the frequency of various different phenomena that may lead to NA. A set of novel descriptors were developed to characterize double transformation cycles and identify trends in NA. Double transformation cycles were classified into “surprising” and “mundane” categories, with the majority being classed as mundane. We also examined commonalities among surprising cycles, finding LogP differences to have the most significant impact on NA. A distinct behavior of NA for on-target sets compared to ADME sets was observed. 
Finally, we show that machine learning models struggle with highly nonadditive data, indicating that a better understanding of NA is an important future research direction.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":null,"pages":null},"PeriodicalIF":3.0,"publicationDate":"2024-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141756486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alexander C. Brueckner, Benjamin Shields, Palani Kirubakaran, Alexander Suponya, Manoranjan Panda, Shana L. Posy, Stephen Johnson, Sirish Kaushik Lakkaraju
{"title":"MDFit: automated molecular simulations workflow enables high throughput assessment of ligands-protein dynamics","authors":"Alexander C. Brueckner, Benjamin Shields, Palani Kirubakaran, Alexander Suponya, Manoranjan Panda, Shana L. Posy, Stephen Johnson, Sirish Kaushik Lakkaraju","doi":"10.1007/s10822-024-00564-2","DOIUrl":"10.1007/s10822-024-00564-2","url":null,"abstract":"<div><p>Molecular dynamics (MD) simulation is a powerful tool for characterizing ligand–protein conformational dynamics and offers significant advantages over docking and other rigid structure-based computational methods. However, setting up, running, and analyzing MD simulations continues to be a multi-step process making it cumbersome to assess a library of ligands in a protein binding pocket using MD. We present an automated workflow that streamlines setting up, running, and analyzing Desmond MD simulations for protein–ligand complexes using machine learning (ML) models. The workflow takes a library of pre-docked ligands and a prepared protein structure as input, sets up and runs MD with each protein–ligand complex, and generates simulation fingerprints for each ligand. Simulation fingerprints (SimFP) capture protein–ligand compatibility, including stability of different ligand-pocket interactions and other useful metrics that enable easy rank-ordering of the ligand library for pocket optimization. SimFPs from a ligand library are used to build & deploy ML models that predict binding assay outcomes and automatically infer important interactions. Unlike relative free-energy methods that are constrained to assess ligands with high chemical similarity, ML models based on SimFPs can accommodate diverse ligand sets. 
We present two case studies on how SimFP helps delineate structure–activity relationship (SAR) trends and explain potency differences across matched-molecular pairs of (1) cyclic peptides targeting PD-L1 and (2) small molecule inhibitors targeting CDK9.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":null,"pages":null},"PeriodicalIF":3.0,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10822-024-00564-2.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141625626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wen-Chieh Huang, Chia-Hung Hsu, Titus V. Albu, Chia-Ning Yang
{"title":"Structural impacts of two disease-linked ADAR1 mutants: a molecular dynamics study","authors":"Wen-Chieh Huang, Chia-Hung Hsu, Titus V. Albu, Chia-Ning Yang","doi":"10.1007/s10822-024-00565-1","DOIUrl":"10.1007/s10822-024-00565-1","url":null,"abstract":"<div><p>Adenosine deaminases acting on RNA (ADARs) are pivotal RNA-editing enzymes responsible for converting adenosine to inosine within double-stranded RNA (dsRNA). Dysregulation of ADAR1 editing activity, often arising from genetic mutations, has been linked to elevated interferon levels and the onset of autoinflammatory diseases. However, understanding the molecular underpinnings of this dysregulation is impeded by the lack of an experimentally determined structure for the ADAR1 deaminase domain. In this computational study, we utilized homology modeling and the AlphaFold2 to construct structural models of the ADAR1 deaminase domain in wild-type and two pathogenic variants, R892H and Y1112F, to decipher the structural impact on the reduced deaminase activity. Our findings illuminate the critical role of structural complementarity between the ADAR1 deaminase domain and dsRNA in enzyme-substrate recognition. That is, the relative position of E1008 and K1120 must be maintained so that they can insert into the minor and major grooves of the substrate dsRNA, respectively, facilitating the flipping-out of adenosine to be accommodated within a cavity surrounding E912. 
Both amino acid replacements studied, R892H at the orthosteric site and Y1112F at the allosteric site, alter K1120 position and ultimately hinder substrate RNA binding.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":null,"pages":null},"PeriodicalIF":3.0,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141625551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Konrad Diedrich, Christiane Ehrt, Joel Graef, Martin Poppinga, Norbert Ritter, Matthias Rarey
{"title":"User-centric design of a 3D search interface for protein-ligand complexes","authors":"Konrad Diedrich, Christiane Ehrt, Joel Graef, Martin Poppinga, Norbert Ritter, Matthias Rarey","doi":"10.1007/s10822-024-00563-3","DOIUrl":"10.1007/s10822-024-00563-3","url":null,"abstract":"<div><p>In this work, we present the frontend of GeoMine and showcase its application, focusing on the new features of its latest version. GeoMine is a search engine for ligand-bound and predicted empty binding sites in the Protein Data Bank. In addition to its basic text-based search functionalities, GeoMine offers a geometric query type for searching binding sites with a specific relative spatial arrangement of chemical features such as heavy atoms and intermolecular interactions. In contrast to a text search that requires simple and easy-to-formulate user input, a 3D input is more complex, and its specification can be challenging for users. GeoMine’s new version aims to address this issue from the graphical user interface perspective by introducing an additional visualization concept and a new query template type. In its latest version, GeoMine extends its query-building capabilities primarily through input formulation in 2D. The 2D editor is fully synchronized with GeoMine’s 3D editor and provides the same functionality. It enables template-free query generation and template-based query selection directly in 2D pose diagrams. In addition, the query generation with the 3D editor now supports predicted empty binding sites for AlphaFold structures as query templates. 
GeoMine is freely accessible on the Proteins<i>Plus</i> web server (https://proteins.plus).</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":null,"pages":null},"PeriodicalIF":3.0,"publicationDate":"2024-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11139749/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141173926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robert X. Song, Marc C. Nicklaus, Nadya I. Tarasova
{"title":"Correlation of protein binding pocket properties with hits’ chemistries used in generation of ultra-large virtual libraries","authors":"Robert X. Song, Marc C. Nicklaus, Nadya I. Tarasova","doi":"10.1007/s10822-024-00562-4","DOIUrl":"10.1007/s10822-024-00562-4","url":null,"abstract":"<div><p>Although the size of virtual libraries of synthesizable compounds is growing rapidly, we are still enumerating only tiny fractions of the drug-like chemical universe. Our capability to mine these newly generated libraries also lags their growth. That is why fragment-based approaches that utilize on-demand virtual combinatorial libraries are gaining popularity in drug discovery. These <i>à la carte</i> libraries utilize synthetic blocks found to be effective binders in parts of target protein pockets and a variety of reliable chemistries to connect them. There is, however, no data on the potential impact of the chemistries used for making on-demand libraries on the hit rates during virtual screening. There are also no rules to guide in the selection of these synthetic methods for production of custom libraries. We have used the SAVI (Synthetically Accessible Virtual Inventory) library, constructed using 53 reliable reaction types (transforms), to evaluate the impact of these chemistries on docking hit rates for 40 well-characterized protein pockets. The data shows that the virtual hit rates differ significantly for different chemistries with cross coupling reactions such as Sonogashira, Suzuki–Miyaura, Hiyama and Liebeskind–Srogl coupling producing the highest hit rates. Virtual hit rates appear to depend not only on the property of the formed chemical bond but also on the diversity of available building blocks and the scope of the reaction. 
The data identifies reactions that deserve wider use through increasing the number of corresponding building blocks and suggests the reactions that are more effective for pockets with certain physical and hydrogen bond-forming properties.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":null,"pages":null},"PeriodicalIF":3.0,"publicationDate":"2024-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11098933/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140943273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aaron D. Danilack, Callum J. Dickson, Cihan Soylu, Mike Fortunato, Stephane Rodde, Hagen Munkler, Viktor Hornak, Jose S. Duca
{"title":"Reactivities of acrylamide warheads toward cysteine targets: a QM/ML approach to covalent inhibitor design","authors":"Aaron D. Danilack, Callum J. Dickson, Cihan Soylu, Mike Fortunato, Stephane Rodde, Hagen Munkler, Viktor Hornak, Jose S. Duca","doi":"10.1007/s10822-024-00560-6","DOIUrl":"10.1007/s10822-024-00560-6","url":null,"abstract":"<div><p>Covalent inhibition offers many advantages over non-covalent inhibition, but covalent warhead reactivity must be carefully balanced to maintain potency while avoiding unwanted side effects. While warhead reactivities are commonly measured with assays, a computational model to predict warhead reactivities could be useful for several aspects of the covalent inhibitor design process. Studies have shown correlations between covalent warhead reactivities and quantum mechanic (QM) properties that describe important aspects of the covalent reaction mechanism. However, the models from these studies are often linear regression equations and can have limitations associated with their usage. Applications of machine learning (ML) models to predict covalent warhead reactivities with QM descriptors are not extensively seen in the literature. This study uses QM descriptors, calculated at different levels of theory, to train ML models to predict reactivities of covalent acrylamide warheads. The QM/ML models are compared with linear regression models built upon the same QM descriptors and with ML models trained on structure-based features like Morgan fingerprints and RDKit descriptors. Experiments show that the QM/ML models outperform the linear regression models and the structure-based ML models, and literature test sets demonstrate the power of the QM/ML models to predict reactivities of unseen acrylamide warhead scaffolds. 
Ultimately, these QM/ML models are effective, computationally feasible tools that can expedite the design of new covalent inhibitors.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":null,"pages":null},"PeriodicalIF":3.0,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140836933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"De novo drug design as GPT language modeling: large chemistry models with supervised and reinforcement learning","authors":"Gavin Ye","doi":"10.1007/s10822-024-00559-z","DOIUrl":"10.1007/s10822-024-00559-z","url":null,"abstract":"<div><p>In recent years, generative machine learning algorithms have been successful in designing innovative drug-like molecules. SMILES is a sequence-like language used in most effective drug design models. Due to data’s sequential structure, models such as recurrent neural networks and transformers can design pharmacological compounds with optimized efficacy. Large language models have advanced recently, but their implications on drug design have not yet been explored. Although one study successfully pre-trained a <i>large chemistry model</i> (LCM), its application to specific tasks in drug discovery is unknown. In this study, the drug design task is modeled as a causal language modeling problem. Thus, the procedure of reward modeling, supervised fine-tuning, and proximal policy optimization was used to transfer the LCM to drug design, similar to Open AI’s ChatGPT and InstructGPT procedures. By combining the SMILES sequence with chemical descriptors, the novel efficacy evaluation model exceeded its performance compared to previous studies. After proximal policy optimization, the drug design model generated molecules with 99.2% having efficacy pIC<sub>50</sub> > 7 towards the amyloid precursor protein, with 100% of the generated molecules being valid and novel. This demonstrated the applicability of LCMs in drug discovery, with benefits including less data consumption while fine-tuning. The applicability of LCMs to drug discovery opens the door for larger studies involving reinforcement-learning with human feedback, where chemists provide feedback to LCMs and generate higher-quality molecules. 
LCMs’ ability to design similar molecules from datasets paves the way for more accessible, non-patented alternatives to drug molecules.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":null,"pages":null},"PeriodicalIF":3.0,"publicationDate":"2024-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10822-024-00559-z.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140673020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ann E. Cleves, Ajay N. Jain, David A. Demeter, Zachary A. Buchan, Jeremy Wilmot, Erin N. Hancock
{"title":"From UK-2A to florylpicoxamid: Active learning to identify a mimic of a macrocyclic natural product","authors":"Ann E. Cleves, Ajay N. Jain, David A. Demeter, Zachary A. Buchan, Jeremy Wilmot, Erin N. Hancock","doi":"10.1007/s10822-024-00555-3","DOIUrl":"10.1007/s10822-024-00555-3","url":null,"abstract":"<div><p>Scaffold replacement as part of an optimization process that requires maintenance of potency, desirable biodistribution, metabolic stability, and considerations of synthesis at very large scale is a complex challenge. Here, we consider a set of over 1000 time-stamped compounds, beginning with a macrocyclic natural-product lead and ending with a broad-spectrum crop anti-fungal. We demonstrate the application of the QuanSA 3D-QSAR method employing an active learning procedure that combines two types of molecular selection. The first identifies compounds predicted to be most active of those most likely to be well-covered by the model. The second identifies compounds predicted to be most <i>informative</i> based on exhibiting low predicted activity but showing high 3D similarity to a highly active nearest-neighbor training molecule. Beginning with just 100 compounds, using a deterministic and automatic procedure, five rounds of 20-compound selection and model refinement identifies the binding metabolic form of florylpicoxamid. We show how iterative refinement broadens the domain of applicability of the successive models while also enhancing predictive accuracy. 
We also demonstrate how a simple method requiring very sparse data can be used to generate relevant ideas for synthetic candidates.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":null,"pages":null},"PeriodicalIF":3.0,"publicationDate":"2024-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10822-024-00555-3.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140614475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Javier Vázquez, Ricardo García, Paula Llinares, F. Javier Luque, Enric Herrero
{"title":"On the relevance of query definition in the performance of 3D ligand-based virtual screening","authors":"Javier Vázquez, Ricardo García, Paula Llinares, F. Javier Luque, Enric Herrero","doi":"10.1007/s10822-024-00561-5","DOIUrl":"10.1007/s10822-024-00561-5","url":null,"abstract":"<div><p>Ligand-based virtual screening (LBVS) methods are widely used to explore the vast chemical space in the search of novel compounds resorting to a variety of properties encoded in 1D, 2D or 3D descriptors. The success of 3D-LBVS is affected by the overlay of molecular pairs, thus making selection of the template compound, search of accessible conformational space and choice of the query conformation to be potential factors that modulate the successful retrieval of actives. This study examines the impact of adopting different choices for the query conformation of the template, paying also attention to the influence exerted by the structural similarity between templates and actives. The analysis is performed using PharmScreen, a 3D LBVS tool that relies on similarity measurements of the hydrophobic/philic pattern of molecules, and Phase Shape, which is based on the alignment of atom triplets followed by refinement of the volume overlap. The study is performed for the original DUD-E<sup>+</sup> database and a Morgan Fingerprint filtered version (denoted DUD-E<sup>+</sup>-Diverse; available in https://github.com/Pharmacelera/Query-models-to-3DLBVS), which was prepared to minimize the 2D resemblance between template and actives. Although in most cases the query conformation exhibits a mild influence on the overall performance, a critical analysis is made to disclose factors, such as the content of structural features between template and actives and the induction of conformational strain in the template, that underlie the drastic impact of the query definition in the recovery of actives for certain targets. 
The findings of this research also provide valuable guidance for assisting the selection of the query definition in 3D LBVS campaigns.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":null,"pages":null},"PeriodicalIF":3.0,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10822-024-00561-5.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140568082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}