{"title":"Structural Flexibility and Shape Similarity Contribute to Exclusive Functions of Certain Atg8 Isoforms in the Autophagy Process.","authors":"Alexey Rayevsky, Eliah Bulgakov, Mariia Stykhylias, Sergey Ozheredov, Svetlana Spivak, Yaroslav Blume","doi":"10.1002/minf.70004","DOIUrl":"https://doi.org/10.1002/minf.70004","url":null,"abstract":"<p><p>Despite the abundance of systematically collected experimental data and facts, the multistep process of autophagy still contains many dark spots. One concerns the background selectivity of interactions between certain autophagy-related protein (ATG8) isoforms and their receptors/adaptors in plants during the autophagy process. By regulating phagophore initiation, expansion, and maturation, these proteins control the assembly of numerous autophagy proteins at this key docking platform. Bioinformatics analysis of human, yeast, and plant ATG8 amino acid sequences allows us to build a sequence tree of plant ATG8s, divided into three groups. We perform a structural study aimed at revealing some of the underlying reasons for the differences in the selectivity of ATG8 isoforms. A series of molecular dynamics (MD) simulations are performed to explain the stage-dependent functionality of ATG8. The conserved secondary structure and folding across all ATG8 proteins, resulting in nearly identical protein-protein interaction interfaces, make this study particularly important and interesting. Recognizing the dual role of the LC3 interacting region (LIR) in autophagosome biogenesis and recruitment of the anchored selective autophagy receptor (SAR), we perform a mobility domain analysis. To this end, the amino acid sequence associated with the LIR docking site (LDS) interface is localized and subjected to root mean square deviation (RMSD)-based clustering analysis. 
Starting from Atg8-targeted protein-peptide docking, we attempt to identify conformational changes in the contact region of the corresponding adaptors and receptors involved in the common biogenesis events in autophagy. For the molecular dynamics, we select three representatives, sharing common patterns with other members of the groups. The resulting ATG8-peptide complexes display a significant preference for binding specific partners by different ATG8 isotypes.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"44 7","pages":"e202500025"},"PeriodicalIF":2.8,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144659700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Network Analysis of the Organic Chemistry in Patents, Literature, and Pharmaceutical Industry.","authors":"Emma Svensson, Emma Granqvist, Tomas Bastys, Christos Kannas, Mikhail Kabeshov, Samuel Genheden, Ola Engkvist, Thierry Kogej","doi":"10.1002/minf.202500011","DOIUrl":"10.1002/minf.202500011","url":null,"abstract":"<p><p>Chemical reactions can be connected in large networks such as knowledge graphs. In this way, prior work has been able to draw meaningful conclusions about the properties and structures involved in organic chemistry reactions. However, the research has focused on public sources of organic synthesis that might lack the intricate details of the synthetic routes used in in-house drug discovery. In this work, previous analyses are expanded to also include an in-house electronic lab notebook (ELN) source, such that we can compare it to knowledge graphs that were constructed from US Patent and Trademark Office (USPTO) and Reaxys. We found that the Reaxys knowledge graph is the most interconnected and has the largest proportion of nodes belonging to the core, whereas the USPTO is much less connected and only has a small core. The ELN knowledge graph falls between these extremes in connectivity and it does not have any core. The hub molecules of ELN and USPTO are most similar, primarily represented by small, organic building blocks. We hypothesize that these differences can be attributed to the different origins of the data in the three sources. 
We discuss what impact this might have on synthesis prediction modelling.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"44 7","pages":"e202500011"},"PeriodicalIF":2.8,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12273192/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144659699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neural Network Models for Prediction of Biological Activity using Molecular Dynamics Data: A Case of Photoswitchable Peptides.","authors":"Anton Cherednichenko, Sergii Afonin, Oleg Babii, Taras Voitsitskyi, Roman Stratiichuk, Ihor Koleiev, Volodymyr Vozniak, Nazar Shevchuk, Zakhar Ostrovsky, Semen Yesylevskyy, Alan Nafiiev, Serhii Starosyla, Anne S Ulrich, Aigars Jirgensons, Igor V Komarov","doi":"10.1002/minf.70001","DOIUrl":"10.1002/minf.70001","url":null,"abstract":"<p><p>Prediction of the biological activities of chemical compounds by machine learning techniques in general, and neural networks (NNs) in particular, is usually based on analysis of their binding to the target of interest. If such affinity data are not available, ligand-based approaches can be used, in which NN models are trained to assess the similarity of compounds to those with known biological activity. Obviously, this approach only works well if the similarity between the training set and the evaluated molecules is sufficiently high. In the case of large and conformationally flexible organic compounds, the activity becomes dependent not only on chemical identity but also on the dynamics of molecular motions, which imposes significant challenges to existing approaches based on static structural 2D and 3D molecular descriptors. A prominent example of compounds, which are especially challenging for existing NN activity prediction techniques, are photoswitchable macrocyclic peptides containing a diarylethene \"photoswitch\" (DAE). These molecules exist in two isomeric forms with remarkably different biological activities, which are interconvertible by light of different wavelengths. Activity prediction models have to distinguish in this case not only between the different peptides but also between the photoisomers of the same peptide. 
In this work, we demonstrate that the features extracted from classical molecular dynamics (MD) trajectories are superior to conventional 2D or 3D descriptor-based features when used in activity prediction NN models of DAE-containing photoswitchable peptides. Using MD-derived features, we successfully created two NN models that predict activities of photoswitchable peptidomimetics, analogs of the natural peptidic antibiotic gramicidin S. The first model precisely predicts the cytotoxic activity of similar peptide analogs. The second model reliably predicts the differences in the biological activities of DAE photoisomers of the same peptide, even if the type of its activity differs from one in the training dataset. Our results demonstrate that accounting for MD-derived dynamic features allows generalizing the ligand-based activity prediction NN models to the cases of large and conformationally flexible molecules, which were previously considered intractable by this class of models.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"44 7","pages":"e70001"},"PeriodicalIF":2.8,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12257427/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144626740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rapid Assessment of Virtually Synthesizable Chemical Structures via Support Vector Machine Models.","authors":"Yuto Iwasaki, Tomoyuki Miyao","doi":"10.1002/minf.70000","DOIUrl":"10.1002/minf.70000","url":null,"abstract":"<p><p>Support vector machine (SVM) and support vector regression (SVR) are widely used for building quantitative structure-activity relationship models for small- and medium-sized datasets. Although SVM and SVR models can efficiently predict compound activity, evaluating billions of molecules, as is sometimes required when screening virtual molecules derived through virtual synthesis, remains challenging. Herein, we present an SVM-/SVR-based method for screening virtually synthesizable molecules based on their reactants. The proposed method employs a combination of reactant-wise kernel functions for fast evaluation without sacrificing prediction accuracy. Tested on 120 small molecular activity datasets against 10 macromolecule targets, the proposed SVR models with data augmentation performed on par with standard SVR models with the Tanimoto kernel. As a demonstration, an exhaustive set of 6.4 × 10<sup>12</sup> reactant combinations was evaluated by an SVR model within 8 days on a single desktop computer, enabling large-scale screening without sampling.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"44 7","pages":"e202500039"},"PeriodicalIF":2.8,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12278806/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144675311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"In Silico Identification of Novel and Potent Inhibitors Against Mutant BRAF (V600E), MD Simulations, Free Energy Calculations, and Experimental Determination of Binding Affinity.","authors":"Vikas Yadav, Mohammad Kashif, Zenab Kamali, Samudrala Gourinath, Naidu Subbarao","doi":"10.1002/minf.202400372","DOIUrl":"https://doi.org/10.1002/minf.202400372","url":null,"abstract":"<p><p>BRAF is a proto-oncogene that functions as a key signal transducer in the MAPK-ERK pathway, which regulates cell growth, division, and survival. Mutations in BRAF, particularly the V600E substitution in its kinase domain, are major drivers in melanoma and several other metastatic cancers, including breast, colorectal, NSCLC, and gastrointestinal cancers. In this study, novel inhibitors targeting the BRAF(V600E) mutant are identified using a structure-based drug design approach. Four chemical libraries (ChemDiv Kinase, ChemDiv Anticancer, NCI, and ChEMBL Kinase SARfari) are screened. Compounds from the ChemDiv Anticancer database show better Glide scores, comparable to that of the FDA-approved BRAF inhibitor Vemurafenib. The compounds P184-1419 and P184-1479 score -12.688 and -12.012 kcal/mol, respectively, versus -14.288 kcal/mol for Vemurafenib. Top hits are further validated using GOLD docking, X-score ranking, and interaction profiling via LigPlot. Molecular dynamics simulations, principal component analysis, and free energy calculations confirm the stability of protein-ligand complexes. Biolayer interferometry assays reveal that P184-1419 exhibits stronger binding affinity (KD = 151 μM) than Vemurafenib (KD = 437 μM). 
These findings suggest P184-1419 is a promising lead compound against BRAF(V600E), offering potential for future development of more effective cancer therapies.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"44 5-6","pages":"e2400372"},"PeriodicalIF":2.8,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144310149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Drug Search and Design Considering Cell Specificity of Chemically Induced Gene Expression Profiles for Disease-Associated Tissues.","authors":"Chikashige Yamanaka, Michio Iwata, Kazuma Kaitoh, Yoshihiro Yamanishi","doi":"10.1002/minf.2444","DOIUrl":"10.1002/minf.2444","url":null,"abstract":"<p><p>The use of omics data, including gene expression profiles, has recently gained increasing attention in drug discovery. Omics-based drug searches and designs are often based on the correlations between chemically induced and disease-induced gene expression profiles; however, the cell specificity has not been considered. In this study, we designed a novel computational method for drug search and design using cell-specific correlations between drugs and diseases. A data completion technique allowed the characterization of cell-specific gene expression patterns in diseased cells. This proposed method was applied to search for drug candidates and generate new chemical structures for gastric cancer and atopic dermatitis. The results of drug search demonstrated that compounds with diverse chemical structures were detected and were associated with target diseases at the molecular pathway levels. The results of drug design also demonstrated that newly generated compounds were reasonable in terms of the reproducibility of registered drugs. 
The proposed method is expected to be useful for omics-based drug discovery.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"44 5-6","pages":"e2444"},"PeriodicalIF":2.8,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12188700/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144485147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing the Reliability of Integrated Consensus Strategies to Boost Docking-Based Screening Campaigns Using Publicly Available Docking Programs.","authors":"Valeria Scardino, M Justina Galarce, M Emilia Mignone, Claudio N Cavasotto","doi":"10.1002/minf.2445","DOIUrl":"https://doi.org/10.1002/minf.2445","url":null,"abstract":"<p><p>The use of docking-based virtual screening is today an established critical component within the drug discovery pipeline. In the context where the performance of molecular docking has been found to depend on the protein target and the program, consensus docking has been found to be a valuable approach to enhance the performance of high-throughput docking (HTD). We present and evaluate an integrated pose and ranking consensus approach that combines the advantages of pose consensus and the exponential consensus ranking (ECR) approach, using only publicly available docking programs (rDock, DOCK 6, AutoDock 4, PLANTS, and Vina). Based on a thorough analysis performed to assess the optimal combination of matching poses and ECR thresholds, using a benchmarking set of 50 protein targets of diverse families and different property-matched ligand/decoy libraries, this enhanced pose/ranking consensus approach displayed notably better performance than the individual docking programs and the ECR. This approach was also evaluated in HTD campaigns using larger libraries (∼1.1 million molecules) on six targets, thus obtaining an average improvement of the ECR of about 40%. 
We thus may say that this pose/ranking consensus methodology can be confidently used in prospective HTD campaigns using freely available docking programs.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"44 5-6","pages":"e2445"},"PeriodicalIF":2.8,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144333535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spherical GTM: A New Proposition for Visualization of Chemical Data.","authors":"Farah Asgarkhanova, Marcou Gilles, Mikhail Volkov, Murielle Muzard, Richard Plantier-Royon, Caroline Rémond, Dragos Horvath, Alexandre Varnek","doi":"10.1002/minf.202500045","DOIUrl":"10.1002/minf.202500045","url":null,"abstract":"<p><p>The Spherical Generative Topographic Mapping (SGTM) method represents an intuitive approach to visualize chemical data. Unlike the original Generative Topographic Mapping algorithm, which utilizes a bounded flat Euclidean space as a manifold, our proposed modification introduces a spherical manifold to address known nonflat topology issues. In this study, we describe the mathematical formalism of this new approach and showcase its ability to visualize 2D electron density patterns of water and benzene and the CosMoPoly chemical library, an enumeration of synthetically accessible molecules. By comparing the outcomes with established references, it is demonstrated that SGTM emerges as a novel 3D data visualization method, offering improved accuracy in the depiction of chemical structures.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"44 5-6","pages":"e2500045"},"PeriodicalIF":2.8,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12186103/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144302555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Development of Machine Learning-Based Models for Mutagenicity Predictions with Applications to Non-Sugar Sweeteners.","authors":"Shilpayan Ghosh, Vinay Kumar, Kunal Roy","doi":"10.1002/minf.202400357","DOIUrl":"https://doi.org/10.1002/minf.202400357","url":null,"abstract":"<p><p>Artificial sweeteners, often known as non-sugar sweeteners (NSSs), have been utilized as food additives since World War II. However, there is also concern regarding the mutagenicity potential of NSSs. Every new chemical registration in the food and pharmaceutical industries requires an evaluation of its mutagenic potential, which is essential for food safety. Most of the studies focus solely on determining the mutagenicity of NSSs through in vivo trials, which may be troublesome in terms of the time and cost required for experimental evaluation. To avoid the associated complexities concerning experimentation, a new approach methodology by developing machine learning (ML) models for mutagenicity predictions and selecting the best models by a stringent cross-validation analysis is explored. Two random splits (50/50) of a dataset of 6881 organic compounds for model development are used. Consensus predictions are provided for the mutagenic potential of an external set of 332 NSSs using six selected models (three best ML models based on cross-validation using either data splitting strategy) through voting and considering the applicability domain using two different approaches. In addition, to check the reliability of predictions, the model-derived consensus predictions have also been compared to the predictions generated by the k-nearest neighbor method using the virtual models for property evaluation of chemicals within a global architecture platform and the consensus method generated in the toxicity estimation software tool platform. Finally, based on the analysis, six compounds could be prioritized as mutagenic NSSs in this investigation. 
The developed models have been made available from https://sites.google.com/jadavpuruniversity.in/dtc-lab-software/home/mutagenicity-predictor.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"44 5-6","pages":"e2400357"},"PeriodicalIF":2.8,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144302554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Modeling of Gain-of-Function Mutations on Androgen Receptor.","authors":"Jiaying You, Jane Foo, Nada Lallous, Artem Cherkasov","doi":"10.1002/minf.202500018","DOIUrl":"https://doi.org/10.1002/minf.202500018","url":null,"abstract":"<p><p>The efficiency of Androgen Receptor (AR) pathway inhibitors for prostate cancer (PCa) is in decline due to resistance mechanisms, including the occurrence of gain-of-function mutations in the human androgen receptor (AR). Hence, understanding and predicting such mutations is crucial for developing effective PCa treatment strategies. Leveraging accumulated data on clinically relevant AR mutants with recent advances in deep modeling techniques, this study aims to unveil and quantify critical AR mutation-drug relationships. By incorporating molecular descriptors for drugs and mutated gene sequences, this work represents these features as single vectors and demonstrates their effectiveness in modeling AR mutant responses to conventional antiandrogens. The developed approach achieves above 80% accuracy in predicting the gain-of-function behavior of AR mutants and therefore can potentially uncover unknown agonist/antagonist relationships among mutant-drug pairs.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"44 4","pages":"e202500018"},"PeriodicalIF":2.8,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144035853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}