{"title":"Molecular Modeling and Chemoinformatics in Ukraine.","authors":"Dmytro M Volochnyuk, Serhiy V Ryabukhin","doi":"10.1002/minf.70034","DOIUrl":"https://doi.org/10.1002/minf.70034","url":null,"abstract":"<p><p>The special issue collects recent contributions from Ukrainian researchers, both from academia and industry, in the field of chemoinformatics. It contains 6 publications from leading Ukrainian scientists in the field. Together, these articles illustrate the broad landscape of chemoinformatics in Ukraine and its deep integration into the global field.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"45 5","pages":"e70034"},"PeriodicalIF":3.1,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147776815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ADME-DTI: Augmented Deep Meta Ensemble for Drug-Target Interaction Prediction.","authors":"Tariq Sha'ban, Ahmad M Mustafa, Mostafa Z Ali, Talal Z Ali","doi":"10.1002/minf.70033","DOIUrl":"https://doi.org/10.1002/minf.70033","url":null,"abstract":"<p><p>Drug-target interaction represents a critical focus area in computational drug discovery and pharmaceutical research. However, the process of identifying and analyzing these interactions is often resource-intensive, requiring extensive experimentation to evaluate the binding relationships between numerous drugs and their respective targets. This complexity is further compounded by the fact that a drug can inhibit multiple targets, and a target may also bind to various drugs. To address these issues, advanced deep learning models have been introduced as promising tools, offering the ability to accurately predict binding affinity and other bioactivity values to distinguish between potential drug-target interactions. The proposed model, named the Augmented Deep Meta Ensemble for Drug-Target Interaction (ADME-DTI), leverages multiple descriptors and fingerprint representations to extract meaningful insights from drug and protein data. The submodels built from each representation are then combined with a deep learning architecture, along with the metadata of the drug and target entries. The proposed approach has demonstrated competitive performance against state-of-the-art models across diverse datasets, as evidenced by key metrics, namely $r_m^2$, concordance index, and mean squared error. Specifically, the model achieved mean squared error values of 0.186 (Davis), 0.118 (KIBA), 0.134 (DTC), 0.262 (Metz), 0.300 (ToxCast), and 0.791 (STITCH), all of which are publicly available benchmark datasets commonly used in drug-target interaction prediction tasks. 
The results highlight the model's effectiveness in reducing prediction errors and improving accuracy. Previous drug-target interaction research often relied on limited, non-diverse datasets, reducing generalizability. Few studies addressed drug-target interaction prediction as a regression task. Our novel deep learning metamodel integrates multiple models and representations, surpassing benchmarks and delivering greater prediction accuracy across varied datasets and evaluation metrics.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"45 5","pages":"e70033"},"PeriodicalIF":3.1,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147776820","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ChemBang: Expanding the Chemical Space Around Small Molecules.","authors":"Diana Montes-Grajales, Luca Menestrina, Ricard Garcia-Serna, Jordi Mestres","doi":"10.1002/minf.70036","DOIUrl":"10.1002/minf.70036","url":null,"abstract":"<p><p>Efficient exploration of chemical space is an essential component of modern generative drug design. Herein, we introduce ChemBang, a computational engine that grows small molecules based on chemical transformations extracted by matched molecular pair analysis of all structures available in catalogues of synthesized molecules. Each chemical transformation is mapped onto its associated atomic environment defined as the substructure within a three-atom radius from the transformation site. Unsupervised chemical evolution is then performed in cycles by systematically applying chemical transformations to all exposed atomic environments present in a seed structure. Multiple physicochemical properties and substructural alerts are incorporated to effectively guide the generation of drug-like synthetically accessible molecules. As a use case, the generation of the Erdafitinib structure from any of its three ring systems (pyrazole, benzene and quinoxaline), and the evolution of the property distributions from all molecules generated in each cycle, are discussed in detail. 
The ability to explore the chemical space of pharmaceutical relevance is shown by successfully generating the exact chemical structure of 95.3% of all 2,809 small-molecule ATC drugs from their constituting fragments.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"45 5","pages":"e70036"},"PeriodicalIF":3.1,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13129508/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147776843","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Constellation Plots in KNIME: An Automated Scaffold-Based Workflow for Interactive Chemical Space Visualization.","authors":"Carlos D Ramírez-Márquez, Edgar López-López, José L Medina-Franco","doi":"10.1002/minf.70035","DOIUrl":"10.1002/minf.70035","url":null,"abstract":"<p><p>Chemical space analysis is extensively used in different chemistry areas, ranging from the study of natural products to drug discovery projects. Its versatility stems from the ability to integrate continuous properties with molecular representations. This data is used to generate visualizations through dimensionality reduction algorithms. Constellation Plots have been proposed as a general approach to the visual representation of chemical space by encoding structural similarity, scaffold contents, frequency, and continuous properties into a single coordinate-based map. Thus, Constellation Plots provide a high-density visual representation of the chemical space of compound datasets with complex relations. Despite the versatility of Constellation Plots, there remains a significant lack of intuitive, user-friendly, or low-code protocols to automate the generation of these plots for non-computational experts. Herein, we present an interactive and automated scaffold-based Constellation Plot workflow developed within the open-source platform KNIME, facilitating chemical space visualization and analysis. To illustrate the application of the workflow, we used a dataset of 5,211 compounds that inhibit Tau protein, a key therapeutic target for Alzheimer's disease. The KNIME workflow is a general resource that can be used to analyze virtually any data set annotated with a property, including biological activity. 
The workflow is freely available at: https://github.com/Daniphantom99/KNIME_Constellation_plots.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"45 5","pages":"e70035"},"PeriodicalIF":3.1,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13129643/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147776757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Attention-Driven Graph Transformer With Nonlinear Modeling and Neuro-Fuzzy Fusion for High-Order Toxic Molecular Graph Learning.","authors":"Phu Pham","doi":"10.1002/minf.70037","DOIUrl":"https://doi.org/10.1002/minf.70037","url":null,"abstract":"<p><p>Learning expressive representations for complex, size-varied molecular graphs remains a fundamental challenge in toxic molecular property prediction and regression. The inherent nonlinearity of atomic interactions, together with intricate structural dependencies and latent high-order structural characteristics, makes accurate graph-based learning particularly difficult. Although recent deep learning (DL)/graph neural network (GNN) approaches have leveraged message passing, attention mechanisms, and graph transformer architectures, these techniques still suffer from limited nonlinear expressiveness, insufficient modeling of high-order structural information, and a lack of explicit uncertainty handling. To address these limitations, we propose a novel AKAGTL model, which is an attention-driven Kolmogorov-Arnold network (KAN)-based graph transformer framework for toxic molecular graph embedding and regression learning. Unlike existing approaches that rely on linear or shallow nonlinear transformations, our proposed AKAGTL model introduces a structured KAN-based functional transformation to explicitly model complex high-order atomic interactions within an attention-driven graph transformer backbone. In addition, high-order structural representations are systematically incorporated to complement structural encoding, while a Gaussian neuro-fuzzy fusion mechanism enables uncertainty-aware aggregation of heterogeneous feature spaces. The proposed framework is evaluated on multiple molecular toxicity benchmarks under a unified experimental protocol with repeated runs. 
Comprehensive experiments on benchmark molecular graph datasets demonstrate that our AKAGTL model consistently improves regression accuracy compared to representative GNN and graph transformer baselines. These findings suggest that jointly modeling nonlinear functional interactions, structural dependencies, and uncertainty-aware fusion provides a more expressive and robust solution for toxic molecular graph embedding and regression learning.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"45 5","pages":"e70037"},"PeriodicalIF":3.1,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147856686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine Learning Models Predicting Solubility and Polymerizability of Polyimides Considering Multiple Monomers for CO<sub>2</sub> Separation Membranes.","authors":"Yuto Shino, Motosuke Katayama, Yuri Ito, Hiromasa Kaneko","doi":"10.1002/minf.70032","DOIUrl":"https://doi.org/10.1002/minf.70032","url":null,"abstract":"<p><p>Membrane technologies for the separation of gases, such as CO<sub>2</sub>/CH<sub>4</sub> mixtures, have attracted attention because of their high energy efficiency. Polyimides are considered promising membrane materials for CO<sub>2</sub> separation, and there is a growing demand for materials with even higher performance. In the screening of candidate materials, it is essential to consider not only separation performance but also solubility and polymerizability during the synthesis process. Low solubility or polymerizability can inhibit membrane fabrication and the evaluation of separation performance, potentially leading to wasted resources and effort. In this study, we developed machine learning models to predict the solubility and polymerizability of polyimides. Mixture features derived from molecular descriptors of multiple monomers and mixing ratios were used as inputs for the classification models. 
The models were then applied to novel candidates, and their effectiveness was validated experimentally.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"45 4","pages":"e70032"},"PeriodicalIF":3.1,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13101753/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147776717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Use of DeepQSAR Models for the Discovery of Peptides With Enhanced Antimicrobial and Antibiofilm Potential.","authors":"Jiaying You, Hazem Mslati, Evan F Haney, Noushin Akhoundsadegh, Robert E W Hancock, Artem Cherkasov","doi":"10.1002/minf.70029","DOIUrl":"10.1002/minf.70029","url":null,"abstract":"<p><p>Increasing concerns regarding prolonged antibiotic usage have spurred the search for alternative treatments. Antimicrobial peptides (AMPs), first discovered in the 1980s, have exhibited significant potential against a broad range of bacteria. Short-sequenced AMPs are abundant in nature and present across various organisms. Recently, machine learning technologies such as Quantitative Structure-Activity Relationship (QSAR) modeling have enabled expedited discovery of potential AMPs with broad-spectrum antibacterial activity as the amount of available AMP training data increases. Among those, DeepQSAR has recently emerged as a distinct type of application that utilizes conventional molecular descriptors in conjunction with more powerful deep learning (DL) models. Here, we demonstrate the power of DeepQSAR in predicting broad-spectrum AMP activity. Using a recurrent neural network-based QSAR model, we achieved nearly 90% fivefold cross-validated accuracy in classifying AMP activity. Using the developed approach, we designed 98 novel peptides, of which 36 experimentally demonstrated more effective antibiofilm activity and 26 peptides exhibited stronger antimicrobial activity compared to the well-characterized host defense peptide IDR-1018. IDR-1018 has been shown to possess broad-spectrum antibiofilm activity against a wide range of bacterial pathogens, and a previous computer-aided peptide design study employing IDR-1018 derivatives successfully identified novel peptides with enhanced antibiofilm activity. 
Notably, 22 of those peptides demonstrated improvements in both antimicrobial and, particularly, antibiofilm properties, making them suitable prototypes for preclinical development and demonstrating the efficacy of DeepQSAR modeling in identifying novel and potent AMPs.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"45 4","pages":"e70029"},"PeriodicalIF":3.1,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13087548/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147699232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Current Insights on Skin Permeability Data and Quantitative Structure-Property Relationship Modeling.","authors":"Farah Asgarkhanova, Shamkhal Baybekov, Gilles Marcou, Catherine Champmartin, Lisa Chedik, Frédéric Cosnier, Mikhail Volkov, Louis Plyer, Dragos Horvath, Alexandre Varnek","doi":"10.1002/minf.70030","DOIUrl":"https://doi.org/10.1002/minf.70030","url":null,"abstract":"<p><p>Skin permeability is a critical factor in pharmaceuticals, cosmetics, and occupational safety. Experimental determination of skin permeability coefficients (K<sub>p</sub>) is time-intensive and resource-intensive, highlighting the importance of computational predictions. This study presents a quantitative structure-property relationship (QSPR) model developed using the recently published SkinPiX dataset and the HuskinDB skin permeability database, comprising 209 curated compounds with associated K<sub>p</sub> values and metadata. The model performance was assessed on three new experimental data points. The datasets and models are freely available, providing a valuable tool to enhance decision-making and support the development of safer and more effective products.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"45 4","pages":"e70030"},"PeriodicalIF":3.1,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147776710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LapGAT: A Semi-Supervised Learning Framework for Drug-Target Interaction Prediction.","authors":"Lianjun Song, Wei Yuan, Xinyu Pei","doi":"10.1002/minf.70028","DOIUrl":"10.1002/minf.70028","url":null,"abstract":"<p><p>Drug-target interaction (DTI) prediction is a fundamental task in the field of drug discovery, with direct implications for the identification of novel therapeutic candidates and the repositioning of existing drugs. However, the practical application of DTI prediction remains hindered by several persistent challenges, including data scarcity, the absence of reliable negative samples, and limited model generalization across diverse biological contexts. Addressing these limitations is crucial for developing robust and generalizable predictive frameworks. In this study, we present LapGAT, a semi-supervised framework combining graph-enhanced Laplacian regularized least squares (LapRLS) with a graph attention network (GAT) to address these issues. In the upstream stage, LapRLS fuses multiple drug-target similarity matrices, applies Laplacian regularization, and selects top- and bottom-scoring pairs as high-confidence positive and negative samples. In the downstream stage, a multilayer GAT learns from these pseudo-labeled interactions, capturing both local graph structures and nonlinear dependencies. We validate LapGAT on four target categories (enzymes, G-protein-coupled receptors (GPCRs), ion channels, nuclear receptors), demonstrating robust performance in computational and experimental validation. Molecular docking (AutoDock Vina) confirms the physical plausibility of top-ranked predictions, with binding affinities ranging from -4.5 to -7.3 kcal/mol. Literature-based validation achieves accuracies of 81.3% (enzymes), 100% (GPCRs), 71.4% (ion channels), and 88.9% (nuclear receptors). 
This work offers a scalable and flexible computational tool for accelerating drug discovery efforts, with the potential for broad applicability across various therapeutic domains.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"45 4","pages":"e70028"},"PeriodicalIF":3.1,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147729467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Foundation and Multimodal Models for Drug Discovery in Molecular Informatics: Principles, Evaluation, and Practical Guidance.","authors":"Emmanuel Pio Pastore, Francesco De Rango","doi":"10.1002/minf.70027","DOIUrl":"10.1002/minf.70027","url":null,"abstract":"<p><p>Foundation and multimodal models are rapidly becoming a core methodology in molecular informatics, particularly for drug discovery, by leveraging large-scale pretraining across sequences, graphs, 3D structures, and text. This mini-review provides practical guidance on when these models help, how to choose representations and data, and how to design pretraining and adaptation pipelines for real-world use. We clarify what qualifies as a foundation model in chemistry; compare chemical language models, graph-based architectures, and 3D equivariant networks; review multimodal strategies that connect molecules with proteins, pockets, and natural language; and summarize diffusion-based generative modeling. We also emphasize rigorous evaluation, discussing realistic splitting protocols, distribution shift, activity cliffs, uncertainty calibration, and conformal prediction in the context of widely used benchmarks.</p>","PeriodicalId":18853,"journal":{"name":"Molecular Informatics","volume":"45 3","pages":"e70027"},"PeriodicalIF":3.1,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13014059/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147513461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}