{"title":"NeSyDPP-4: discovering DPP-4 inhibitors for diabetes treatment with a neuro-symbolic AI approach.","authors":"Delower Hossain, Ehsan Saghapour, Jake Y Chen","doi":"10.3389/fbinf.2025.1603133","DOIUrl":"10.3389/fbinf.2025.1603133","url":null,"abstract":"<p><strong>Introduction: </strong>Diabetes Mellitus (DM) constitutes a global epidemic and is one of the top ten leading causes of mortality (WHO, 2019), projected to rank seventh by 2030. The US National Diabetes Statistics Report (2021) states that 38.4 million Americans have diabetes. Dipeptidyl Peptidase-4 (DPP-4) is an FDA-approved target for the treatment of type 2 diabetes mellitus (T2DM). However, current DPP-4 inhibitors may cause adverse effects, including gastrointestinal issues, severe joint pain (FDA safety warning), nasopharyngitis, hypersensitivity, and nausea. Moreover, the development of novel drugs and the <i>in vivo</i> assessment of DPP-4 inhibition are both costly and often impractical. These challenges highlight the urgent need for efficient <i>in-silico</i> approaches to facilitate the discovery and optimization of safer and more effective DPP-4 inhibitors.</p><p><strong>Methodology: </strong>Quantitative Structure-Activity Relationship (QSAR) modeling is a widely used computational approach for evaluating the properties of chemical substances. In this study, we employed a Neuro-symbolic (NeSy) approach, specifically the Logic Tensor Network (LTN), to develop a DPP-4 QSAR model capable of identifying potential small-molecule inhibitors and predicting bioactivity classification. For comparison, we also implemented baseline models using Deep Neural Networks (DNNs) and Transformers. A total of 6,563 bioactivity records (SMILES-based compounds with IC<sub>50</sub> values) were collected from ChEMBL, PubChem, BindingDB, and GTP. 
Feature sets used for model training included descriptors (CDK Extended-PaDEL), fingerprints (Morgan), chemical language model embeddings (ChemBERTa-2), LLaMa 3.2 embedding features, and physicochemical properties.</p><p><strong>Results: </strong>Among all tested configurations, the Neuro-symbolic QSAR model (NeSyDPP-4) performed best using a combination of CDK extended and Morgan fingerprints. The model achieved an accuracy of 0.9725, an F1-score of 0.9723, an ROC AUC of 0.9719, and a Matthews correlation coefficient (MCC) of 0.9446. These results outperformed the baseline DNN and Transformer models, as well as existing state-of-the-art (SOTA) methods. To further validate the robustness of the model, we conducted an external evaluation using the Drug Target Common (DTC) dataset, where NeSyDPP-4 also demonstrated strong performance, with an accuracy of 0.9579, an AUC-ROC of 0.9565, a Matthews Correlation Coefficient (MCC) of 0.9171, and an F1-score of 0.9577.</p><p><strong>Discussion: </strong>These findings suggest that the NeSyDPP-4 model not only delivered high predictive performance but also demonstrated generalizability to external datasets. 
This approach presents a cost-effective and reliable alternative to traditional <i>in vivo</i> screening, offering valuable support for the identification and classification of biologically active DPP-4 inhibitors in the treatment of type 2 diabetes mellitus (T2DM).</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"5 ","pages":"1603133"},"PeriodicalIF":3.9,"publicationDate":"2025-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12319772/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144786090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
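The classification metrics quoted in the abstract above (accuracy, F1-score, MCC) can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard formulas, not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, F1 and MCC from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives
    and false negatives for an active/inactive classifier.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # MCC uses all four cells, so it stays informative on imbalanced data.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


# Hypothetical counts, not taken from the paper.
metrics = binary_metrics(tp=40, fp=10, tn=40, fn=10)
print(metrics)
```

MCC ranges from -1 to 1, whereas accuracy and F1 range from 0 to 1, so an MCC value is not directly comparable in magnitude with the other two.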
Qiu Wang, Hong Yang, Fang Li, Song Ge, Ling Ji, Xiaofeng Li
{"title":"Analysis of histone modifications in key cellular subpopulations in the context of azoospermia using spermatogenic single-cell RNA-seq data.","authors":"Qiu Wang, Hong Yang, Fang Li, Song Ge, Ling Ji, Xiaofeng Li","doi":"10.3389/fbinf.2025.1626153","DOIUrl":"10.3389/fbinf.2025.1626153","url":null,"abstract":"<p><strong>Introduction: </strong>The molecular underpinnings of non-obstructive azoospermia (NOA), a severe form of male infertility characterized by the absence of sperm in the ejaculate, remain unclear.</p><p><strong>Methods: </strong>In this study, we demonstrate the role of histone modifications within specific testicular cell subpopulations in NOA using single-cell RNA sequencing (scRNA-seq) data.</p><p><strong>Results: </strong>Based on scRNA-seq analysis of the data acquired from the Gene Expression Omnibus (GSE149512), we identified nine distinct cell types and revealed significant compositional differences between the NOA and control testicular tissues. In contrast to the high prevalence of spermatogenic cells in the controls, endothelial, testicular interstitial, and vascular smooth muscle cells, as well as macrophages, were enriched in NOA. Furthermore, our analyses revealed considerable enrichment of histone modification-related genes in Leydig cells, peritubular myoid (PTM) cells, and macrophages in the NOA group. HDAC2, a pivotal regulator of histone acetylation, exhibited significant upregulation. Functional pathway analysis implicated these genes in critical biological processes, including nuclear transport, RNA splicing, and autophagy. We quantified the activity of histone modification-related genes using AUCell and identified distinct Leydig cell subpopulations characterized by unique marker genes and functional pathways, underscoring their dual roles in histone modification and spermatogenesis. 
Additionally, cellular communication analysis via CellChat demonstrated altered interaction dynamics across cell types in NOA, particularly in Leydig and PTM cells, which exhibited enhanced interactions alongside differential activation of the WNT and NOTCH signaling pathways.</p><p><strong>Discussion: </strong>These findings suggest that aberrant histone modifications in specific cellular subpopulations may drive disease progression, highlighting potential targets for diagnostic and therapeutic strategies. This study offers novel insights into the molecular mechanisms of NOA and provides a basis for future research on advanced male reproductive health.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"5 ","pages":"1626153"},"PeriodicalIF":3.9,"publicationDate":"2025-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12313672/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144777061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Clara Shionyu-Mitusyama, Satoshi Ohmori, Subaru Hirata, Hirokazu Ishida, Tsuyoshi Shirai
{"title":"IDRdecoder: a machine learning approach for rational drug discovery toward intrinsically disordered regions.","authors":"Clara Shionyu-Mitusyama, Satoshi Ohmori, Subaru Hirata, Hirokazu Ishida, Tsuyoshi Shirai","doi":"10.3389/fbinf.2025.1627836","DOIUrl":"10.3389/fbinf.2025.1627836","url":null,"abstract":"<p><strong>Introduction: </strong>Intrinsically disordered regions (IDRs) of proteins have traditionally been overlooked as drug targets. However, with growing recognition of their crucial role in biological activity and their involvement in various diseases, IDRs have emerged as promising targets for drug discovery. Despite this potential, rational methodologies for IDR-targeted drug discovery remain underdeveloped, primarily due to a lack of reference experimental data.</p><p><strong>Methods: </strong>This study explores a machine learning approach to predict IDR functions, drug interaction sites, and interacting molecular substructures within IDR sequences. To address the data gap, stepwise transfer learning was employed. IDRdecoder sequentially generates predictions for IDR classification, interaction sites, and interacting ligand substructures. In the first step, the neural network was trained as an autoencoder using 26,480,862 predicted IDR sequences. It was then trained, via transfer learning, on 57,692 ligand-binding PDB sequences with a higher IDR tendency to predict ligand-interacting sites and ligand types.</p><p><strong>Results: </strong>IDRdecoder was evaluated against 9 IDR sequences that have been experimentally characterized in detail as drug targets. In the encoding space, specific GO terms related to the hypothesized functions of the evaluation IDR sequences were highly enriched. The model's predictions of drug-interacting sites and ligand types achieved areas under the curve (AUC) of 0.616 and 0.702, respectively. 
The performance was compared with existing methods including ProteinBERT, and IDRdecoder demonstrated moderately improved performance.</p><p><strong>Discussion: </strong>IDRdecoder is the first application for predicting drug interaction sites and ligands in IDR sequences. Analysis of the prediction results revealed characteristics beneficial for IDR-drug design; for instance, Tyr and Ala are preferred target sites, while flexible substructures, such as alkyl groups, are favored in ligand molecules.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"5 ","pages":"1627836"},"PeriodicalIF":3.9,"publicationDate":"2025-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12313641/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144777062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shipra Jain, Ritu Tomer, Sumeet Patiyal, Gajendra P S Raghava
{"title":"NfκBin: a machine learning based method for screening TNF-α induced NF-κB inhibitors.","authors":"Shipra Jain, Ritu Tomer, Sumeet Patiyal, Gajendra P S Raghava","doi":"10.3389/fbinf.2025.1573744","DOIUrl":"10.3389/fbinf.2025.1573744","url":null,"abstract":"<p><strong>Introduction: </strong>Nuclear Factor kappa B (NF-κB) is a transcription factor whose upregulation is associated with chronic inflammatory diseases, including rheumatoid arthritis, inflammatory bowel disease, and asthma. In order to develop therapeutic strategies targeting NF-κB-related diseases, we developed a computational approach to predict drugs capable of inhibiting TNF-α induced NF-κB signaling pathways.</p><p><strong>Method: </strong>We utilized a dataset comprising 1,149 inhibitors and 1,332 non-inhibitors retrieved from PubChem. Chemical descriptors were computed using the PaDEL software, and relevant features were selected using advanced feature selection techniques.</p><p><strong>Result: </strong>Initially, machine learning models were constructed using 2D descriptors, 3D descriptors, and molecular fingerprints, achieving maximum AUC values of 0.66, 0.56, and 0.66, respectively. To improve feature selection, we applied univariate analysis and SVC-L1 regularization to identify features that can effectively differentiate inhibitors from non-inhibitors. Using these selected features, we developed machine learning models; our support vector classifier achieved the highest AUC, 0.75, on the validation dataset.</p><p><strong>Discussion: </strong>Finally, this best-performing model was employed to screen FDA-approved drugs for potential NF-κB inhibitors. Notably, most of the predicted inhibitors corresponded to drugs previously identified as inhibitors in experimental studies, underscoring the model's predictive reliability. Our best-performing models have been integrated into a standalone software and web server, NfκBin. 
(https://webs.iiitd.edu.in/raghava/nfkbin/).</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"5 ","pages":"1573744"},"PeriodicalIF":3.9,"publicationDate":"2025-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12310657/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144762552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yesid Aristizabal, Yamil Liscano, José Oñate-Garzón
{"title":"Understanding the selectivity <i>in silico</i> of colistin and daptomycin toward gram-negative and gram-positive bacteria, respectively, from the interaction with membrane phospholipids.","authors":"Yesid Aristizabal, Yamil Liscano, José Oñate-Garzón","doi":"10.3389/fbinf.2025.1569480","DOIUrl":"10.3389/fbinf.2025.1569480","url":null,"abstract":"<p><p>Antimicrobial resistance is a significant public health concern worldwide. Currently, infections by antibiotic-resistant Gram-negative and Gram-positive bacteria are managed using the lipopeptide antibiotics colistin and daptomycin, which target the microbial membrane. Although both are short, cyclic, and share a common acylated group, they display remarkable antimicrobial selectivity. Colistin is active only against Gram-negative bacteria, while daptomycin is active only against Gram-positive bacteria. However, the mechanism behind this selectivity is unclear. Here, we performed molecular dynamics simulations to study the interactions between <i>Escherichia coli</i> membrane models composed of 1-Palmitoyl-2-Oleoyl-sn-Glycero-3-Phosphoethanolamine (POPE)/1-Palmitoyl-2-Oleoyl-sn-Glycero-3-Phosphoglycerol (POPG) with daptomycin and colistin, independently. Similarly, we simulated the interaction between the <i>Staphylococcus aureus</i> model membrane composed of POPG and cardiolipin (PMCL1) with both antibiotics. We observed that colistin interacted via hydrogen bonds and electrostatic interactions with the polar head of POPE in <i>E. coli</i> membrane models, mediated by 2,4-diaminobutyric acid (DAB) residues, which facilitated the insertion of its acyl tail into the hydrophobic core of the bilayer. In <i>S. aureus</i> membrane models, weaker interactions were observed with the polar head, particularly POPG, which was insufficient for the insertion of the lipid tail into the membrane. However, daptomycin displayed strong interactions with several POPG functional groups of the <i>S. 
aureus</i> membrane model, which favored the insertion of the fatty acid tail into the bilayer. Contrastingly, daptomycin showed negligible interactions with the <i>E. coli</i> membrane, except for the amino group of the POPE polar head, which might repel the calcium ions conjugated with the lipopeptide. Based on these results, we identified key amino acid-phospholipid interactions that likely contribute to this antibacterial selectivity, which might contribute to designing and developing future antimicrobial peptides.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"5 ","pages":"1569480"},"PeriodicalIF":3.9,"publicationDate":"2025-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12310579/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144762553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chelsy Chesterman, Thomas Desautels, Luz-Jeannette Sierra, Kathryn T Arrildt, Adam Zemla, Edmond Y Lau, Shivshankar Sundaram, Jason Laliberte, Lynn Chen, Aaron Ruby, Mark Mednikov, Sylvie Bertholet, Dong Yu, Kate Luisi, Enrico Malito, Corey P Mallett, Matthew J Bottomley, Robert A van den Berg, Daniel Faissol
{"title":"Design of cross-reactive antigens with machine learning and high-throughput experimental evaluation.","authors":"Chelsy Chesterman, Thomas Desautels, Luz-Jeannette Sierra, Kathryn T Arrildt, Adam Zemla, Edmond Y Lau, Shivshankar Sundaram, Jason Laliberte, Lynn Chen, Aaron Ruby, Mark Mednikov, Sylvie Bertholet, Dong Yu, Kate Luisi, Enrico Malito, Corey P Mallett, Matthew J Bottomley, Robert A van den Berg, Daniel Faissol","doi":"10.3389/fbinf.2025.1580967","DOIUrl":"10.3389/fbinf.2025.1580967","url":null,"abstract":"<p><p>Selecting an optimal antigen is a crucial step in vaccine development, significantly influencing both the vaccine's effectiveness and the breadth of protection it provides. High antigen sequence variability, as seen in pathogens like rhinovirus, HIV, and influenza virus, complicates the design of a single cross-protective antigen. Consequently, vaccination with a single antigen molecule often confers protection against only a single variant. In this study, machine learning methods were applied to the design of factor H binding protein (fHbp), an antigen from the bacterial pathogen <i>Neisseria meningitidis</i>. The vast number of potential antigen mutants presents a significant challenge for improving fHbp antigenicity. Moreover, limited data on antigen-antibody binding in public databases constrains the training of machine learning models. To address these challenges, we used computational models to predict fHbp properties and applied machine learning, using a Gaussian process (GP) model, to select both the most promising and the most informative mutants. These mutants were experimentally evaluated to both confirm promising leads and refine the machine learning model for future iterations. In our current model, mutants were designed that enabled the transfer of fHbp v1.1 specific conformational epitopes onto fHbp v3.28, while maintaining binding to overlapping cross-reactive epitopes. 
The top mutant identified underwent biophysical and x-ray crystallographic characterization to confirm that the overall structure of fHbp was maintained throughout this epitope engineering experiment. The integrated strategy presented here could form the basis of a next-generation, iterative antigen design platform, potentially accelerating the development of new broadly protective vaccines.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"5 ","pages":"1580967"},"PeriodicalIF":3.9,"publicationDate":"2025-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12319226/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144786089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Franz Leonard Böge, Helena U Zacharias, Stefanie C Becker, Klaus Jung
{"title":"Using deep neural networks and LASSO regression to predict miRNA expression changes based on mRNA data.","authors":"Franz Leonard Böge, Helena U Zacharias, Stefanie C Becker, Klaus Jung","doi":"10.3389/fbinf.2025.1566162","DOIUrl":"10.3389/fbinf.2025.1566162","url":null,"abstract":"<p><strong>Introduction: </strong>Since the rise of molecular high-throughput technologies, many diseases are now studied on multiple omics layers in parallel. Understanding the interplay between microRNAs (miRNA) and their target mRNAs is important to understand the molecular level of diseases. While much public data from mRNA experiments are available for many diseases, few paired datasets with both miRNA and mRNA expression profiles are available. This study aimed to assess the possibility of predicting miRNA expression data based on mRNA expression data, serving as a proof of principle that such cross-omics predictions are feasible. Furthermore, current research relies on target databases where information about miRNA-target relationships is provided based on experimental and computational studies.</p><p><strong>Methods: </strong>To make use of publicly available mRNA profiles, we investigate the ability of artificial deep neural networks and linear least absolute shrinkage and selection operator (LASSO) regression to predict unknown miRNA expression profiles. We evaluate the approach using seven paired miRNA/mRNA expression datasets, four from studies on West Nile virus infection in mouse tissues and three from human immunodeficiency virus (HIV) infection in human tissues. We assessed the performance of each model first by within-data evaluations and second by cross-study evaluations. Furthermore, we investigated whether data augmentation or separate models for data from diseased and non-diseased samples can improve the prediction performance.</p><p><strong>Results: </strong>In general, most settings achieved strong correlations at the level of individual samples. 
In some datasets and settings, correlations of log-fold changes and p-values from differential expression analysis (DEA) between true and predicted miRNA profiles can be observed. Correlation between log-fold changes could also be seen in a cross-study evaluation for the HIV datasets. Data augmentation consistently improved performance in neural networks, while its impact on LASSO models was not significant.</p><p><strong>Discussion: </strong>Overall, cross-omics prediction of expression profiles appears possible, even with some correlations at the level of the differential expression analysis.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"5 ","pages":"1566162"},"PeriodicalIF":2.8,"publicationDate":"2025-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12279838/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144692638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
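The comparison of log-fold changes between true and predicted miRNA profiles described in the abstract above can be illustrated with a small sketch. It is not the authors' pipeline: it assumes expression values are already on a log2 scale (so a log-fold change is a difference of group means), uses a plain Pearson correlation, and all function names and numbers are hypothetical.

```python
import math


def column_means(samples):
    """Mean expression per miRNA; `samples` is a list of per-sample lists."""
    return [sum(col) / len(col) for col in zip(*samples)]


def log2_fold_changes(diseased, control):
    """Per-miRNA log2 fold change, assuming log2-scale input values."""
    return [
        d - c for d, c in zip(column_means(diseased), column_means(control))
    ]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical toy data: 2 samples per group, 3 miRNAs (columns).
true_diseased = [[5.0, 3.0, 8.0], [7.0, 3.0, 6.0]]
true_control = [[4.0, 4.0, 4.0], [4.0, 4.0, 4.0]]
pred_diseased = [[5.5, 3.5, 7.0], [6.0, 3.0, 6.5]]
pred_control = [[4.0, 4.5, 4.0], [4.5, 4.0, 4.0]]

lfc_true = log2_fold_changes(true_diseased, true_control)
lfc_pred = log2_fold_changes(pred_diseased, pred_control)
print(pearson(lfc_true, lfc_pred))
```

A correlation near 1 would indicate that the predicted profiles preserve the direction and relative size of the expression changes, even if absolute expression levels differ.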
Bing He, Teng Xu, Shaowei Xu, Huqiang Fang, Qingshan Yang
{"title":"Comparative transcriptome analysis of different tissues of <i>Hylomecon japonica</i> provides new insights into the biosynthesis pathway of triterpenoid saponins.","authors":"Bing He, Teng Xu, Shaowei Xu, Huqiang Fang, Qingshan Yang","doi":"10.3389/fbinf.2025.1625145","DOIUrl":"10.3389/fbinf.2025.1625145","url":null,"abstract":"<p><p>Triterpenoid saponins are among the main active constituents of the roots and rhizomes of <i>Hylomecon japonica</i>, with various pharmacological activities such as antibacterial, anticancer, and anti-inflammatory effects. To elucidate the biosynthesis pathway of triterpenoid saponins in <i>H. japonica</i>, DNA nanoball sequencing technology was used to analyze the transcriptome of leaves, roots, and stems of <i>H. japonica</i>. Out of a total of 99,404 unigenes, 78,989 unigenes were annotated by seven major databases; 49 unigenes encoded 11 key enzymes in the biosynthesis pathway of triterpenoid saponins. Nine transcription factors were found to be involved in the metabolism of terpenoids and polyketides in <i>H. japonica</i>, and a spatial structure model of squalene synthase in triterpenoid saponin biosynthesis was established. This study greatly enriched the transcriptome data of <i>H. 
japonica</i>, which is helpful for further analysis of the functions and regulatory mechanisms of key enzymes in the biosynthesis pathway of triterpenoid saponins.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"5 ","pages":"1625145"},"PeriodicalIF":2.8,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12277290/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144683728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ryan M Tobin, Shikha Singh, Sudhir Kumar, Sayaka Miura
{"title":"GenoPath: a pipeline to infer tumor clone composition, mutational history, and metastatic cell migration events from tumor DNA sequencing data.","authors":"Ryan M Tobin, Shikha Singh, Sudhir Kumar, Sayaka Miura","doi":"10.3389/fbinf.2025.1615834","DOIUrl":"10.3389/fbinf.2025.1615834","url":null,"abstract":"<p><p>DNA sequencing technologies are widely used to study tumor evolution within a cancer patient. However, analyses require various computational methods, including those to infer clone sequences (genotypes of cancer cell populations), clone frequencies within each tumor sample, clone phylogeny, mutational tree, dynamics of mutational signatures, and metastatic cell migration events. Therefore, we developed GenoPath, a streamlined pipeline of existing tools to perform tumor evolution analysis. We also developed and added tools to visualize results to assist interpretation and derive biological insights. We have illustrated GenoPath's utility through a case study of tumor evolution using metastatic prostate cancer data. By reducing computational barriers, GenoPath broadens access to tumor evolution analysis. The software is available at https://github.com/SayakaMiura/GP.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"5 ","pages":"1615834"},"PeriodicalIF":2.8,"publicationDate":"2025-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12263698/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144651403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sandro Gepiro Contaldo, Antonio d'Acierno, Lorenzo Bosio, Francesco Venice, Elisa Li Perottino, Janneth Estefania Hoyos Rea, Giovanna Cristina Varese, Francesca Cordero, Marco Beccuti
{"title":"Long-read microbial genome assembly, gene prediction and functional annotation: a service of the MIRRI ERIC Italian node.","authors":"Sandro Gepiro Contaldo, Antonio d'Acierno, Lorenzo Bosio, Francesco Venice, Elisa Li Perottino, Janneth Estefania Hoyos Rea, Giovanna Cristina Varese, Francesca Cordero, Marco Beccuti","doi":"10.3389/fbinf.2025.1632189","DOIUrl":"10.3389/fbinf.2025.1632189","url":null,"abstract":"<p><strong>Background: </strong>Understanding the structure and function of microbial genomes is crucial for uncovering their ecological roles, evolutionary trajectories, and potential applications in health, biotechnology, agriculture, food production, and environmental science. However, genome reconstruction and annotation remain computationally demanding and technically complex.</p><p><strong>Results: </strong>We introduce a bioinformatics platform designed explicitly for long-read microbial sequencing data to address these challenges. Developed as a service of the Italian MIRRI ERIC node, the platform provides a comprehensive solution for analyzing both prokaryotic and eukaryotic genomes, from assembly to functional protein annotation. It integrates state-of-the-art tools (e.g., Canu, Flye, BRAKER3, Prokka, InterProScan) within a reproducible, scalable workflow built on the Common Workflow Language and accelerated through high-performance computing infrastructure. 
A user-friendly web interface ensures accessibility, even for non-specialists.</p><p><strong>Conclusion: </strong>Through case studies involving three environmentally and clinically significant microorganisms, we demonstrate the ability of the platform to produce reliable, biologically meaningful insights, positioning it as a valuable tool for routine genome analysis and advanced microbial research.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"5 ","pages":"1632189"},"PeriodicalIF":2.8,"publicationDate":"2025-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12256462/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144638857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}