{"title":"ReSort enhances reference-based cell type deconvolution for spatial transcriptomics through regional information integration.","authors":"Linhua Wang, Ling Wu, Guantong Qi, Chaozhong Liu, Wanli Wang, Xiang H-F Zhang, Zhandong Liu","doi":"10.1093/bioadv/vbaf091","DOIUrl":"10.1093/bioadv/vbaf091","url":null,"abstract":"<p><strong>Motivation: </strong>Spatial transcriptomics (ST) captures positional gene expression within tissues but lacks single-cell resolution. Reference-based cell type deconvolution methods were developed to understand cell type distributions for ST. However, batch/platform discrepancies between references and ST impact their accuracy.</p><p><strong>Results: </strong>We present Region-based Cell Sorting (ReSort), which utilizes ST's region-level data to lessen reliance on reference data and alleviate these technical issues. In simulation studies, ReSort enhances reference-based deconvolution methods. Applying ReSort to a mouse breast cancer model highlights enrichment of M0 and M2 macrophages in the epithelial clone, revealing insights into epithelial-mesenchymal transition and immune infiltration.</p><p><strong>Availability and implementation: </strong>Source code for ReSort, implemented in Python, is publicly available at https://github.com/LiuzLab/RESORT.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"5 1","pages":"vbaf091"},"PeriodicalIF":2.4,"publicationDate":"2025-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12161990/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144287401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bioinformatics advances. Pub Date: 2025-05-24. eCollection Date: 2025-01-01. DOI: 10.1093/bioadv/vbaf121
Diego Halac, Cecilia Cocucci, Sebastian Camerlingo
{"title":"Predictive machine learning model for 30-day hospital readmissions in a tertiary healthcare setting.","authors":"Diego Halac, Cecilia Cocucci, Sebastian Camerlingo","doi":"10.1093/bioadv/vbaf121","DOIUrl":"10.1093/bioadv/vbaf121","url":null,"abstract":"<p><strong>Motivation: </strong>Hospital readmissions represent a major challenge for healthcare systems due to their impact on patient outcomes and associated costs. As many readmissions are considered preventable, predictive modeling offers a valuable tool for early identification and intervention. This study aimed to develop and validate a predictive model for 30-day readmissions in a 200-bed community hospital in Argentina. A retrospective analysis was conducted on 3388 adult admissions. The primary endpoint was readmission within 30 days of discharge. Predictor variables included demographic and clinical factors such as age, length of stay, hypertension, diabetes, heart failure, coronary artery disease, stroke, cancer, dementia, chronic kidney disease, chronic obstructive pulmonary disease, and bedridden status. Three models, Logistic Regression (LR), Random Forest (RF), and LightGBM (LGBM), were developed, with hyperparameter tuning via Bayesian optimization. Model performance was assessed using calibration, discrimination (C-statistics), and decision curve analysis. Internal validation was performed using 250 bootstrap resamples.</p><p><strong>Results: </strong>The readmission rate was 11% (<i>n</i> = 394). RF outperformed LR and LGBM in discrimination and clinical utility within predictive probability thresholds of 0.05-0.25. Optimism-corrected C-statistics were 0.60 (LR, LGBM) and 0.64 (RF); calibration slopes were 0.75 (LR), 1.13 (RF), and 1.76 (LGBM). 
Machine learning models, particularly RF, can improve readmission risk prediction and inform targeted healthcare interventions.</p><p><strong>Availability and implementation: </strong>The dataset and code used to develop and validate the predictive models are available from the corresponding author upon reasonable request. The implementation was conducted in R using the mlr3verse, pminternal, rms, dcurves, data.table, tidyverse, ranger and lightgbm packages, with Bayesian hyperparameter optimization via mlr3mbo.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"5 1","pages":"vbaf121"},"PeriodicalIF":2.4,"publicationDate":"2025-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12158157/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144276913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"xOmicsShiny: an R Shiny application for cross-omics data analysis and pathway mapping.","authors":"Benbo Gao, Yu H Sun, Xinmin Zhang, Tinchi Lin, Wei Li, Romi Admanit, Baohong Zhang","doi":"10.1093/bioadv/vbaf097","DOIUrl":"10.1093/bioadv/vbaf097","url":null,"abstract":"<p><strong>Summary: </strong>We developed xOmicsShiny, a feature-rich R Shiny-powered application that enables biologists to fully explore omics datasets across experiments and data types, with an emphasis on uncovering biological insights at the pathway level. The data merging feature ensures flexible exploration of cross-omics data, such as transcriptomics, proteomics, metabolomics, and lipidomics. The pathway mapping function covers a broad range of databases, including WikiPathways, Reactome, and KEGG pathways. In addition, xOmicsShiny offers several visualization options and analytical tasks for everyday omics data analysis, namely, PCA, Volcano plot, Venn Diagram, Heatmap, WGCNA, and advanced clustering analyses. The application employs customizable modules to perform various tasks, generating both interactive plots and publication-ready figures. This dynamic, modular design overcomes the issue of slow loading in R Shiny tools and allows it to be readily expanded by the research and developer community.</p><p><strong>Availability and implementation: </strong>The R Shiny application is publicly available at: http://xOmicsShiny.bxgenomics.com. Researchers can upload their own data to the server or use the preloaded demo dataset. The source code, under MIT license, is provided at https://github.com/interactivereport/xOmicsShiny for local installation. 
A full tutorial of the application is available at https://interactivereport.github.io/xOmicsShiny/tutorial/docs/index.html.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"5 1","pages":"vbaf097"},"PeriodicalIF":2.4,"publicationDate":"2025-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12117368/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144176068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bioinformatics advances. Pub Date: 2025-05-24. eCollection Date: 2025-01-01. DOI: 10.1093/bioadv/vbaf125
Zahra Elhamraoui, Eva Borràs, Mathias Wilhelm, Eduard Sabidó
{"title":"MSCI: an open-source Python package for information content assessment of peptide fragmentation spectra.","authors":"Zahra Elhamraoui, Eva Borràs, Mathias Wilhelm, Eduard Sabidó","doi":"10.1093/bioadv/vbaf125","DOIUrl":"10.1093/bioadv/vbaf125","url":null,"abstract":"<p><strong>Motivation: </strong>In mass spectrometry-based proteomics, the availability of peptide prior knowledge has improved our ability to assign fragmentation spectra to specific peptide sequences. However, some peptides exhibit similar analytical values and fragmentation patterns, which makes them nearly indistinguishable with current data analysis tools.</p><p><strong>Results: </strong>Here we developed the Mass Spectrometry Content Information (MSCI) Python package to tackle the challenges of peptide identification in mass spectrometry-based proteomics, particularly regarding indistinguishable peptides. MSCI provides a comprehensive toolset that streamlines the workflow from data import to spectral analysis, enabling researchers to effectively evaluate fragmentation similarity scores among peptide sequences and pinpoint indistinguishable peptide pairs in a given proteome.</p><p><strong>Availability and implementation: </strong>MSCI is implemented in Python and it is released under a permissive MIT license. 
The source code and the installers are available on GitHub at https://github.com/proteomicsunitcrg/MSCI.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"5 1","pages":"vbaf125"},"PeriodicalIF":2.4,"publicationDate":"2025-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12204179/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144531371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bioinformatics advances. Pub Date: 2025-05-23. eCollection Date: 2025-01-01. DOI: 10.1093/bioadv/vbaf122
Fábio Madeira, Joonheung Lee, Nandana Madhusoodanan, Alberto Eusebi, Ania Niewielska, Sarah Butcher
{"title":"jdispatcher-viewers: interactive visualizations of sequence similarity search results and domain predictions.","authors":"Fábio Madeira, Joonheung Lee, Nandana Madhusoodanan, Alberto Eusebi, Ania Niewielska, Sarah Butcher","doi":"10.1093/bioadv/vbaf122","DOIUrl":"10.1093/bioadv/vbaf122","url":null,"abstract":"<p><strong>Motivation: </strong>Biological visualization is an important technique for researchers to make sense of complex biological data. Functional prediction and the discovery of novel proteins remain central objectives in biology, as they provide insights into molecular mechanisms with significant applications in health and disease. Visualizing sequence similarity search results and domain predictions is essential for exploring protein function, identifying conserved elements, and drawing meaningful connections between sequences, ultimately accelerating discovery.</p><p><strong>Results: </strong>The new website for the EMBL-EBI Job Dispatcher bioinformatics tools framework was released in 2023. Along with improvements and new features, the website has since integrated interactive visualizations designed to aid researchers further and enrich the user experience. Here, we describe jdispatcher-viewers, a library for the interactive visualization of sequence similarity search results from BLAST and FASTA, and interactive visualizations of domain predictions and annotations provided by InterPro.</p><p><strong>Availability and implementation: </strong>The jdispatcher-viewers library and documentation, which includes a demo webpage, are available from https://github.com/ebi-jdispatcher/jdispatcher-viewers. Interactive visualizations provided among the result pages of sequence similarity search tools in Job Dispatcher have been implemented using jdispatcher-viewers, and are available at https://www.ebi.ac.uk/jdispatcher/sss. 
The library is distributed under the Apache 2.0 license.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"5 1","pages":"vbaf122"},"PeriodicalIF":2.4,"publicationDate":"2025-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12133269/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144217682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bioinformatics advances. Pub Date: 2025-05-23. eCollection Date: 2025-01-01. DOI: 10.1093/bioadv/vbaf124
Nikita Janakarajan, Mara Graziani, María Rodríguez Martínez
{"title":"Phenotype driven data augmentation methods for transcriptomic data.","authors":"Nikita Janakarajan, Mara Graziani, María Rodríguez Martínez","doi":"10.1093/bioadv/vbaf124","DOIUrl":"10.1093/bioadv/vbaf124","url":null,"abstract":"<p><strong>Summary: </strong>The application of machine learning methods to biomedical applications has seen many successes. However, working with transcriptomic data on supervised learning tasks is challenging due to its high dimensionality, low patient numbers, and class imbalances. Machine learning models tend to overfit these data and do not generalize well on out-of-distribution samples. Data augmentation strategies help alleviate this by introducing synthetic data points and acting as regularizers. However, existing approaches are either computationally intensive, require population parametric estimates, or generate insufficiently diverse samples. To address these challenges, we introduce two classes of phenotype-driven data augmentation approaches: signature-dependent and signature-independent. The signature-dependent methods assume the existence of distinct gene signatures describing some phenotype and are simple, non-parametric, and novel data augmentation methods. The signature-independent methods are a modification of the established Gamma-Poisson and Poisson sampling methods for gene expression data. As case studies, we apply our augmentation methods to transcriptomic data of colorectal and breast cancer. Through discriminative and generative experiments with external validation, we show that our methods improve patient stratification by 5-15% over other augmentation methods in their respective cases. 
The study additionally provides insights into the limited benefits of over-augmenting data.</p><p><strong>Availability and implementation: </strong>Code for reproducibility is available on GitHub.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"5 1","pages":"vbaf124"},"PeriodicalIF":2.4,"publicationDate":"2025-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12141816/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144251090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bioinformatics advances. Pub Date: 2025-05-23. eCollection Date: 2025-01-01. DOI: 10.1093/bioadv/vbaf089
Martin Proks, Jose Alejandro Romero Herrera, Jakub Sedzinski, Joshua M Brickman
{"title":"nf-core/marsseq: systematic preprocessing pipeline for MARS-seq experiments.","authors":"Martin Proks, Jose Alejandro Romero Herrera, Jakub Sedzinski, Joshua M Brickman","doi":"10.1093/bioadv/vbaf089","DOIUrl":"10.1093/bioadv/vbaf089","url":null,"abstract":"<p><strong>Motivation: </strong>Single-cell RNA sequencing (scRNA-seq) enables the study of gene regulation at the single-cell level. Although many scRNA-seq protocols have been established, they vary in technical complexity, sequencing depth, and multimodal capabilities, leading to shared limitations in data interpretation due to a lack of standardized preprocessing and consistent data reproducibility. While plate-based techniques such as Massively Parallel RNA Single-cell Sequencing (MARS-seq2.0) provide reference data on the cells that will be sequenced, the data format limits the possible analysis. Here, we focus on the standardization of MARS-seq analysis and its applicability to RNA velocity.</p><p><strong>Results: </strong>We have taken the original MARS-seq2.0 pipeline and revised it to enable implementation using the nf-core framework. By doing so, we have simplified pipeline execution, enabling a streamlined application with increased transparency and scalability. We have incorporated additional checkpoints to verify experimental metadata and improved the pipeline by implementing a custom workflow for RNA velocity estimation. 
The pipeline is part of the nf-core bioinformatics community and is freely available at https://github.com/nfcore/marsseq with data analysis at https://github.com/brickmanlab/proks-et-al-2023.</p><p><strong>Availability and implementation: </strong>We introduce an updated preprocessing pipeline for MARS-seq experiments following state-of-the-art guidelines for scientific software development with the added ability to infer RNA velocity.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"5 1","pages":"vbaf089"},"PeriodicalIF":2.4,"publicationDate":"2025-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12117365/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144176067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bioinformatics advances. Pub Date: 2025-05-21. eCollection Date: 2025-01-01. DOI: 10.1093/bioadv/vbaf100
Bishoy Wadie, Martijn R Molenaar, Lucas M Vieira, Theodore Alexandrov
{"title":"Enrichment analysis for spatial and single-cell metabolomics accounting for molecular ambiguity.","authors":"Bishoy Wadie, Martijn R Molenaar, Lucas M Vieira, Theodore Alexandrov","doi":"10.1093/bioadv/vbaf100","DOIUrl":"10.1093/bioadv/vbaf100","url":null,"abstract":"<p><strong>Summary: </strong>Imaging mass spectrometry (imaging MS) has advanced spatial and single-cell metabolomics, but its reliance on MS1 data complicates the accurate identification of molecular structures because MS1 data cannot resolve isomeric and isobaric molecules. This prevents application of conventional methods for overrepresentation analysis (ORA) and metabolite set enrichment analysis (MSEA). To address this, we introduce the <i>S2IsoMEr</i> R package and a web app for METASPACE, which use bootstrapping to propagate isomeric/isobaric ambiguities into the enrichment analysis. We demonstrate <i>S2IsoMEr</i> for single-cell metabolomics and the METASPACE web app for spatial metabolomics.</p><p><strong>Availability and implementation: </strong>The METASPACE web app can be used on existing and new datasets submitted to METASPACE (https://metaspace2020.org). The source code for the <i>S2IsoMEr</i> R package is available on GitHub (https://github.com/alexandrovteam/S2IsoMEr).</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"5 1","pages":"vbaf100"},"PeriodicalIF":2.4,"publicationDate":"2025-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12158160/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144276912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bioinformatics advances. Pub Date: 2025-05-20. eCollection Date: 2025-01-01. DOI: 10.1093/bioadv/vbaf118
Angela K Jiang, Jerry Zhao, Xiaofang Jiang
{"title":"EzSEA: an interactive web interface for enzyme sequence evolution analysis.","authors":"Angela K Jiang, Jerry Zhao, Xiaofang Jiang","doi":"10.1093/bioadv/vbaf118","DOIUrl":"10.1093/bioadv/vbaf118","url":null,"abstract":"<p><strong>Motivation: </strong>Enzymes catalyze essential chemical reactions, driving metabolism, immunity, and growth. Understanding their evolution requires identifying mutations that shaped their functions and substrate interactions. Current methods lack integration of evolutionary history and intuitive visualization tools.</p><p><strong>Results: </strong>We develop Enzyme Sequence Evolution Analysis (EzSEA), a web interface that identifies putative functionally important mutations by performing the following steps: structural prediction, homology search, multiple sequence alignment and trimming, phylogenetic tree inference, ancestral sequence reconstruction, and enzyme delineation. The EzSEA web application enables intuitive visualization of results, highlighting key mutations and phylogenetic tree branches that putatively delineate the enzyme of interest. 
Finally, we validate EzSEA by identifying previously experimentally verified key mutations in the gut bacteria enzyme bilirubin reductase.</p><p><strong>Availability and implementation: </strong>EzSEA is freely available on the web at https://jianglabnlm.com/ezsea/.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"5 1","pages":"vbaf118"},"PeriodicalIF":2.4,"publicationDate":"2025-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12179385/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144478033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bioinformatics advances. Pub Date: 2025-05-19. eCollection Date: 2025-01-01. DOI: 10.1093/bioadv/vbaf113
Alexandre Oliveira, Jorge Ferreira, Vítor Vieira, Bruno Sá, Miguel Rocha
{"title":"<i>TROPPO</i>: tissue-specific reconstruction and phenotype prediction using omics data.","authors":"Alexandre Oliveira, Jorge Ferreira, Vítor Vieira, Bruno Sá, Miguel Rocha","doi":"10.1093/bioadv/vbaf113","DOIUrl":"10.1093/bioadv/vbaf113","url":null,"abstract":"<p><strong>Summary: </strong>The increasing availability of high-throughput technologies in systems biology has advanced predictive tools like genome-scale metabolic models. Despite this progress, integrating omics data to create accurate, context-specific metabolic models for different tissues or cells remains challenging. A significant issue is that many existing tools rely on proprietary software, which limits accessibility. We introduce TROPPO, an open-source Python library designed to overcome these challenges. TROPPO supports a wide range of context-specific reconstruction algorithms, provides validation methods for assessing generated models, and includes gap-filling algorithms to ensure model consistency, integrating well with other constraint-based tools.</p><p><strong>Availability and implementation: </strong>TROPPO is implemented in Python and is freely available at https://github.com/BioSystemsUM/TROPPO and https://pypi.org/project/TROPPO/.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"5 1","pages":"vbaf113"},"PeriodicalIF":2.4,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12179386/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144478032","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}