Bioinformatics advances | Pub Date: 2024-06-21 | eCollection Date: 2024-01-01 | DOI: 10.1093/bioadv/vbae095
Kiran Deol, Griffin M Weber, Yun William Yu
{"title":"SlowMoMan: a web app for discovery of important features along user-drawn trajectories in 2D embeddings.","authors":"Kiran Deol, Griffin M Weber, Yun William Yu","doi":"10.1093/bioadv/vbae095","DOIUrl":"10.1093/bioadv/vbae095","url":null,"abstract":"<p><strong>Motivation: </strong>Nonlinear low-dimensional embeddings allow humans to visualize high-dimensional data, as is often seen in bioinformatics, where datasets may have tens of thousands of dimensions. However, relating the axes of a nonlinear embedding to the original dimensions is a nontrivial problem. In particular, humans may identify patterns or interesting subsections in the embedding, but cannot easily identify what those patterns correspond to in the original data.</p><p><strong>Results: </strong>Thus, we present SlowMoMan (SLOW Motions on MANifolds), a web application which allows the user to draw a one-dimensional path onto a 2D embedding. Then, by back-projecting the manifold to the original, high-dimensional space, we sort the original features such that those most discriminative along the manifold are ranked highly. 
We show a number of pertinent use cases for our tool, including trajectory inference, spatial transcriptomics, and automatic cell classification.</p><p><strong>Availability and implementation: </strong>Software: https://yunwilliamyu.github.io/SlowMoMan/; Code: https://github.com/yunwilliamyu/SlowMoMan.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"4 1","pages":"vbae095"},"PeriodicalIF":2.4,"publicationDate":"2024-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11220466/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141499797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
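The back-projection idea in the SlowMoMan abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each vertex of the drawn path is mapped to its nearest embedded sample and that features are scored by their variance along the resulting sequence; the published tool may use a different scoring metric. All function and variable names here are hypothetical.

```python
import math

def nearest_index(point, embedding):
    """Return the index of the embedded sample closest to a 2D point."""
    return min(range(len(embedding)), key=lambda i: math.dist(point, embedding[i]))

def rank_features_along_path(path, embedding, data, feature_names):
    """Rank original features by how much they vary along a drawn path.

    Assumption: variance along the path stands in for "discriminative";
    this is an illustrative score, not the one used by SlowMoMan.
    """
    # Back-project: replace each path vertex with the original
    # high-dimensional row of its nearest embedded sample.
    rows = [data[nearest_index(p, embedding)] for p in path]
    scores = {}
    for j, name in enumerate(feature_names):
        values = [row[j] for row in rows]
        mean = sum(values) / len(values)
        scores[name] = sum((v - mean) ** 2 for v in values) / len(values)
    # Highest-variance features first.
    return sorted(feature_names, key=lambda n: scores[n], reverse=True)

# Toy data: four samples on a line in the 2D embedding, three original features.
embedding = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
data = [[0.0, 5.0, 1.0], [1.0, 5.0, 1.1], [2.0, 5.0, 0.9], [3.0, 5.0, 1.0]]
path = [(0.1, 0.1), (1.2, -0.1), (2.9, 0.2)]  # hypothetical user-drawn path
ranking = rank_features_along_path(path, embedding, data, ["geneA", "geneB", "geneC"])
print(ranking)
```

In this toy case the feature that changes steadily along the path ranks first and the constant feature ranks last.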
Bioinformatics advances | Pub Date: 2024-06-20 | eCollection Date: 2024-01-01 | DOI: 10.1093/bioadv/vbae077
Serhan Yılmaz, Filipa Blasco Tavares Pereira Lopes, Daniela Schlatzer, Marzieh Ayati, Mark R Chance, Mehmet Koyutürk
{"title":"Making proteomics accessible: RokaiXplorer for interactive analysis of phospho-proteomic data.","authors":"Serhan Yılmaz, Filipa Blasco Tavares Pereira Lopes, Daniela Schlatzer, Marzieh Ayati, Mark R Chance, Mehmet Koyutürk","doi":"10.1093/bioadv/vbae077","DOIUrl":"10.1093/bioadv/vbae077","url":null,"abstract":"<p><strong>Summary: </strong>We present RokaiXplorer, an intuitive web tool designed to address the scarcity of user-friendly solutions for proteomics and phospho-proteomics data analysis and visualization. RokaiXplorer streamlines data processing, analysis, and visualization through an interactive online interface, making it accessible to researchers without specialized training in proteomics or data science. With its comprehensive suite of modules, RokaiXplorer facilitates phospho-proteomic analysis at the level of phosphosites, proteins, kinases, biological processes, and pathways. The tool offers functionalities such as data normalization, statistical testing, activity inference, pathway enrichment, subgroup analysis, automated report generation, and multiple visualizations, including volcano plots, bar plots, heat maps, and network views. As a unique feature, RokaiXplorer allows researchers to effortlessly deploy their own data browsers, enabling interactive sharing of research data and findings. 
Overall, RokaiXplorer fills an important gap in phospho-proteomic data analysis by providing the ability to comprehensively analyze data at multiple levels within a single application.</p><p><strong>Availability and implementation: </strong>Access RokaiXplorer at: http://explorer.rokai.io.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"4 1","pages":"vbae077"},"PeriodicalIF":2.4,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11415779/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142302317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bioinformatics advances | Pub Date: 2024-06-19 | eCollection Date: 2024-01-01 | DOI: 10.1093/bioadv/vbae082
Matthew Macaulay, Mathieu Fourment
{"title":"Differentiable phylogenetics <i>via</i> hyperbolic embeddings with Dodonaphy.","authors":"Matthew Macaulay, Mathieu Fourment","doi":"10.1093/bioadv/vbae082","DOIUrl":"10.1093/bioadv/vbae082","url":null,"abstract":"<p><strong>Motivation: </strong>Navigating the high-dimensional space of discrete trees for phylogenetics presents a challenging problem for tree optimization. To address this, hyperbolic embeddings of trees offer a promising approach to encoding trees efficiently in continuous spaces. However, they require a differentiable tree decoder to optimize the phylogenetic likelihood. We present soft-NJ, a differentiable version of neighbour joining that enables gradient-based optimization over the space of trees.</p><p><strong>Results: </strong>We illustrate the potential for differentiable optimization over tree space for maximum likelihood inference. We then perform variational Bayesian phylogenetics by optimizing embedding distributions in hyperbolic space. We compare the performance of this approximation technique on eight benchmark datasets to that of state-of-the-art methods. Results indicate that, while this technique is not immune to local optima, it opens a plethora of powerful and parametrically efficient approaches to phylogenetics <i>via</i> tree embeddings.</p><p><strong>Availability and implementation: </strong>Dodonaphy is freely available on the web at https://www.github.com/mattapow/dodonaphy. 
It includes an implementation of soft-NJ.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"4 1","pages":"vbae082"},"PeriodicalIF":2.4,"publicationDate":"2024-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11310108/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141918223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
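To make the notion of a differentiable neighbour-joining step more concrete, here is a minimal sketch. It is not Dodonaphy's soft-NJ code. It computes the standard neighbour-joining Q criterion and then, as an assumption about how the hard argmin might be relaxed, replaces the choice of the best pair with a temperature-controlled softmin over all pairs. The function names and the `temperature` parameter are hypothetical.

```python
import math

def nj_q_matrix(d):
    """Standard neighbour-joining criterion Q(i, j) for a distance matrix d."""
    n = len(d)
    row_sums = [sum(row) for row in d]
    return {(i, j): (n - 2) * d[i][j] - row_sums[i] - row_sums[j]
            for i in range(n) for j in range(i + 1, n)}

def soft_argmin(q, temperature=1.0):
    """Softmin weights over candidate pairs (assumed relaxation of argmin).

    As temperature approaches 0 the weights concentrate on the minimum-Q
    pair(s), recovering the hard choice made by classical neighbour joining.
    """
    lo = min(q.values())  # subtract the minimum for numerical stability
    weights = {pair: math.exp(-(v - lo) / temperature) for pair, v in q.items()}
    total = sum(weights.values())
    return {pair: w / total for pair, w in weights.items()}

# Additive distances for the tree ((0,1),(2,3)): leaf branches 1, internal edge 2.
d = [
    [0.0, 2.0, 4.0, 4.0],
    [2.0, 0.0, 4.0, 4.0],
    [4.0, 4.0, 0.0, 2.0],
    [4.0, 4.0, 2.0, 0.0],
]
probs = soft_argmin(nj_q_matrix(d), temperature=1.0)
print(probs)
```

Because the weights are smooth functions of the distances, gradients can flow through the pair selection, which is the property a gradient-based tree optimizer needs.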
Bioinformatics advances | Pub Date: 2024-06-18 | eCollection Date: 2024-01-01 | DOI: 10.1093/bioadv/vbae092
Anna Kennedy, Ella Richardson, Jonathan Higham, Panagiotis Kotsantis, Richard Mort, Barbara Bo-Ju Shih
{"title":"Evergene: an interactive webtool for large-scale gene-centric analysis of primary tumours.","authors":"Anna Kennedy, Ella Richardson, Jonathan Higham, Panagiotis Kotsantis, Richard Mort, Barbara Bo-Ju Shih","doi":"10.1093/bioadv/vbae092","DOIUrl":"10.1093/bioadv/vbae092","url":null,"abstract":"<p><strong>Motivation: </strong>The data sharing of large comprehensive cancer research projects, such as The Cancer Genome Atlas (TCGA), has improved the availability of high-quality data to research labs around the world. However, due to the volume and inherent complexity of high-throughput omics data, analysis of these data is limited by the capacity to perform data processing in programming languages such as R or Python. Existing webtools lack functionality that supports large-scale analysis; typically, users can only input one gene, or a gene list condensed into a gene set, instead of individual gene-level analysis. Furthermore, analysis results are usually displayed without other sample-level molecular or clinical annotations. To address these gaps in the existing webtools, we have developed Evergene using R and Shiny.</p><p><strong>Results: </strong>Evergene is a user-friendly webtool that utilizes RNA-sequencing data, alongside other sample and clinical annotations, for large-scale gene-centric analysis, including principal component analysis (PCA), survival analysis (SA), and correlation analysis (CA). Moreover, Evergene enables in-depth analysis of cancer transcriptomic data, which can be explored through dimensionality reduction methods, relating gene expression with clinical events or other sample information, such as ethnicity, histological classification, and molecular indices. Lastly, users can upload custom data to Evergene for analysis.</p><p><strong>Availability and implementation: </strong>Evergene webtool is available at https://bshihlab.shinyapps.io/evergene/. 
The source code and example user input dataset are available at https://github.com/bshihlab/evergene.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"4 1","pages":"vbae092"},"PeriodicalIF":2.4,"publicationDate":"2024-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11213629/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141473192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bioinformatics advances | Pub Date: 2024-06-17 | eCollection Date: 2024-01-01 | DOI: 10.1093/bioadv/vbae089
Priyanka Banerjee, Oliver Eulenstein, Iddo Friedberg
{"title":"Discovering genomic islands in unannotated bacterial genomes using sequence embedding.","authors":"Priyanka Banerjee, Oliver Eulenstein, Iddo Friedberg","doi":"10.1093/bioadv/vbae089","DOIUrl":"10.1093/bioadv/vbae089","url":null,"abstract":"<p><strong>Motivation: </strong>Genomic islands (GEIs) are clusters of genes in bacterial genomes that are typically acquired by horizontal gene transfer. GEIs play a crucial role in the evolution of bacteria by rapidly introducing genetic diversity and thus helping them adapt to changing environments. Specifically of interest to human health, many GEIs contain pathogenicity and antimicrobial resistance genes. Detecting GEIs is, therefore, an important problem in biomedical and environmental research. There have been many previous studies for computationally identifying GEIs. Still, most of these studies rely on detecting anomalies in the unannotated nucleotide sequences or on a fixed set of known features on annotated nucleotide sequences.</p><p><strong>Results: </strong>Here, we present TreasureIsland, which uses a new unsupervised representation of DNA sequences to predict GEIs. We developed a high-precision boundary detection method featuring an incremental fine-tuning of GEI borders, and we evaluated the accuracy of this framework using a new comprehensive reference dataset, Benbow. 
We show that TreasureIsland's accuracy rivals other GEI predictors, enabling efficient and faster identification of GEIs in unannotated bacterial genomes.</p><p><strong>Availability and implementation: </strong>TreasureIsland is available under an MIT license at: https://github.com/FriedbergLab/GenomicIslandPrediction.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"4 1","pages":"vbae089"},"PeriodicalIF":2.4,"publicationDate":"2024-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11193100/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141443854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bioinformatics advances | Pub Date: 2024-06-14 | eCollection Date: 2024-01-01 | DOI: 10.1093/bioadv/vbae088
Genevieve R Krause, Walt Shands, Travis J Wheeler
{"title":"Sensitive and error-tolerant annotation of protein-coding DNA with BATH.","authors":"Genevieve R Krause, Walt Shands, Travis J Wheeler","doi":"10.1093/bioadv/vbae088","DOIUrl":"10.1093/bioadv/vbae088","url":null,"abstract":"<p><strong>Summary: </strong>We present BATH, a tool for highly sensitive annotation of protein-coding DNA based on direct alignment of that DNA to a database of protein sequences or profile hidden Markov models (pHMMs). BATH is built on top of the HMMER3 code base, and simplifies the annotation workflow for pHMM-based translated sequence annotation by providing a straightforward input interface and easy-to-interpret output. BATH also introduces novel frameshift-aware algorithms to detect frameshift-inducing nucleotide insertions and deletions (indels). BATH matches the accuracy of HMMER3 for annotation of sequences containing no errors, and produces superior accuracy to all tested tools for annotation of sequences containing nucleotide indels. These results suggest that BATH should be used when high annotation sensitivity is required, particularly when frameshift errors are expected to interrupt protein-coding regions, as is true with long-read sequencing data and in the context of pseudogenes.</p><p><strong>Availability and implementation: </strong>The software is available at https://github.com/TravisWheelerLab/BATH.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"4 1","pages":"vbae088"},"PeriodicalIF":2.4,"publicationDate":"2024-06-14","publicationTypes":"Journal 
Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11223822/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141536125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OM2Seq: learning retrieval embeddings for optical genome mapping.","authors":"Yevgeni Nogin, Danielle Sapir, Tahir Detinis Zur, Nir Weinberger, Yonatan Belinkov, Yuval Ebenstein, Yoav Shechtman","doi":"10.1093/bioadv/vbae079","DOIUrl":"10.1093/bioadv/vbae079","url":null,"abstract":"<p><strong>Motivation: </strong>Genomics-based diagnostic methods that are quick, precise, and economical are essential for the advancement of precision medicine, with applications spanning the diagnosis of infectious diseases, cancer, and rare diseases. One technology that holds potential in this field is optical genome mapping (OGM), which is capable of structural variation detection, epigenomic profiling, and microbial species identification. It is based on imaging linearized DNA molecules that are stained with fluorescent labels and then aligned to a reference genome. However, the computational methods currently available for OGM fall short in terms of accuracy and computational speed.</p><p><strong>Results: </strong>This work introduces OM2Seq, a new approach for the rapid and accurate mapping of DNA fragment images to a reference genome. Based on a Transformer-encoder architecture, OM2Seq is trained on acquired OGM data to efficiently encode DNA fragment images and reference genome segments to a common embedding space, which can be indexed and efficiently queried using a vector database. 
We show that OM2Seq significantly outperforms the baseline methods in both computational speed (by 2 orders of magnitude) and accuracy.</p><p><strong>Availability and implementation: </strong>https://github.com/yevgenin/om2seq.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"4 1","pages":"vbae079"},"PeriodicalIF":2.4,"publicationDate":"2024-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11194751/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141447602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
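The retrieval step described in the OM2Seq abstract, where fragment and reference embeddings share a space that can be indexed and queried, can be sketched as follows. This is not the OM2Seq code. It assumes the embeddings already exist and uses a brute-force cosine-similarity search as a stand-in for a vector database. The genomic offsets and vectors below are invented for illustration.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def top_k(query, index, k=1):
    """Return the k (similarity, reference offset) pairs closest to the query.

    `index` maps a reference-genome offset to the embedding of the segment
    starting there. A real system would replace this linear scan with an
    approximate nearest-neighbour index.
    """
    q = normalize(query)
    scored = [(sum(a * b for a, b in zip(q, normalize(vec))), offset)
              for offset, vec in index.items()]
    return sorted(scored, reverse=True)[:k]

# Hypothetical embeddings of three reference segments, keyed by start offset.
index = {
    0: [1.0, 0.0, 0.0],
    50000: [0.0, 1.0, 0.0],
    100000: [0.7, 0.7, 0.1],
}
query = [0.1, 0.9, 0.0]  # hypothetical embedding of one DNA fragment image
best = top_k(query, index, k=2)
print(best)
```

The query is mapped to the segment whose embedding points in the most similar direction, which is how an embedding-based mapper can avoid aligning the fragment against every reference position.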
Bioinformatics advances | Pub Date: 2024-05-30 | eCollection Date: 2024-01-01 | DOI: 10.1093/bioadv/vbae081
Corentin Thuilliez, Gaël Moquin-Beaudry, Pierre Khneisser, Maria Eugenia Marques Da Costa, Slim Karkar, Hanane Boudhouche, Damien Drubay, Baptiste Audinot, Birgit Geoerger, Jean-Yves Scoazec, Nathalie Gaspar, Antonin Marchais
{"title":"CellsFromSpace: a fast, accurate, and reference-free tool to deconvolve and annotate spatially distributed omics data.","authors":"Corentin Thuilliez, Gaël Moquin-Beaudry, Pierre Khneisser, Maria Eugenia Marques Da Costa, Slim Karkar, Hanane Boudhouche, Damien Drubay, Baptiste Audinot, Birgit Geoerger, Jean-Yves Scoazec, Nathalie Gaspar, Antonin Marchais","doi":"10.1093/bioadv/vbae081","DOIUrl":"10.1093/bioadv/vbae081","url":null,"abstract":"<p><strong>Motivation: </strong>Spatial transcriptomics enables the analysis of cell crosstalk in healthy and diseased organs by capturing the transcriptomic profiles of millions of cells within their spatial contexts. However, spatial transcriptomics approaches also raise new computational challenges for the multidimensional data analysis associated with spatial coordinates.</p><p><strong>Results: </strong>In this context, we introduce a novel analytical framework called CellsFromSpace based on independent component analysis (ICA), which allows users to analyze various commercially available technologies without relying on a single-cell reference dataset. The ICA approach deployed in CellsFromSpace decomposes spatial transcriptomics data into interpretable components associated with distinct cell types or activities. ICA also enables noise or artifact reduction and subset analysis of cell types of interest through component selection. We demonstrate the flexibility and performance of CellsFromSpace using real-world samples to demonstrate ICA's ability to successfully identify spatially distributed cells as well as rare diffuse cells, and quantitatively deconvolute datasets from the Visium, Slide-seq, MERSCOPE, and CosMX technologies. Comparative analysis with a current alternative reference-free deconvolution tool also highlights CellsFromSpace's speed, scalability and accuracy in processing complex, even multisample datasets. 
CellsFromSpace also offers a user-friendly graphical interface enabling non-bioinformaticians to annotate and interpret components based on spatial distribution and contributor genes, and perform full downstream analysis.</p><p><strong>Availability and implementation: </strong>CellsFromSpace (CFS) is distributed as an R package available from github at https://github.com/gustaveroussy/CFS along with tutorials, examples, and detailed documentation.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"4 1","pages":"vbae081"},"PeriodicalIF":2.4,"publicationDate":"2024-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11194756/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141447601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bioinformatics advances | Pub Date: 2024-05-29 | eCollection Date: 2024-01-01 | DOI: 10.1093/bioadv/vbae080
Franziska Lang, Patrick Sorn, Martin Suchan, Alina Henrich, Christian Albrecht, Nina Köhl, Aline Beicht, Pablo Riesgo-Ferreiro, Christoph Holtsträter, Barbara Schrörs, David Weber, Martin Löwer, Ugur Sahin, Jonas Ibn-Salem
{"title":"Prediction of tumor-specific splicing from somatic mutations as a source of neoantigen candidates.","authors":"Franziska Lang, Patrick Sorn, Martin Suchan, Alina Henrich, Christian Albrecht, Nina Köhl, Aline Beicht, Pablo Riesgo-Ferreiro, Christoph Holtsträter, Barbara Schrörs, David Weber, Martin Löwer, Ugur Sahin, Jonas Ibn-Salem","doi":"10.1093/bioadv/vbae080","DOIUrl":"10.1093/bioadv/vbae080","url":null,"abstract":"<p><strong>Motivation: </strong>Neoantigens are promising targets for cancer immunotherapies and might arise from alternative splicing. However, detecting tumor-specific splicing is challenging because many non-canonical splice junctions identified in tumors also appear in healthy tissues. To increase tumor-specificity, we focused on splicing caused by somatic mutations as a source for neoantigen candidates in individual patients.</p><p><strong>Results: </strong>We developed the tool splice2neo with multiple functionalities to integrate predicted splice effects from somatic mutations with splice junctions detected in tumor RNA-seq and to annotate the resulting transcript and peptide sequences. Additionally, we provide the tool EasyQuant for targeted RNA-seq read mapping to candidate splice junctions. Using a stringent detection rule, we predicted 1.7 splice junctions per patient as splice targets with a false discovery rate below 5% in a melanoma cohort. We confirmed tumor-specificity using independent, healthy tissue samples. Furthermore, using tumor-derived RNA, we confirmed individual exon-skipping events experimentally. Most target splice junctions encoded neoepitope candidates with predicted major histocompatibility complex (MHC)-I or MHC-II binding. Compared to neoepitope candidates from non-synonymous point mutations, the splicing-derived MHC-I neoepitope candidates had lower self-similarity to corresponding wild-type peptides. 
In conclusion, we demonstrate that identifying mutation-derived, tumor-specific splice junctions can lead to additional neoantigen candidates to expand the target repertoire for cancer immunotherapies.</p><p><strong>Availability and implementation: </strong>The R package splice2neo and the python package EasyQuant are available at https://github.com/TRON-Bioinformatics/splice2neo and https://github.com/TRON-Bioinformatics/easyquant, respectively.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"4 1","pages":"vbae080"},"PeriodicalIF":0.0,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11165244/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141307454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bioinformatics advances | Pub Date: 2024-05-29 | eCollection Date: 2024-01-01 | DOI: 10.1093/bioadv/vbae064
Nanxi Guo, Juan Vargas, Samantha Reynoso, Douglas Fritz, Revanth Krishna, Chuangqi Wang, Fan Zhang
{"title":"Uncover spatially informed variations for single-cell spatial transcriptomics with STew.","authors":"Nanxi Guo, Juan Vargas, Samantha Reynoso, Douglas Fritz, Revanth Krishna, Chuangqi Wang, Fan Zhang","doi":"10.1093/bioadv/vbae064","DOIUrl":"10.1093/bioadv/vbae064","url":null,"abstract":"<p><strong>Motivation: </strong>Recent spatial transcriptomics (ST) technologies have enabled characterization of gene expression patterns and spatial information, advancing our understanding of cell lineages within diseased tissues. Several analytical approaches have been proposed for ST data, but effectively utilizing spatial information to unveil the shared variation with gene expression remains a challenge.</p><p><strong>Results: </strong>We introduce STew, a Spatial Transcriptomic multi-viEW representation learning method, to jointly analyze spatial information and gene expression in a scalable manner, followed by a data-driven statistical framework to measure the goodness of model fit. Through benchmarking using human dorsolateral prefrontal cortex and mouse main olfactory bulb data with true manual annotations, STew achieved superior performance in both clustering accuracy and continuity of identified spatial domains compared with other methods. STew is also robust, generating consistent results that are insensitive to model parameters, including sparsity constraints. We next applied STew to various ST data acquired from 10× Visium, Slide-seqV2, and 10× Xenium, encompassing single-cell and multi-cellular resolution ST technologies, which revealed spatially informed cell type clusters and biologically meaningful axes. In particular, we identified a proinflammatory fibroblast spatial niche using ST data from psoriatic skin. 
Moreover, STew scales almost linearly with the number of spatial locations, guaranteeing its applicability to datasets with thousands of spatial locations to capture disease-relevant niches in complex tissues.</p><p><strong>Availability and implementation: </strong>Source code and the R software tool STew are available from github.com/fanzhanglab/STew.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"4 1","pages":"vbae064"},"PeriodicalIF":0.0,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11142628/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141201392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}