Jūratė Šaltytė Benth, Fred Espen Benth, Espen Rostrup Nakstad
{"title":"Rebuttal to Flaws in the Paper 'Nearly Instantaneous Time-Varying Reproduction Number for Contagious Diseases-a Direct Approach Based on Nonlinear Regression'.","authors":"Jūratė Šaltytė Benth, Fred Espen Benth, Espen Rostrup Nakstad","doi":"10.1089/cmb.2025.0024","DOIUrl":"https://doi.org/10.1089/cmb.2025.0024","url":null,"abstract":"","PeriodicalId":15526,"journal":{"name":"Journal of Computational Biology","volume":"32 8","pages":"819-823"},"PeriodicalIF":1.6,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144753527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alexandra Sasha Gavryushkina, Holly R Pinkney, Sarah D Diermeier, Alex Gavryushkin
{"title":"Filtering for Highly Variable Genes and High-Quality Spots Improves Phylogenetic Analysis of Cancer Spatial Transcriptomics Visium Data.","authors":"Alexandra Sasha Gavryushkina, Holly R Pinkney, Sarah D Diermeier, Alex Gavryushkin","doi":"10.1089/cmb.2024.0614","DOIUrl":"10.1089/cmb.2024.0614","url":null,"abstract":"<p><p>Phylogenetic relationship of cells within tumors can help us to understand how cancer develops in space and time and identify driver mutations and other evolutionary events that enable cancer growth and spread. Numerous studies have reconstructed phylogenies from single-cell DNA-seq data. Here, we are looking into the problem of phylogenetic analysis of spatially resolved near single-cell RNA-seq data, which is a cost-efficient alternative (or complementary) data source that integrates multiple sources of evolutionary information, including point mutations, copy number changes, and epimutations. Recent attempts to use such data, although promising, raised many methodological challenges. Here, we explored data preprocessing and modeling approaches for evolutionary analyses of Visium spatial transcriptomics data. We conclude that using only highly variable genes and accounting for heterogeneous RNA capture across tissue-covered spots improves the reconstructed topological relationships and influences estimated branch lengths.</p>","PeriodicalId":15526,"journal":{"name":"Journal of Computational Biology","volume":" ","pages":"738-752"},"PeriodicalIF":1.6,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144266391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Geir Storvik, Solveig Engebretsen, Birgitte Freiesleben de Blasio, Arnoldo Frigessi
{"title":"Flaws in the Article \"Nearly Instantaneous Time-Varying Reproduction Number for Contagious Diseases-a Direct Approach Based on Nonlinear Regression\".","authors":"Geir Storvik, Solveig Engebretsen, Birgitte Freiesleben de Blasio, Arnoldo Frigessi","doi":"10.1177/15578666251360613","DOIUrl":"https://doi.org/10.1177/15578666251360613","url":null,"abstract":"","PeriodicalId":15526,"journal":{"name":"Journal of Computational Biology","volume":"32 8","pages":"813-818"},"PeriodicalIF":1.6,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144753526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mukul S Bansal, Wei Chen, Yury Khudyakov, Ion I Măndoiu, Marmar R Moussa, Murray Patterson, Sanguthevar Rajasekaran, Pavel Skums, Sharma V Thankachan, Alex Zelikovsky
{"title":"<i>Special Section:</i> 12th International Computational Advances in Bio and Medical Sciences (ICCABS 2023).","authors":"Mukul S Bansal, Wei Chen, Yury Khudyakov, Ion I Măndoiu, Marmar R Moussa, Murray Patterson, Sanguthevar Rajasekaran, Pavel Skums, Sharma V Thankachan, Alex Zelikovsky","doi":"10.1089/cmb.2025.0124","DOIUrl":"10.1089/cmb.2025.0124","url":null,"abstract":"","PeriodicalId":15526,"journal":{"name":"Journal of Computational Biology","volume":" ","pages":"721-722"},"PeriodicalIF":1.6,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144078209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Exact Matching Method for 16S rRNA Taxonomy Classification.","authors":"Sing-Hoi Sze","doi":"10.1089/cmb.2024.0615","DOIUrl":"10.1089/cmb.2024.0615","url":null,"abstract":"<p><p>One popular approach to taxonomy classification in the microbiome utilizes 16S ribosomal RNA sequences. The main challenge is that 16S rRNA sequences could be almost identical in closely related species, and it is difficult to distinguish them at the species level. Recent approaches are able to achieve almost single nucleotide resolution by constructing an error model of the reads. We develop an exact matching algorithm to utilize the single nucleotide resolution directly. We show that our algorithm is able to obtain improved accuracy in recent samples of mock communities and in samples of high compositional complexity when compared to existing algorithms. A software program implementing this algorithm is available at http://faculty.cse.tamu.edu/shsze/kmpmatch.</p>","PeriodicalId":15526,"journal":{"name":"Journal of Computational Biology","volume":" ","pages":"753-760"},"PeriodicalIF":1.6,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144248127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Algorithm to Calculate the <i>p</i>-Value of the Monge-Elkan Distance.","authors":"Petr Ryšavý, Filip Železný","doi":"10.1089/cmb.2024.0854","DOIUrl":"10.1089/cmb.2024.0854","url":null,"abstract":"<p><p>The Monge-Elkan distance is a straightforward yet popular distance measure used to estimate the mutual similarity of two sets of objects. It was initially proposed in the field of databases, and it found broad usage in other fields. Nowadays, it is especially relevant to the analysis of new-generation sequencing data as it represents a measure of dissimilarity between genomes of two distinct organisms, particularly when applied to unassembled reads. This article provides an algorithm to calculate the <i>p</i>-value associated with the Monge-Elkan distance. Given the object-level null distribution, that is, the distribution of distances between independently and identically sampled objects such as reads, the method yields the null distribution of the Monge-Elkan distance, which in turn allows for calculating the <i>p</i>-value. We also demonstrate an application on sequencing data, where individual reads are compared by the Levenshtein distance.</p>","PeriodicalId":15526,"journal":{"name":"Journal of Computational Biology","volume":" ","pages":"797-812"},"PeriodicalIF":1.6,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144248164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Mapper Algorithm with Implicit Intervals and Its Optimization.","authors":"Yuyang Tao, Shufei Ge","doi":"10.1089/cmb.2024.0919","DOIUrl":"10.1089/cmb.2024.0919","url":null,"abstract":"<p><p>The Mapper algorithm is an essential tool for visualizing complex, high-dimensional data in topological data analysis and has been widely used in biomedical research. It outputs a combinatorial graph whose structure encodes the shape of the data. However, the need for manual parameter tuning and fixed (implicit) intervals, along with fixed overlapping ratios, may impede the performance of the standard Mapper algorithm. Variants of the standard Mapper algorithms have been developed to address these limitations, yet most of them still require manual tuning of parameters. Additionally, many of these variants, including the standard version found in the literature, were built within a deterministic framework and overlooked the uncertainty inherent in the data. To relax these limitations, in this work, we introduce a novel framework that implicitly represents intervals through a hidden assignment matrix, enabling automatic parameter optimization via stochastic gradient descent (SGD). In this work, we develop a soft Mapper framework based on a Gaussian mixture model for flexible and implicit interval construction. We further illustrate the robustness of the soft Mapper algorithm by introducing the Mapper graph mode as a point estimation for the output graph. Moreover, a SGD algorithm with a specific topological loss function is proposed for optimizing parameters in the model. Both simulation and application studies demonstrate its effectiveness in capturing the underlying topological structures. In addition, the application to an RNA expression dataset obtained from the Mount Sinai/JJ Peters VA Medical Center Brain Bank successfully identifies a distinct subgroup of Alzheimer's Disease. The implementation of our method is available at https://github.com/FarmerTao/Implicit-interval-Mapper.git.</p>","PeriodicalId":15526,"journal":{"name":"Journal of Computational Biology","volume":" ","pages":"781-796"},"PeriodicalIF":1.6,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144225638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sc-TUSV-Ext: Single-Cell Clonal Lineage Inference from Single Nucleotide Variants, Copy Number Alterations, and Structural Variants.","authors":"Nishat Anjum Bristy, Xuecong Fu, Russell Schwartz","doi":"10.1089/cmb.2024.0613","DOIUrl":"10.1089/cmb.2024.0613","url":null,"abstract":"<p><p>Clonal lineage inference (\"tumor phylogenetics\") has become a crucial tool for making sense of somatic evolution processes that underlie cancer development and are increasingly recognized as part of normal tissue growth and aging. The inference of clonal lineage trees from single-cell sequence data offers particular promise for revealing processes of somatic evolution in unprecedented detail. However, most such tools are based on fairly restrictive models of the types of mutation events observed in somatic evolution and of the processes by which they develop. The present work seeks to enhance the power and versatility of tools for single-cell lineage reconstruction by making more comprehensive use of the range of molecular variant types by which tumors evolve. We introduce Sc-TUSV-ext, an integer linear programming-based tumor phylogeny reconstruction method that, for the first time, integrates single nucleotide variants, copy number alterations, and structural variations into clonal lineage reconstruction from single-cell DNA sequencing data. We show on synthetic data that accounting for these variant types collectively leads to improved accuracy in clonal lineage reconstruction relative to prior methods that consider only subsets of the variant types. We further demonstrate the effectiveness of real data in resolving clonal evolution in the presence of multiple variant types, providing a path toward more comprehensive insight into how various forms of somatic mutability collectively shape tissue development.</p>","PeriodicalId":15526,"journal":{"name":"Journal of Computational Biology","volume":" ","pages":"723-737"},"PeriodicalIF":1.6,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143573009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Data Compression: Recent Innovations in LZ77 Algorithms.","authors":"Aaron Hong, Christina Boucher","doi":"10.1089/cmb.2024.0879","DOIUrl":"10.1089/cmb.2024.0879","url":null,"abstract":"<p><p>The growing volume of genomic data, driven by advances in sequencing technologies, demands efficient data compression solutions. Traditional algorithms, such as Lempel-Ziv77 (LZ77), have been fundamental in offering lossless compression, yet they often fall short when applied to the highly repetitive structures typical of genomic sequences. This review explores the evolution of LZ77 and its adaptations for genomic data compression, highlighting specialized algorithms designed to handle redundancy in large-scale sequencing datasets efficiently. Innovations in this field have enhanced compression ratios and processing efficiencies leveraging intrinsic redundancy within genomic datasets. We critically examine a spectrum of LZ77-based algorithms, including newer adaptations for external and semi-external memory settings, and contrast their efficacy in managing large-scale genomic data. We conducted experiments to evaluate the performance of several algorithms, including KKP2, RLE-LZ, SE-KKP, BGone, and PFP-LZ77, on both real-world datasets from the Pizza&Chili repetitive corpus, Salmonella genomes, and human chromosome 19 genomes. These results underscore the trade-offs between time and memory consumption between algorithms. This article aims to provide a comprehensive guide on the current landscape and future directions of data compression technologies, equipping bioinformaticians and other practitioners with insight to tackle the escalating data challenges in genomics and beyond.</p>","PeriodicalId":15526,"journal":{"name":"Journal of Computational Biology","volume":" ","pages":"761-780"},"PeriodicalIF":1.6,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144181663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Network-Guided Sparse Subspace Clustering on Single-Cell Data.","authors":"Chenyang Yuan, Shunzhou Jiang, Songyun Li, Jicong Fan, Tianwei Yu","doi":"10.1177/15578666251359688","DOIUrl":"https://doi.org/10.1177/15578666251359688","url":null,"abstract":"<p><p>With the rapid development of single-cell RNA sequencing (scRNA-seq) technology, researchers can now investigate gene expression at the individual cell level. Identifying cell types via unsupervised clustering is a fundamental challenge in analyzing single-cell data. However, due to the high dimensionality of expression profiles, traditional clustering methods often fail to produce satisfactory results. To address this problem, we developed NetworkSSC, a network-guided sparse subspace clustering (SSC) approach. NetworkSSC operates on the same assumption as SSC that cells of the same type have gene expressions lying within the same subspace. In addition, it integrates a regularization term incorporating the gene network's Laplacian matrix, which captures functional associations between genes. Comparative analysis on nine scRNA-seq datasets shows that NetworkSSC outperforms traditional SSC and other unsupervised methods in most cases.</p>","PeriodicalId":15526,"journal":{"name":"Journal of Computational Biology","volume":" ","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144637103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}