Journal of Computational Biology: Latest Articles

Model Selection and Parameter Estimation for Fractional SIR Model Based on the Combination of Reinforcement Learning and ABC-SMC.
IF 1.6, Zone 4, Biology
Journal of Computational Biology Pub Date: 2026-05-08 DOI: 10.1177/15578666261444393
Peiqi Chen, Wei Gu
Abstract: A novel algorithm is proposed to increase the effectiveness of model selection and parameter estimation for the fractional susceptible-infected-recovered model. It combines reinforcement learning (RL) and approximate Bayesian computation sequential Monte Carlo (ABC-SMC) instead of ABC to improve the process of model selection and parameter estimation, where RL is used for model selection and ABC-SMC is exploited for parameter estimation. Numerical simulations illustrate that the combined algorithm (RL-ABC-SMC) significantly outperforms the ABC-SMC algorithm in terms of model selection. Finally, we consider the application of the proposed methodology.
Citations: 0
TransGAT-DTI: Transformer and Graph Attention Network for Drug-Target Interaction Prediction.
IF 1.6, Zone 4, Biology
Journal of Computational Biology Pub Date: 2026-05-08 DOI: 10.1177/15578666261445112
Changjian Zhou, Shuoxiang Wang, Yujie Zhong, Wensheng Xiang
Abstract: Drug-target interaction (DTI) prediction is of great practical value for discovering, developing, and repurposing drugs, which has tremendous advantages to pharmaceutical industries and patients. However, the prediction of DTIs using wet-lab experimental methods is generally expensive and time-consuming. To date, numerous machine learning-based approaches show promising performance, which greatly improves DTI discovery efficiency, but two challenges remain. One is how to represent the spatial structure features of drugs and targets appropriately, and the other is how to explicitly and effectively model and learn local interactions between drugs and targets, for better interpretation and prediction. In this work, we propose a novel framework that combines a transformer and graph attention convolutional network for DTI prediction (TransGAT-DTI), which not only precisely predicts the putative DTIs with satisfactory overall performance on three benchmark datasets, but also captures the pivotal sequence that contributes the most to the positive predictions. Experimental results demonstrate that the proposed TransGAT-DTI achieves the best performance on BindingDB, BioSNAP, and Human datasets. Importantly, the proposed approach provides a novel solution for discovering and developing novel drugs.
Citations: 0
Simulating Protein Dynamics in Cell Signaling Pathways: A Mathematical Model Approach Incorporating Negative Interaction Mechanisms.
IF 1.6, Zone 4, Biology
Journal of Computational Biology Pub Date: 2026-04-27 DOI: 10.1177/15578666261443351
Minsoo Kim, Eunjung Kim
Abstract: This study presents an improved mathematical model that incorporates negative interaction mechanisms to predict the dynamics of cell signaling pathways. By employing stochastic differential equations and the Euler-Maruyama method, we simulate the responses of proteins within the mitogen-activated protein kinase and oxytocin signaling pathways over time. Conventional signaling models that consider only positive interactions often lead to unrealistic signal over-amplification, the absence of oscillatory dynamics, and an inability to reproduce compensatory responses following targeted inhibition. To address these limitations, our model explicitly incorporates inhibitory interactions through a sign-changing characteristic function and bounds protein activity using a hyperbolic-tangent transfer function, ensuring biologically plausible saturation behavior. Our findings indicate that the inhibition of upstream proteins such as MEK1/2 leads to a rapid decrease in ERK1/2 activation while causing a compensatory increase in other proteins such as SOS, RAS, and RAF. Furthermore, we explore the synergistic effects of combination therapies, demonstrating that targeting multiple signaling pathways can enhance therapeutic efficacy. Through the application of the Bliss Independence Index, we assess the effectiveness of these therapeutic combinations. Additionally, we investigate the effects of abnormal activation increases caused by gain-of-function mutations on downstream proteins and the resulting changes in balance induced by negative interactions. Overall, our enhanced mathematical model serves as a valuable tool for simulating signaling dynamics with inhibitory crosstalk and for generating mechanistic hypotheses relevant to targeted and combination therapies.
Citations: 0
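The entry above names the Euler-Maruyama method and a hyperbolic-tangent transfer function but does not state the equations. As a rough illustration only, the sketch below integrates an assumed system dx_i = (-x_i + tanh(sum_j w_ij x_j)) dt + sigma dW_i, where a negative weight stands in for an inhibitory interaction. The drift form, the function name `euler_maruyama`, and every parameter are hypothetical and are not taken from the paper.

```python
import math
import random


def euler_maruyama(x0, weights, sigma, dt, steps, seed=0):
    """Integrate dx_i = (-x_i + tanh(sum_j w_ij * x_j)) dt + sigma dW_i.

    The drift and the weight matrix are illustrative assumptions; a negative
    weight w_ij models node j inhibiting node i.
    """
    rng = random.Random(seed)
    x = list(x0)
    path = [list(x)]
    sqrt_dt = math.sqrt(dt)
    for _ in range(steps):
        # tanh keeps the deterministic target of each node inside (-1, 1).
        drift = [
            -x[i] + math.tanh(sum(w * xj for w, xj in zip(weights[i], x)))
            for i in range(len(x))
        ]
        # Euler-Maruyama step: drift * dt plus a Gaussian increment of
        # standard deviation sigma * sqrt(dt).
        x = [
            x[i] + drift[i] * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
            for i in range(len(x))
        ]
        path.append(list(x))
    return path
```

With `sigma = 0` and `dt < 1`, each update is a convex combination of the current value and a tanh output, so trajectories that start inside [-1, 1] stay there; this is the saturation behavior the transfer function is meant to provide in this toy setting.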
Random Projection Methods Outperform Principal Component Analysis for Dimensionality Reduction in Single Cell RNA-Seq.
IF 1.6, Zone 4, Biology
Journal of Computational Biology Pub Date: 2026-04-27 DOI: 10.1177/15578666261436821
Mohamed Abdelnaby, Marmar R Moussa
Abstract: Principal component analysis (PCA) is one of the most frequently used dimensionality reduction methods for high-dimensional datasets, especially single-cell RNA sequencing (scRNA-seq). Despite its popularity, PCA faces challenges, particularly related to its performance degrading as the dataset size increases. Additionally, PCA is sensitive to outliers and assumes linearity. Random projection (RP) methods have emerged as a promising alternative to address several of PCA's limitations. In this study, we conduct a systematic and comprehensive evaluation of PCA and RP methods, including singular value decomposition (SVD) and randomized SVD approaches, against multiple RP methods including sparse random projection, Gaussian random projection, and we introduce a Matching Sparsity Random Projection algorithm that adaptively calibrates projection matrix density according to input data sparsity patterns, emphasizing both computational scalability and effectiveness in downstream analytical tasks. We evaluated these methods on multiple publicly available scRNA-seq datasets that include both labeled and unlabeled scenarios. Clustering performance is assessed using Hierarchical Clustering and Spherical K-Means algorithms, with labeled datasets evaluated through Hungarian algorithm accuracy and Mutual Information metrics. For unlabeled datasets, we used the Dunn Index and Gap Statistic to quantify cluster separation quality. Across both dataset types, the Within-Cluster Sum of Squares metric is used to assess variability. Moreover, locality preservation is examined, with RP methods, including our adaptive sparsity approach, outperforming PCA in several of the evaluated metrics. Our experimental results show that RP methods not only deliver substantial computational speed improvements over PCA but also rival, and in some cases, exceed PCA in preserving data variability and clustering quality. Through this comprehensive methodological comparison, our work provides critical guidance for selecting appropriate dimensionality reduction strategies that effectively balance computational demands, scalability requirements, and analytical quality in downstream analyses.
Citations: 0
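The entry above compares Gaussian and sparse random projection. For orientation, here is a minimal plain-Python sketch of the two textbook constructions: Gaussian entries with variance 1/k, and sparse entries equal to +1/sqrt(density * k) or -1/sqrt(density * k) with probability density/2 each and zero otherwise. The function names and the user-supplied `density` parameter are assumptions made for illustration; this is not the Matching Sparsity Random Projection algorithm the abstract introduces, whose calibration rule is not given here.

```python
import math
import random


def gaussian_projection_matrix(n_features, n_components, seed=0):
    """Dense matrix with i.i.d. N(0, 1/n_components) entries."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_components)
    return [
        [rng.gauss(0.0, 1.0) * scale for _ in range(n_components)]
        for _ in range(n_features)
    ]


def sparse_projection_matrix(n_features, n_components, density, seed=0):
    """Sparse matrix: each entry is +v or -v with probability density/2
    each, and 0 otherwise, where v = 1 / sqrt(density * n_components)."""
    rng = random.Random(seed)
    value = 1.0 / math.sqrt(density * n_components)
    matrix = []
    for _ in range(n_features):
        row = []
        for _ in range(n_components):
            u = rng.random()
            if u < density / 2:
                row.append(value)
            elif u < density:
                row.append(-value)
            else:
                row.append(0.0)
        matrix.append(row)
    return matrix


def project(data, matrix):
    """Multiply data (rows = cells, columns = genes) by the projection matrix."""
    n_components = len(matrix[0])
    return [
        [sum(x * matrix[j][c] for j, x in enumerate(row)) for c in range(n_components)]
        for row in data
    ]
```

Both scalings give each matrix entry variance 1/k, so squared distances between rows are preserved in expectation after projection; the sparse variant simply skips most multiplications.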
Evaluating Protein Language Model Embeddings for Viral Clade Assignment.
IF 1.6, Zone 4, Biology
Journal of Computational Biology Pub Date: 2026-04-21 DOI: 10.1177/15578666261443336
Brendonas Stakauskas, Virginijus Marcinkevičius
Abstract: Protein language models (PLMs) provide powerful sequence representations, yet their effectiveness for unsupervised viral clade assignment remains uncertain. In this study, we evaluated embeddings from ProtT5, ProtBert, CARP, and several ESM-2 variants on influenza A/H3N2 hemagglutinin sequences. Using dimensionality reduction (t-SNE, UMAP, PCA, MDS) and clustering with HDBSCAN, we compared PLM embeddings against baseline Hamming distance approaches. Our results show that t-SNE combined with PLM embeddings can recover clade structure, with ProtBert yielding the most stable performance and larger ESM-2 models occasionally achieving lower normalized variation of information scores but with greater variability. These findings suggest that while PLM embeddings capture clade-relevant signals, they also suffer from instability and the loss of site- or nucleotide-specific detail. Future improvements in pooling strategies may enhance their utility for viral surveillance.
Citations: 0
DANCE: Deep Learning-Assisted Analysis of ProteiN Sequences Using Chaos Enhanced Kaleidoscopic Images.
IF 1.6, Zone 4, Biology
Journal of Computational Biology Pub Date: 2026-04-17 DOI: 10.1177/15578666261441311
Taslim Murad, Prakash Chourasia, Sarwan Ali, Imdad Ullah Khan, Murray Patterson
Abstract: Cancer is a complex disease characterized by uncontrolled cell growth and requires an accurate classification for effective treatment. T cell receptors (TCRs), crucial proteins in the immune system, play a pivotal role in antigen recognition. Advancements in sequencing technologies have facilitated the comprehensive profiling of TCR repertoires, uncovering TCRs with potent anticancer activity and enabling TCR-based immunotherapies. Performing an effective analysis of these complex biomolecules requires representations that accurately capture both their structural and functional characteristics. T cell protein sequences pose unique challenges because of their relatively shorter lengths compared to other biomolecules. Traditional vector-based embedding methods may encounter issues such as information loss. Therefore, an image-based representation approach becomes a preferred choice for efficient embedding, allowing the preservation of essential details and enabling a comprehensive analysis of T cell protein sequences. We propose generating images from protein sequences using the concept of chaos game representation (CGR). We design images using the kaleidoscopic images approach. This Deep Learning-Assisted Analysis of ProteiN Sequences Using Chaos Enhanced Kaleidoscopic Images (called DANCE) provides a unique way to visualize protein sequences by recursively applying chaos game rules around a central seed point. The resulting kaleidoscopic images exhibit symmetrical patterns that offer a visual representation of the protein sequences. To investigate the effectiveness of this approach, we perform classification of the TCR protein sequences in terms of their respective target cancer cells, since TCRs are known for their immune response against cancer disease. The DANCE technique is used to turn the TCR sequences into pictures before classification. We employ deep learning (DL) vision models to classify the generated images to obtain insight into the relationship between the visual patterns in the generated kaleidoscopic images and the underlying protein properties. By combining CGR-based image generation with DL classification, this study opens new possibilities in protein analysis.
Citations: 0
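The DANCE entry above builds on chaos game representation (CGR), in which a point starts at a central seed and moves part of the way toward a vertex chosen by each residue in turn. The sketch below shows only that generic CGR step for protein sequences, assuming the 20 standard amino acids sit on a regular 20-gon and the point moves a fixed fraction `ratio` of the remaining distance. The vertex layout, the default ratio, and the name `cgr_points` are illustrative assumptions; the kaleidoscopic image construction itself is not reproduced.

```python
import math

# The 20 standard amino acids, placed in this (arbitrary) order on the polygon.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cgr_points(sequence, ratio=0.5):
    """Return the chaos-game trajectory of a protein sequence.

    The walk starts at the origin (the central seed point) and, for each
    residue, moves `ratio` of the way toward that residue's polygon vertex.
    Unknown residue letters raise KeyError.
    """
    n = len(AMINO_ACIDS)
    vertices = {
        aa: (math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
        for k, aa in enumerate(AMINO_ACIDS)
    }
    x, y = 0.0, 0.0
    points = []
    for residue in sequence:
        vx, vy = vertices[residue]
        x += ratio * (vx - x)
        y += ratio * (vy - y)
        points.append((x, y))
    return points
```

For example, `cgr_points("AA")` moves from the origin to (0.5, 0.0) and then to (0.75, 0.0), because the vertex for "A" is at (1, 0). The resulting points could then be rasterized into an image for a vision model.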
Reconstructing Ancestral Non-Coding RNAs of Multiple Families Using Sequence and Structural Information with Tree Decomposition.
IF 1.6, Zone 4, Biology
Journal of Computational Biology Pub Date: 2026-04-01 Epub Date: 2026-03-10 DOI: 10.1177/15578666261423964 Pages: 499-517
Songdi Hu, Vladimir Reinharz, Olivier Tremblay-Savard
Abstract: The reconstruction of ancestral non-coding RNA (ncRNA) sequences is particularly challenging due to the main conservation forces being applied to the structure, rather than the sequence. Naively trying to preserve the structure during the reconstruction tends to produce ancestors that are more energetically fit to the structure than their descendants, a clear contradiction. While most sequences are associated with only one functional structure, RNA families have an old and complex history. It has been hypothesized that some ancestral RNAs were combining multiple functions, with multistable structures. At some point, a duplication event happened, and each copy subspecialized into a specific structure. To circumvent the bias introduced by reconstructing sequences when only one structure is conserved, we recently proposed an approach using substitution and base pair costs that focuses on simultaneously reconstructing the ancestor of two related ncRNA families, assuming that they were created by this process of duplication followed by subspecialization. In this work, we improve the previous approach by leveraging advances in tree decomposition algorithms to (1) incorporate simultaneously more constraints and positions in the reconstruction, which (2) allows the use of a more realistic energetic model. Results on simulated datasets demonstrate significant improvements in ancestral sequence inference accuracy while reducing the number of optimal sequences inferred by several orders of magnitude. On real datasets of RFam clans (Glm and FinP-traJ), we show that the new approach is able to infer fewer optimal ancestral sequences that are more fit to both structures compared with previous methods.
Citations: 0
Probability-Based Sequence Comparison Finds Pre-Eutherian Nuclear Mitochondrial DNA Segments in Mammalian Genomes.
IF 1.6, Zone 4, Biology
Journal of Computational Biology Pub Date: 2026-04-01 Epub Date: 2026-02-02 DOI: 10.1177/15578666261416560 Pages: 401-419
Muyao Huang, Martin C Frith
Abstract: The insertion of mitochondrial genome-derived DNA sequences into the nuclear genome is a frequent event in organismal evolution, resulting in nuclear-mitochondrial DNA segments (NUMTs), which serve as a significant driving force for genome evolution. Once incorporated into the nuclear genome, some NUMTs can be conserved for extended periods and may potentially acquire novel cellular roles. However, current mainstream methods for detecting NUMTs are inefficient at identifying ancient and highly degraded NUMTs, leading to their prevalence and impact being underestimated. These ancient NUMTs likely play a far greater role in genetic functions than previously recognized, including contributing to the acquisition of functional exons. This study focuses on identifying ancient NUMTs in mammalian genomes using enhanced high-sensitivity sequence comparison methods. A sensitive and accurate NUMT searching pipeline was established, predicting 1013 NUMTs in the human reference genome, 364 (36%) of which are newly detected compared to the University of California, Santa Cruz (UCSC) reference human NUMTs database. Notably, 90 pre-eutherian human NUMTs were identified, representing significantly older NUMTs than previously reported, with origins dating back at least 100 million years. The most ancient mammalian NUMT could even date back over 160 million years, inserted into the nuclear genome of the common ancestor of therian mammals. This study provides a comprehensive exploration of the quantity and evolutionary history of mammalian NUMTs, paving the way for future research on endosymbiotic impact on the evolution of nuclear genomes.
Citations: 0
Beyond Synteny: A Scalable Phylogenomics Method for Whole-Genome Duplication Detection.
IF 1.6, Zone 4, Biology
Journal of Computational Biology Pub Date: 2026-04-01 Epub Date: 2026-03-18 DOI: 10.1177/15578666251415567 Pages: 456-481
Reza Kalhor, Manuel Lafond, Celine Scornavacca
Abstract: Gene duplication is a fundamental driver of species adaptation and the evolution of new functions, making the reconstruction of historical duplication events crucial for understanding evolutionary processes. Whole-genome duplications (WGDs), which duplicate all gene families simultaneously, have profoundly influenced the evolution of plants, yeast, and vertebrates. Genome-scale data, such as syntenic blocks and gene family counts, are commonly employed to infer WGDs. However, detecting ancient WGDs remains challenging, as their genomic signatures are often overshadowed by extensive rearrangements and gene losses. Phylogenetic reconciliation methods between species and gene trees offer a potential means of identifying such ancient events, but frequently assume independence among gene families. This can lead to missed detections of WGDs, where gene duplications are inherently interdependent. Phylogenomics reconciliation addresses this challenge by reconciling multiple gene families at once. Unfortunately, existing models often constrain the space of possible reconciliations, overlook gene losses resulting from fractionation, or depend on conserved synteny across multiple species. This limits the number of genes that can be analyzed concurrently. In this work, we explore a phylogenomics reconciliation model that avoids synteny reliance, explicitly incorporates gene losses, and permits flexible remapping of duplications. Reconciliation under this model is NP-hard, and existing algorithms lack the scalability for large-scale datasets. To address this need, we present novel algorithmic strategies that efficiently handle tens of thousands of gene trees, a level of scalability previously unattained. We also evaluate our approach against existing methods. Experiments on both simulations and real data show that traditional LCA-mapping can yield incorrect WGD predictions after fractionation, whereas our approach is more robust. By comparing predictions using true and reconstructed gene trees, we further show that reconstruction errors greatly affect method performance and that gene tree correction is necessary for reliable results. Real data tests also reveal that our approach can recover WGDs missed by other reconciliation methods.
Citations: 0
From Small Parsimony to Horizontal Gene Transfer: Inferring Horizontal Transfer and Gene Loss for Single-Origin Characters.
IF 1.6, Zone 4, Biology
Journal of Computational Biology Pub Date: 2026-04-01 Epub Date: 2026-04-09 DOI: 10.1177/15578666261426009 Pages: 535-557
Alitzel López Sánchez, Guillaume E Scholz, Peter F Stadler, Manuel Lafond
Abstract: The simple underlying pattern of presence-absence of a character within a species tree provides useful steps to trace complex evolutionary histories. Character-based models such as perfect transfer networks and their galled variant aim to leverage this information to predict horizontal gene transfers. Under the assumption that characters have a single origin, are rarely lost, and can be transferred horizontally, they remain an efficient inference method for almost tree-like scenarios. Nevertheless, they can sometimes predict overly complicated scenarios, and their simplest structural variants are too restrictive for practical uses. With the goal of extending this model to include loss events, we present a Sankoff-Rousseau-like algorithm that aims to recover the simplest possible scenarios that combine gene transfers and losses using solely the single character information already contained in a given species tree. We establish a link between the small parsimony problem and the inference of scenarios with a minimum number of losses and transfers, allowing losses and transfers to have a user-defined penalization for this end. We also explore the utility of our model for tracing possible highways of gene transfers by presenting a real case study on a dataset of bacterial species and Kyoto Encyclopedia of Genes and Genomes functions as characters.
Citations: 0
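The entry above links its method to the small parsimony problem and describes a Sankoff-Rousseau-like algorithm with user-defined penalties. As background only, the sketch below is the classic Sankoff dynamic program for one presence/absence character on a rooted tree with a user-supplied cost per state change. It does not model horizontal transfers and is not the authors' algorithm; the tree encoding (a dict from node to child list), the function name, and the cost table are hypothetical choices for illustration.

```python
INF = float("inf")


def sankoff_cost(tree, root, leaf_states, cost, states=(0, 1)):
    """Classic Sankoff small parsimony for a single character.

    tree: dict mapping each internal node to a list of its children.
    leaf_states: dict mapping each leaf to its observed state (0 = absent,
    1 = present).
    cost: dict mapping (parent_state, child_state) to a non-negative penalty.
    Returns a dict: state at root -> minimum total cost of the subtree.
    """
    def solve(node):
        children = tree.get(node, [])
        if not children:
            # A leaf is free in its observed state and impossible otherwise.
            return {s: (0.0 if s == leaf_states[node] else INF) for s in states}
        table = {s: 0.0 for s in states}
        for child in children:
            child_table = solve(child)
            for s in states:
                # Best child state given that this node is in state s.
                table[s] += min(cost[(s, t)] + child_table[t] for t in states)
        return table

    return solve(root)
```

With the tree `{"r": ["u", "c"], "u": ["a", "b"]}`, leaves a = 1, b = 1, c = 0, a gain cost of 1 and a loss cost of 3, the table at the root is `{0: 1.0, 1: 3.0}`: one gain on the branch to `u` is cheaper than one loss on the branch to `c`.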