Journal of Computational Biology: Latest Articles

Combined Topological Data Analysis and Geometric Deep Learning Reveal Niches by the Quantification of Protein Binding Pockets.
IF 1.4, Zone 4, Biology
Journal of Computational Biology Pub Date : 2025-05-28 DOI: 10.1089/cmb.2025.0076
Peiran Jiang, Jose Lugo-Martinez
Abstract: Protein pockets are essential for many proteins to carry out their functions. Locating and measuring protein pockets, as well as studying the anatomy of pockets, helps us further understand protein function. Most research studies focus on learning either local or global information from protein structures. However, there is a lack of studies that leverage the power of integrating both local and global representations of these structures. In this work, we combine topological data analysis (TDA) and geometric deep learning (GDL) to analyze the putative protein pockets of enzymes. TDA captures blueprints of the global topological invariant of protein pockets, whereas GDL decomposes the fingerprints into building blocks of these pockets. This integration of local and global views provides a comprehensive and complementary understanding of the protein structural motifs (niches for short) within protein pockets. We also analyze the distribution of the building blocks making up the pocket and profile the predictive power of coupling local and global representations for the task of discriminating between enzymes and nonenzymes, as well as predicting the enzyme class. We demonstrate that our representation learning framework for macromolecules is particularly useful when the structure is known, and the scenarios heavily rely on local and global information.
Citations: 0
Effective Integration of Single-Cell Multi-Omics Data Using Improved Network-Based Integrative Clustering with Multigraph Regularization.
IF 1.4, Zone 4, Biology
Journal of Computational Biology Pub Date : 2025-05-22 DOI: 10.1089/cmb.2023.0460
Shunqin Zhang, Wei Kong, Shuaiqun Wang, Kai Wei, Kun Liu, Gen Wen, Yaling Yu
Abstract: The purpose of integrating different omics data is to study cellular heterogeneity at the level of transcriptional regulation from different gene levels, which can effectively identify cell types and reveal the pathogenesis of Alzheimer's disease (AD) from two perspectives. However, implementing such algorithms faces challenges such as high data noise levels, increased dimensionality, and computational complexity. In this study, multigraph regularization constraints were introduced in the network-based integrative clustering algorithm (MGR-NIC) to remove redundant features and keep the geometry structures underlying the data by fusing two types of data (snRNA-seq and snATAC-seq) of glial cells from AD samples. The effectiveness of the MGR-NIC algorithm was validated using both simulation datasets and real datasets derived from various tissues. The MGR-NIC algorithm can improve clustering accuracy by selecting features that better represent the dataset's structure. The clustering results obtained with the MGR-NIC algorithm show strong consistency with the clustering results inherent to the published DLPFC dataset, while the classification results generated using the NIC algorithm often lead to cluster overlap when applied to the DLPFC dataset. We comprehensively evaluate the proposed MGR-NIC algorithm against state-of-the-art algorithms, including NIC, scAI, Multi-Omics Factor Analysis v2, and JSNMF. MGR-NIC is the most stable and reliable method, implying its robustness across different datasets and its reliability in yielding consistent and accurate results.
Citations: 0
Using Traditional and Deep Machine Learning to Predict Emergency Room Triage Levels.
IF 1.4, Zone 4, Biology
Journal of Computational Biology Pub Date : 2025-05-22 DOI: 10.1089/cmb.2024.0632
Mehmet Yıldırım, Savaş Sezik, Ayşe Başar
Abstract: Accurate triage in emergency rooms is crucial for efficient patient care and resource allocation. We developed methods to predict triage levels using several traditional machine learning methods (logistic regression, random forest, XGBoost) and neural network deep learning-based approaches. These models were tested on a dataset from emergency department visits of patients at a local Turkish hospital; this dataset consists of both structured and unstructured data. Compared with previous work, our challenge was to build a predictive model that uses documents written in the Turkish language and that handles specific aspects of the Turkish medical system. Text embedding techniques such as Bag of Words, Word2Vec, and BERT-based embedding were used to process the unstructured patient complaints. We used a comprehensive set of features including patient history data and disease diagnosis within our predictive models, which included advanced neural network architectures such as convolutional neural networks, attention mechanisms, and long short-term memory networks. Our results revealed that BERT embeddings significantly enhanced the performance of neural network models, while Word2Vec embeddings showed slightly better results in traditional machine learning models. The most effective model was XGBoost combined with Word2Vec embeddings, achieving 86.7% AUC, 81.5% accuracy, and 68.7% weighted F1 score. We conclude that text embedding methods and machine learning methods are effective tools to predict emergency room triage levels. The integration of patient history into the models, alongside the strategic use of text embeddings, significantly improves predictive accuracy.
Citations: 0
XVir: A Transformer-Based Architecture for Identifying Viral Reads from Cancer Samples.
IF 1.4, Zone 4, Biology
Journal of Computational Biology Pub Date : 2025-05-20 DOI: 10.1089/cmb.2025.0075
Shorya Consul, John Robertson, Haris Vikalo
Abstract: It is estimated that approximately 15% of cancers worldwide can be linked to viral infections. The viruses that can cause or increase the risk of cancer include human papillomavirus, hepatitis B and C viruses, Epstein-Barr virus, and human immunodeficiency virus, to name a few. The computational analysis of the massive amounts of tumor DNA data, whose collection is enabled by the advancements in sequencing technologies, has allowed studies of the potential association between cancers and viral pathogens. However, the high diversity of oncoviral families makes reliable detection of viral DNA difficult, and the training of machine learning models that enable such analysis computationally challenging. We introduce XVir, a data pipeline that deploys a transformer-based deep learning architecture to reliably identify viral DNA present in human tumors. XVir is trained on a mix of sequencing reads coming from viral and human genomes, resulting in a model capable of robust detection of potentially mutated viral DNA across a range of experimental settings. Results on semi-experimental data demonstrate that XVir is able to achieve high classification accuracy, generally outperforming state-of-the-art competing methods. In particular, it retains high accuracy even when faced with diverse viral populations while being significantly faster to train than other large deep learning-based classifiers.
Citations: 0
Comparative Performance Evaluation of Large Language Models for Extracting Molecular Interactions and Pathway Knowledge.
IF 1.4, Zone 4, Biology
Journal of Computational Biology Pub Date : 2025-05-19 DOI: 10.1089/cmb.2025.0078
Gilchan Park, Byung-Jun Yoon, Xihaier Luo, Vanessa López-Marrero, Shinjae Yoo, Shantenu Jha
Abstract: Understanding the interactions and regulatory relationships among biomolecules is essential for deciphering complex biological systems and elucidating the mechanisms behind diverse biological functions. Traditionally, the collection of such molecular interaction data has relied on expert curation, a process that is both time-consuming and labor-intensive. To address these limitations, this study explores the use of large language models (LLMs) to automate the genome-scale extraction of molecular interaction knowledge. We evaluate the performance of various LLMs on key biological tasks, including the identification of protein-protein interactions, detection of genes associated with pathways influenced by low-dose radiation, and inference of gene regulatory relationships. Our findings demonstrate that larger LLMs tend to perform better, particularly in extracting intricate gene and protein interactions. Despite their strengths, these models face challenges in recognizing functionally diverse gene groups and highly correlated regulatory relationships. Through a comprehensive analysis using established molecular interaction and pathway databases, we show that LLMs possess the potential to identify relevant biomolecules and predict their interactions, offering valuable insights and marking a significant step toward AI-driven biological knowledge discovery.
Citations: 0
Special Section: 12th International Computational Advances in Bio and Medical Sciences (ICCABS 2023).
IF 1.4, Zone 4, Biology
Journal of Computational Biology Pub Date : 2025-05-16 DOI: 10.1089/cmb.2025.0124
Mukul S Bansal, Wei Chen, Yury Khudyakov, Ion I Măndoiu, Marmar R Moussa, Murray Patterson, Sanguthevar Rajasekaran, Pavel Skums, Sharma V Thankachan, Alex Zelikovsky
Citations: 0
The 2nd International Workshop on Pattern Recognition in Healthcare Analytics 2023 Preface.
IF 1.4, Zone 4, Biology
Journal of Computational Biology Pub Date : 2025-05-13 DOI: 10.1089/cmb.2025.0117
Inci M Baytas
Citations: 0
Exploring the Influence of Gene Networks on Driver Gene Classification.
IF 1.4, Zone 4, Biology
Journal of Computational Biology Pub Date : 2025-05-13 DOI: 10.1089/cmb.2025.0043
Paulo Henrique Ribeiro, Jorge Francisco Cutigi, Rodrigo Henrique Ramos, Cynthia de Oliveira Lage Ferreira, Adriane Feijo Evangelista, Adenilso da Silva Simao
Abstract: Cancer is a complex disease caused by mutations in the genome of cells. Genetic mutations can be divided into driver mutations, which are significant for the initiation and progression of cancer, and passenger mutations, which have a neutral effect. In recent years, computational methods have been developed to identify driver genes. Some of these methods use data from gene networks to classify the genes. However, the impact of different gene networks on the performance of these methods remains unexplored. This article aims to analyze the influence of genetic networks in driver gene classification. We analyzed driver gene classification methods that use gene networks as input data, using different cancer mutation datasets and distinct gene networks. Computational methods show significant variation in their results when different gene networks are employed. The results highlight the need to carefully interpret driver gene classification and emphasize the importance of using different gene networks. These findings underline the necessity of developing more robust computational approaches that account for network variability, ensuring greater reliability in driver gene identification and its applications in cancer research.
Citations: 0
DrIVeNN: Drug Interaction Vectors Neural Network.
IF 1.4, Zone 4, Biology
Journal of Computational Biology Pub Date : 2025-05-12 DOI: 10.1089/cmb.2025.0079
Natalie Wang, Casey Overby Taylor
Abstract: Polypharmacy, the concurrent use of multiple drugs to treat a single condition, is common in patients managing multiple or complex conditions. However, as more drugs are added to the treatment plan, the risk of adverse drug events (ADEs) rises rapidly. Because it is impractical to test every possible drug combination during clinical trials, many serious polypharmacy ADEs (also known as drug-drug interactions or DDIs) only become known after the drugs are in use. This issue is prevalent among older adults with cardiovascular disease (CVD), where polypharmacy and ADEs are common. In this research, our primary objective was to identify key drug features and build and evaluate a model to predict DDIs. Our secondary objective was to assess our model on a domain-specific case study. We developed a two-layer neural network that incorporated drug features such as molecular structure, drug-protein interactions, and mono-drug side effects (drug interaction vectors neural network [DrIVeNN]) using publicly available side effect databases. It performed moderately better than state-of-the-art models such as DGNN-DDI, KGDDI, and NNPS. DrIVeNN had average area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC) scores of 0.934 and 0.920, respectively, compared to the best-performing baseline model, DGNN-DDI, which had scores of 0.919 and 0.904. We also conducted a domain-specific case study centered on CVD treatment, and there was a significant increase in performance from the general model. We observed an average AUROC for CVD DDI prediction of 0.979. This research contributes to the advancement of predictive modeling techniques for polypharmacy ADEs and indicates the strong potential of domain-specific models.
Citations: 0
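For readers unfamiliar with the AUROC figures quoted in the DrIVeNN abstract, the sketch below shows how the metric can be computed from its pairwise definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The function name `auroc` and the toy labels and scores are illustrative assumptions; they are not data or code from the paper.

```python
def auroc(labels, scores):
    """Pairwise AUROC: fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example (made-up values): 1 = interaction, 0 = no interaction.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
```

The quadratic loop is only meant to make the definition explicit; library implementations typically use a sort-based computation instead.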
Generating Heterogeneous Data on Gene Trees.
IF 1.4, Zone 4, Biology
Journal of Computational Biology Pub Date : 2025-05-09 DOI: 10.1089/cmb.2024.0843
Martí Cortada Garcia, Adrià Diéguez Moscardó, Marta Casanellas
Abstract: We introduce GenPhylo, a Python module that simulates nucleotide sequence data along a phylogeny avoiding the restriction of continuous-time Markov processes. GenPhylo directly uses a general Markov model and therefore naturally incorporates heterogeneity across lineages. We solve the challenge of generating transition matrices with a pre-given expected number of substitutions (the branch length information) by providing an algorithm that can be incorporated in other simulation software.
Citations: 0
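To make the link between a transition matrix and a branch length concrete, here is a minimal sketch using the log-det measure, l = -(1/4) log det M, a commonly used estimate of the expected number of substitutions per site for a 4x4 nucleotide transition matrix M. Whether GenPhylo uses exactly this measure is an assumption, and the sketch does not reproduce the GenPhylo algorithm; it only checks the formula on a Jukes-Cantor matrix, where the answer is known in closed form. All function names are hypothetical.

```python
import math


def jc69_transition_matrix(t):
    """Jukes-Cantor transition matrix for branch length t
    (t = expected substitutions per site)."""
    decay = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * decay   # probability of observing the same nucleotide
    diff = 0.25 - 0.25 * decay   # probability of each specific different nucleotide
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def logdet_branch_length(matrix):
    """Log-det branch length: -(1/4) * log(det(M)) for a 4x4 matrix M."""
    return -0.25 * math.log(determinant(matrix))


recovered = logdet_branch_length(jc69_transition_matrix(0.3))
```

For the Jukes-Cantor matrix the eigenvalues are 1 and exp(-4t/3) with multiplicity three, so det M = exp(-4t) and the measure recovers t. For a general Markov matrix with non-uniform base composition the measure is only an approximation of the expected number of substitutions.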