Briefings in bioinformatics: latest articles, page 5

Benchmarking large language models for genomic knowledge with GeneTuring.

IF 7.7, Zone 2 (Biology)

Briefings in bioinformatics Pub Date : 2025-08-31 DOI: 10.1093/bib/bbaf492

Xinyi Shang, Xu Liao, Zhicheng Ji, Wenpin Hou

Citations: 0

DA-HGL: a domain-augmented heterogeneous graph learning framework for protein function prediction.

IF 7.7, Zone 2 (Biology)

Briefings in bioinformatics Pub Date : 2025-08-31 DOI: 10.1093/bib/bbaf511

Sai Hu, Wei Zhang, Bihai Zhao

Abstract: Accurate protein function prediction is critical for deciphering disease mechanisms and advancing precision medicine, yet remains challenging for proteins with sparse annotations. Traditional methods struggle with annotation sparsity and fail to integrate multimodal data holistically. We propose DA-HGL, a heterogeneous graph learning framework that integrates protein sequences, domain architectures, and Gene Ontology (GO) hierarchies through a multilayered graph and non-negative matrix factorization with dual biological constraints. DA-HGL uniquely models domain-function coherence, GO semantic consistency, and topological congruence. Evaluated on yeast and human proteomes, DA-HGL achieves Fmax gains of 9.0% (yeast CC) and 17.2% (human BP) over state-of-the-art methods. By dynamically learning domain-context associations and resolving annotation sparsity, DA-HGL excels in cold-start scenarios and disease-specific predictions (e.g. Parkinson's "ubiquitin-dependent catabolism"). This framework offers a robust tool for accelerating functional genomics and precision medicine. Code/data: https://github.com/husaiccsu/DA-HGL.

Volume 26, issue 5. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12476837/pdf/

Citations: 0
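The DA-HGL abstract above reports its gains as Fmax, the maximum protein-centric F-measure over decision thresholds. As a rough illustration of how such a score is commonly computed (CAFA-style averaging is assumed here; the abstract does not state the exact protocol), the sketch below uses toy data. The function name `fmax`, the GO identifiers, and the scores are all hypothetical.

```python
def fmax(y_true, y_score, thresholds=None):
    """Protein-centric Fmax, assuming CAFA-style averaging.

    y_true:  list of sets of true GO terms, one set per protein
    y_score: list of dicts {term: score in [0, 1]}, one dict per protein
    """
    if thresholds is None:
        thresholds = [t / 100 for t in range(1, 101)]
    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for truth, scores in zip(y_true, y_score):
            if not truth:
                continue  # proteins without annotations are skipped
            predicted = {term for term, s in scores.items() if s >= t}
            hits = len(predicted & truth)
            recalls.append(hits / len(truth))  # recall averaged over all proteins
            if predicted:
                # precision averaged only over proteins with a prediction at t
                precisions.append(hits / len(predicted))
        if not precisions:
            continue
        p = sum(precisions) / len(precisions)
        r = sum(recalls) / len(recalls)
        if p + r > 0:
            best = max(best, 2 * p * r / (p + r))
    return best


# Toy example with made-up GO identifiers and scores.
toy_truth = [{"GO:1", "GO:2"}, {"GO:3"}]
toy_scores = [
    {"GO:1": 0.9, "GO:2": 0.4, "GO:4": 0.6},
    {"GO:3": 0.8, "GO:1": 0.3},
]
toy_fmax = fmax(toy_truth, toy_scores)
```

Because the threshold is chosen to maximize the score, Fmax is an optimistic summary; it is most useful for comparing methods evaluated under the same protocol.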

TELLBASE: a novel tool of TELL-seq barcode-assisted scaffold assembler for bacterial genomes.

IF 7.7, Zone 2 (Biology)

Briefings in bioinformatics Pub Date : 2025-08-31 DOI: 10.1093/bib/bbaf504

Yutong Li, Tianlong Kuang, Tao Xu, Hanxiao Du, Yi Zhang, Yu Qian, Yiwen Chen, Zhenxian Xiao, Chen Chen, Jing Wu, Wen-Hong Zhang, Chenqi Lu, Ning Jiang

Abstract: Transposase enzyme linked long-read sequencing (TELL-seq) technology generates barcode-linked reads, facilitating whole-genome sequencing (WGS) and complete assembly with improved accuracy and reduced costs. Unlike mate-pair sequencing technology, TELL-seq employs a near-full-sequence tagging strategy that allows more efficient capture of comprehensive genomic information. However, assembly algorithms and software capable of fully leveraging the characteristics of TELL-seq technology to effectively assemble genomic sequences at the megabase scale are lacking, particularly for bacteria and their plasmids. In this study, we present TELL-seq barcode-assisted scaffold assembler (TELLBASE), a de novo genome assembler designed specifically for assembling bacterial genomes using TELL-seq-derived linked reads. In assembly tests involving bacteria such as Acinetobacter baumannii, Klebsiella pneumoniae, Mycobacterium tuberculosis, and Staphylococcus aureus, TELLBASE exhibited exceptional efficacy in producing chromosome-level bacterial genomic sequences and successfully identifying plasmids present in the sequenced strains. Comparative analysis revealed that TELLBASE significantly outperforms existing assemblers tailored for TELL-seq-derived linked reads, such as TuringAssembler and Ariadne, in terms of the completeness and accuracy of the assembled genomes. Therefore, TELLBASE shows promising potential for refining draft bacterial genomes and further applications in related fields. The package for TELLBASE is freely available on GitHub (https://github.com/sosie1/TELLBASE).

Volume 26, issue 5. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12476840/pdf/

Citations: 0

MCAMEF-BERT: an efficient deep learning method for RNA N7-methylguanosine site prediction via multi-branch feature integration.

IF 7.7, Zone 2 (Biology)

Briefings in bioinformatics Pub Date : 2025-08-31 DOI: 10.1093/bib/bbaf447

Junlei Yu, Wenjia Gao, Siqi Chen, Ronglin Lu, Jianbo Qiao, Junru Jin, Leyi Wei, Hua Shi, Zilong Zhang, Feifei Cui, Xinbo Jiang, Zhongmin Yan

Abstract: Accurate identification of N7-methylguanosine (m7G) modification sites plays a critical role in uncovering the regulatory mechanisms of various biological processes, including human development, tumor initiation, and progression. However, existing prediction methods still suffer from limited representational power, redundant feature fusion, insufficient utilization of biological prior knowledge, and poor interpretability. In this study, we propose a novel deep learning model named MCAMEF-BERT. This model adopts a parallel architecture that integrates both a DNABERT-2-based pretrained model branch and multiple traditional feature encoding branches, enabling comprehensive multi-perspective sequence feature extraction. To address the redundancy issue in feature fusion, we introduce a multi-channel attention module. Our model demonstrates superior accuracy and effectiveness on datasets from m7GHub, outperforming other state-of-the-art classifiers. Furthermore, we validate the interpretability of MCAMEF-BERT through in silico saturation mutagenesis experiments, and confirm its robustness in motif recognition. Moreover, its generalization capability is validated across diverse RNA modification site prediction tasks.

Volume 26, issue 5. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12400811/pdf/

Citations: 0
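The MCAMEF-BERT abstract above mentions in silico saturation mutagenesis as its interpretability check. A minimal sketch of the general procedure follows: substitute every position with every alternative base and record how the model score changes. `toy_score` is a hypothetical stand-in for a trained predictor, not the MCAMEF-BERT model, and the sequence is made up.

```python
def saturation_mutagenesis(seq, score_fn, alphabet="ACGU"):
    """Score every single-nucleotide substitution of `seq`.

    Returns a list of (position, reference_base, alternate_base, delta),
    where delta = score(mutant) - score(reference).
    """
    reference_score = score_fn(seq)
    effects = []
    for pos, ref in enumerate(seq):
        for alt in alphabet:
            if alt == ref:
                continue  # only true substitutions are scored
            mutant = seq[:pos] + alt + seq[pos + 1:]
            effects.append((pos, ref, alt, score_fn(mutant) - reference_score))
    return effects


def toy_score(seq):
    """Hypothetical model: returns 1.0 when the centre base is G, else 0.0."""
    return 1.0 if seq[len(seq) // 2] == "G" else 0.0


# Made-up 5-nt window with the candidate site at the centre.
toy_effects = saturation_mutagenesis("AUGCA", toy_score)
```

Positions whose substitutions produce large negative deltas are the ones the model relies on, which is how such scans are typically read as motif evidence.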

GCNMF-SDA: predicting snoRNA-disease associations based on graph convolution and non-negative matrix factorization.

IF 7.7, Zone 2 (Biology)

Briefings in bioinformatics Pub Date : 2025-08-31 DOI: 10.1093/bib/bbaf453

Yaowu Zhang, Xiu Jin, Xiaodan Zhang

Abstract: Small nucleolar RNAs (snoRNAs) play crucial roles in a wide range of biological processes, and studying their association with diseases can enhance our understanding of disease pathogenesis. Nevertheless, current knowledge of these associations is limited, and traditional biological experiments are both costly and time-consuming. Consequently, developing efficient computational methods is essential for predicting potential snoRNA-disease associations. We propose a novel prediction method based on non-negative matrix factorization and graph convolution for predicting snoRNA-disease associations (GCNMF-SDA). First, five different types of similarity information from snoRNA and disease entities are introduced to fully mine and refine the feature information. Then the snoRNA and disease similarity networks are integrated using the nonlinear Similarity Network Fusion (SNF) approach, while the weighted K nearest known neighbors (WKNKN) algorithm is applied to optimize the snoRNA-disease association matrix. Following this, the graph convolution module and the non-negative matrix factorization module extract disease features and snoRNA features, respectively. These features are then combined into a composite feature vector for each snoRNA-disease pair. Finally, the composite feature vectors, along with their corresponding labels, are input into a multilayer perceptron for training. Our experiments, conducted using five-fold cross-validation, show that the GCNMF-SDA model achieves an area under the receiver operating characteristic curve (AUC-ROC) of 0.9659 and an area under the precision-recall curve (AUC-PR) of 0.9522. Furthermore, most of the novel associations identified by GCNMF-SDA were validated through case studies, underscoring the method's reliability in predicting potential relationships between snoRNAs and diseases.

Volume 26, issue 5. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12409419/pdf/

Citations: 0
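The GCNMF-SDA abstract above names non-negative matrix factorization as one of its feature extractors. As an illustration of that building block only, and not the authors' implementation, the sketch below applies the classic Lee-Seung multiplicative updates to a toy binary association matrix. The matrix values, the rank, and all helper names are assumptions made for the example.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W @ H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factorize non-negative V (n x m) as W (n x rank) @ H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Toy 4 x 3 association matrix (rows: snoRNAs, columns: diseases); values are made up.
toy_v = [[1, 0, 1], [1, 0, 0], [0, 1, 0], [0, 1, 1]]
w_init, h_init = nmf(toy_v, rank=2, n_iter=0)
w_fit, h_fit = nmf(toy_v, rank=2, n_iter=200)
```

In a pipeline like the one described, the rows of `w_fit` would serve as low-dimensional features for each row entity; how GCNMF-SDA constrains or regularizes its factorization is not specified in the abstract.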

stImage: a versatile framework for optimizing spatial transcriptomic analysis through customizable deep histology and location informed integration.

IF 7.7, Zone 2 (Biology)

Briefings in bioinformatics Pub Date : 2025-08-31 DOI: 10.1093/bib/bbaf429

Yu Wang, Haichun Yang, Ruining Deng, Yuankai Huo, Qi Liu, Yu Shyr, Shilin Zhao

Abstract: Spatial transcriptomics (ST) integrates gene expression data with the spatial organization of cells and their associated histology, offering unprecedented insights into tissue biology. While existing methods incorporate either location-based or histology-informed information, none fully synergize gene expression, histological features, and precise spatial coordinates within a unified framework. Moreover, these methods often exhibit inconsistent performance across diverse datasets and conditions. Here, we introduce stImage, an open-source R package that provides a comprehensive and flexible solution for ST analysis. By generating deep learning-derived histology features and offering 54 integrative strategies, stImage seamlessly combines transcriptional profiles, histology images, and spatial information. We demonstrate stImage's effectiveness across multiple datasets, underscoring its ability to guide users toward the most suitable integration strategy using a diagnostic graph. Our results highlight how stImage can optimize ST, consistently improving biological insights and advancing our understanding of tissue architecture. stImage is freely available at https://github.com/YuWang-VUMC/stImage.

Volume 26, issue 5. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12409783/pdf/

Citations: 0

Graph-based deep learning for integrating single-cell and bulk transcriptomic data to identify clinical cancer subtypes.

IF 7.7, Zone 2 (Biology)

Briefings in bioinformatics Pub Date : 2025-08-31 DOI: 10.1093/bib/bbaf467

Yixin Liu, Dandan Zhang, Tianyu Liu, Ao Wang, Guohua Wang, Yuming Zhao

Abstract: The integration of single-cell RNA sequencing (scRNA-seq) and bulk transcriptomic data has become essential for deciphering the complex heterogeneity of cancer and identifying clinical cancer subtypes. However, the inherent challenges posed by the high dimensionality, sparsity, and noise characteristics of scRNA-seq data have significantly hindered its widespread clinical translation. To address these limitations, we introduce single-cell and bulk transcriptomic graph deep learning (scBGDL), a graph-based deep learning method that synergistically integrates scRNA-seq and bulk transcriptomic data to precisely identify cancer subtypes and predict clinical outcomes. scBGDL constructs sample-specific gene graphs modeling complex gene-gene interactions and cellular relationships. The architecture employs Graph Attention Networks for feature aggregation, MinCutPool layers for dimensionality reduction, and Transformer modules to capture high-order biological dependencies. Independently validated in each of 16 distinct The Cancer Genome Atlas cancer types, scBGDL significantly outperformed existing methods in prognostic accuracy (mean C-index: 0.7060 versus 0.6709 for the best competitor), demonstrating robustness and generalizability to diverse transcriptional architectures. To demonstrate clinical versatility, we further evaluated scBGDL in three therapeutic contexts using multicenter cohorts: lung adenocarcinoma survival prediction (n = 1099), epithelial ovarian cancer platinum-based chemotherapy response (n = 762), and skin cutaneous melanoma immunotherapy outcome (n = 305). scBGDL consistently delivered robust risk stratification (log-rank P < 0.05 across cohorts), identified key driver edges, and uncovered clinically relevant biological interpretations. By enabling multimodal data integration and interpretable biological insights, scBGDL advances precision oncology for prognosis prediction, therapy optimization, and biomarker discovery. The source code for the scBGDL model is available online (https://github.com/NEFLab/scBGDL).

Volume 26, issue 5. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12423395/pdf/

Citations: 0
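The scBGDL abstract above compares methods by mean C-index. For readers unfamiliar with the metric, here is a minimal sketch of Harrell's concordance index for right-censored data. It is assumed, not confirmed by the abstract, that this is the variant the authors report; the function name and the toy cohort are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times:  observed follow-up times
    events: 1 if the event was observed, 0 if the subject was censored
    risks:  predicted risk scores; higher means shorter expected survival
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort of four patients; the third is censored. Values are made up.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.3, 0.5, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the reported 0.7060 versus 0.6709.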

A comprehensive comparison on clustering methods for multi-slice spatially resolved transcriptomics data analysis.

IF 7.7, Zone 2 (Biology)

Briefings in bioinformatics Pub Date : 2025-08-31 DOI: 10.1093/bib/bbaf471

Caiwei Xiong, Shuai Huang, Muqing Zhou, Yiyan Zhang, Wenrong Wu, Xihao Li, Huaxiu Yao, Jiawen Chen, Yun Li

Abstract: Spatial transcriptomics (ST) data, by providing spatial information, enable simultaneous analysis of gene expression distributions and their spatial patterns within tissue. Clustering or spatial domain detection represents an essential methodology for ST data, facilitating the exploration of spatial organizations with shared gene expression or histological characteristics. Traditionally, clustering algorithms for ST have focused on individual tissue sections. However, the emergence of numerous contiguous tissue sections derived from the same or similar tissue specimens within or across individuals has led to the development of multi-slice clustering methods. In this study, we assess seven single-slice and four multi-slice clustering methods on two simulated datasets and four real datasets. Additionally, we investigate the effectiveness of preprocessing techniques, including spatial coordinate alignment (e.g. PASTE) and gene expression batch effect removal (e.g. Harmony), on clustering performance. Our study provides a comprehensive comparison of clustering methods for multi-slice ST data, serving as a practical guide for method selection in various scenarios.

Volume 26, issue 5. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12449087/pdf/

Citations: 0

Artificial intelligence for comprehensive DNA methylation analysis: overview, challenges, and future directions.

IF 7.7, Zone 2 (Biology)

Briefings in bioinformatics Pub Date : 2025-08-31 DOI: 10.1093/bib/bbaf468

Aymane Aghziel, Mohamed Adnane Mahraz, Hamid Tairi, Noura Aherrahrou

Citations: 0

IMATAC imputes single-cell ATAC-seq data by deep hierarchical network with denoising autoencoder.

IF 7.7, Zone 2 (Biology)

Briefings in bioinformatics Pub Date : 2025-08-31 DOI: 10.1093/bib/bbaf515

Yao Li, Hongqiang Lyu, Kexin Li, Yuan Liu, Xinman Zhang, Ze Liu, Pengcheng Jing, Peng Han

Abstract: Single-cell ATAC-seq (scATAC-seq) technology allows the interrogation of chromatin accessibility of individual cells. Dropout events occur when the sequencing signals at some bona fide chromatin sites of individual cells are not captured, and these dropouts in scATAC-seq data inevitably hinder downstream analysis. It remains a challenge to impute scATAC-seq data due to its high dimensionality, sparsity, and near-binary nature. Herein, we propose IMATAC, a deep hierarchical network with a denoising autoencoder for imputing scATAC-seq data in peak-by-cell form. The network embeds scATAC-seq data into a latent space by a deep hierarchical architecture at two different levels, a bottom level for local details and a top level for global information, which helps to characterize the high-dimensional sparse scATAC-seq data. In addition, the network learns to reconstruct the original scATAC-seq data from an artificially corrupted version through a denoising autoencoder, so as to acquire the ability to recover missing values, relying primarily on cells from the same population with the help of a parallel multi-classifier. Using simulated and experimental data, the performance of IMATAC is demonstrated by a comparative analysis with competing methods. The results suggest that our method can achieve lower imputation errors and benefit downstream analysis, including heterogeneous clustering, differential analysis, and regulatory element discovery. In addition, the contributions of several important network modules in IMATAC are investigated, and how well it can separate dropout zeros from biological zeros is discussed.

Volume 26, issue 5. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12478030/pdf/

Citations: 0
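The IMATAC abstract above describes training a denoising autoencoder to reconstruct the original data from an artificially corrupted version. The abstract does not say how the corruption is generated, so the sketch below shows one plausible scheme, assumed for illustration only: randomly zeroing a fraction of the accessible (1) entries of a binary peak-by-cell matrix to mimic dropouts. The function name `corrupt`, the drop rate, and the toy matrix are hypothetical.

```python
import random


def corrupt(matrix, drop_rate=0.3, seed=0):
    """Randomly zero out a fraction of the 1-entries of a binary matrix.

    Returns (corrupted, mask), where mask[i][j] is True for entries that
    were dropped. Zeros in the input are never changed.
    """
    rng = random.Random(seed)
    corrupted, mask = [], []
    for row in matrix:
        new_row, mask_row = [], []
        for value in row:
            # Only accessible sites can be dropped, simulating missed signal.
            dropped = value == 1 and rng.random() < drop_rate
            new_row.append(0 if dropped else value)
            mask_row.append(dropped)
        corrupted.append(new_row)
        mask.append(mask_row)
    return corrupted, mask


# Toy 3 x 4 peak-by-cell matrix; values are made up.
toy_peaks = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 1]]
toy_noisy, toy_mask = corrupt(toy_peaks, drop_rate=0.5, seed=1)
```

An autoencoder would then be trained with `toy_noisy` as input and `toy_peaks` as the target, and the mask lets the reconstruction error be measured on exactly the entries that were hidden.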