Bioinformatics Advances: latest articles, page 4

iSEEtree: interactive explorer for hierarchical data.

IF 2.4

Bioinformatics advances Pub Date : 2025-05-06 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf107

Giulio Benedetti, Ely Seraidarian, Theotime Pralas, Akewak Jeba, Tuomas Borman, Leo Lahti

Motivation: Hierarchical data structures are prevalent across several research fields, as they represent an organized and efficient approach to study complex interconnected systems. Their significance is particularly evident in microbiome analysis, where microbial communities are classified at various taxonomic levels using phylogenetic trees. In light of this trend, the R/Bioconductor community has established a reproducible analytical framework for hierarchical data, which relies on the generic and optimized TreeSummarizedExperiment data container. However, this framework requires basic programming skills.

Results: To reduce the entry requirements, we developed iSEEtree, an R package, which provides a visual interface for the analysis and exploration of TreeSummarizedExperiment objects, thereby expanding the interactive graphics capabilities of related work to hierarchical structures. This way, users can interactively explore several aspects of their data without the need for an extensive knowledge of R programming. We describe how iSEEtree enables the exploration of hierarchical multi-table data and demonstrate its functionality with applications to microbiome analysis.

Availability and implementation: iSEEtree was implemented in the R programming language and is available on Bioconductor at https://bioconductor.org/packages/iSEEtree under an Artistic 2.0 license.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12095132/pdf/

Citations: 0

cfTools: an R/Bioconductor package for deconvolving cell-free DNA via methylation analysis.

IF 2.4

Bioinformatics advances Pub Date : 2025-05-06 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf108

Ran Hu, Shuo Li, Mary L Stackpole, Qingjiao Li, Xianghong Jasmine Zhou, Wenyuan Li

Motivation: Cell-free DNA (cfDNA) released by dying cells from damaged or diseased tissues can lead to elevated tissue-specific DNA, which is traceable and quantifiable through unique DNA methylation patterns. Therefore, tracing cfDNA origins by analyzing its methylation profiles holds great potential for detecting and monitoring a range of diseases, including cancers. However, deconvolving tissue-specific cfDNA remains challenging for broader applications and research due to the scarcity of specialized, user-friendly bioinformatics tools.

Results: To address this, we developed cfTools, an R package that streamlines cfDNA tissue-of-origin analysis for disease detection and monitoring. Integrating advanced cfDNA tissue deconvolution algorithms with R/Bioconductor compatibility, cfTools offers data preparation and analysis functions with flexible parameters for user-friendliness. By identifying abnormal cfDNA compositions, cfTools can infer the presence of underlying pathological conditions, including but not limited to cancer. It simplifies bioinformatics tasks and enables users without advanced expertise to easily derive biologically interpretable insights from standard preprocessed sequencing data, thus increasing its accessibility and broadening its application in cfDNA-based disease studies.

Availability and implementation: cfTools and its supplementary package cfToolsData are freely available at Bioconductor: https://bioconductor.org/packages/release/bioc/html/cfTools.html and https://bioconductor.org/packages/release/data/experiment/html/cfToolsData.html. The development version of cfTools is maintained on GitHub: https://github.com/jasminezhoulab/cfTools.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12124914/pdf/

Citations: 0

AnnSQL: a Python SQL-based package for fast large-scale single-cell genomics analysis using minimal computational resources.

IF 2.4

Bioinformatics advances Pub Date : 2025-05-05 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf105

Kenny Pavan, Arpiar Saunders

Summary: As single-cell genomics technologies continue to accelerate biological discovery, software tools that use elegant syntax and minimal computational resources to analyze atlas-scale datasets are increasingly needed. Here, we introduce AnnSQL, a Python package that constructs an AnnData-inspired database using the in-process DuckDB engine, enabling orders-of-magnitude performance enhancements for parsing single-cell genomics datasets with the ease of SQL. We highlight AnnSQL functionality and demonstrate transformative runtime improvements by comparing AnnData or AnnSQL operations on a 4.4 million cell single-nucleus RNA-seq dataset: AnnSQL-based operations were executed in minutes on a laptop for which equivalent operations in AnnData or Seurat largely failed (or were ∼700× slower) on a high-performance computing cluster. AnnSQL lowers computational barriers for large-scale single-cell/nucleus RNA-seq analysis on a personal computer, while demonstrating a promising computational infrastructure extendable for complete single-cell workflows across various genome-wide measurements.

Availability and implementation: AnnSQL is a pip installable package that can be found at https://github.com/ArpiarSaundersLab/annsql along with documentation at https://docs.annsql.com.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12098940/pdf/
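The general idea in the abstract above, expressing single-cell operations as SQL over a cells-by-genes table, can be illustrated with a minimal sketch. This is not the AnnSQL API and does not use DuckDB: it substitutes Python's built-in `sqlite3` module so the example is self-contained, and the table names (`X`, `obs`), gene columns, and values are hypothetical toy data chosen only to mirror an AnnData-like layout.

```python
import sqlite3

# Hypothetical toy data: an expression table X (one row per cell, one column
# per gene) and an annotation table obs (one row per cell).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE X (cell_id TEXT PRIMARY KEY, GeneA REAL, GeneB REAL)")
con.execute("CREATE TABLE obs (cell_id TEXT PRIMARY KEY, cell_type TEXT)")
con.executemany(
    "INSERT INTO X VALUES (?, ?, ?)",
    [("c1", 0.0, 5.0), ("c2", 2.0, 3.0), ("c3", 4.0, 0.0), ("c4", 6.0, 1.0)],
)
con.executemany(
    "INSERT INTO obs VALUES (?, ?)",
    [("c1", "neuron"), ("c2", "neuron"), ("c3", "glia"), ("c4", "glia")],
)

# Mean expression of each gene per cell type, computed inside the database
# engine rather than by loading the full matrix into Python objects.
rows = con.execute(
    """
    SELECT obs.cell_type, AVG(X.GeneA) AS mean_a, AVG(X.GeneB) AS mean_b
    FROM X JOIN obs USING (cell_id)
    GROUP BY obs.cell_type
    ORDER BY obs.cell_type
    """
).fetchall()
print(rows)
```

The grouping and averaging happen in the SQL engine, which is the property the abstract attributes to a database-backed design. Any performance behaviour of the real package depends on DuckDB and is not reproduced by this stand-in.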

Citations: 0

VaxGO: an interactive web tool for systems vaccinology data analysis.

IF 2.4

Bioinformatics advances Pub Date : 2025-05-02 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf101

Wasim Aluísio Prates-Syed, Aline Aparecida Lima, Nelson Cortes, Evelyn Carvalho, Jaqueline Dinis Queiroz Silva, Bárbara Hamaguchi, Ricardo Durães-Carvalho, Otávio Cabral-Marques, Thomas Hagan, José E Krieger, Gustavo Cabral-Miranda

Motivation: RNA sequencing is crucial for investigating transcriptional patterns in immunology and vaccine research. However, the analysis of RNA sequencing data often requires programming skills, which can limit accessibility for researchers lacking such expertise.

Results: We present VaxGO, an intuitive web-based tool designed to facilitate the analysis of differentially expressed genes in the context of immune processes and cells during vaccination. This tool integrates data from Gene Ontology, CellMarker 2.0, the MSigDB Vax collection, and other key studies, including transcriptional atlases of vaccines against COVID-19 and other diseases. VaxGO is an interactive, web-based tool, offering a user-friendly platform for exploring immune responses and vaccine efficacy without programming expertise.

Availability and implementation: The VaxGO tool is available at https://github.com/wapsyed/VaxGO.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12179387/pdf/

Citations: 0

GenePioneer: a comprehensive Python package for identification of essential genes and modules in cancer.

IF 2.4

Bioinformatics advances Pub Date : 2025-04-29 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf094

Amirhossein Haerianardakani, Golnaz Taheri

Citations: 0

Assessing accuracy and specificity of faecal source library for microbial source-tracking, using SourceTracker as case study.

IF 2.4

Bioinformatics advances Pub Date : 2025-04-29 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf103

Timothy J Y Lim, Yussi M Palacios Delgado, Anna Lintern, David T McCarthy, Rebekah Henry

Motivation: Understanding the quality of the source library prior to undertaking library-dependent microbial source-tracking (MST) is an essential, but often overlooked, primary analysis step.

Results: We propose an assessment approach to validate the quality of amplicon-derived faecal source libraries. This approach was demonstrated on a faecal source library consisting of 16S rRNA paired-end amplicon sequences, obtained from various animal types in Victoria, Australia. First, a leave-one-out (LOO) analysis was performed to assess the accuracy of source category groupings by identifying the number of samples incorrectly assigned to a different source category (i.e. animal type). Following a quality control procedure to decide retaining/removing/grouping incorrectly assigned samples, we then assessed if the sample sizes for each source type were sufficient to properly characterize the source fingerprints. Results from LOO demonstrated 15.5% of samples were incorrectly assigned, with high error rates in birds and wallabies within our source library. Increasing the sample size improved source identification accuracy. However, accuracy eventually plateaued in a source-specific manner. Importantly, this highlights the importance of conducting thorough assessments to understand the quality and limitations of the source library prior to library-dependent MST applications.

Availability and implementation: QIIME2 is available via https://qiime2.org/; SourceTracker v2.0.1 is available via https://github.com/caporaso-lab/sourcetracker2; Pipeline for LOO is available via https://github.com/MonashOWL/Bioinformatics-IlluminaMGI/tree/main/16S/LOO; Pipeline for sample size assessment is available via https://github.com/MonashOWL/Bioinformatics-IlluminaMGI/tree/main/16S/Source%20variability.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12092083/pdf/
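The leave-one-out step described in the abstract above can be sketched in a few lines. This is an illustration of the LOO bookkeeping only: it swaps in a simple nearest-centroid classifier in place of SourceTracker, and the two-feature "library" below is hypothetical toy data, not the study's 16S profiles.

```python
from collections import defaultdict


def centroid(vectors):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def leave_one_out(samples):
    """samples: list of (label, vector). Returns a list of (true, predicted)."""
    results = []
    for i, (label, vec) in enumerate(samples):
        # Rebuild the per-source centroids without the held-out sample.
        by_label = defaultdict(list)
        for j, (other_label, other_vec) in enumerate(samples):
            if j != i:
                by_label[other_label].append(other_vec)
        centroids = {lab: centroid(vs) for lab, vs in by_label.items()}
        # Assign the held-out sample to the closest remaining source centroid.
        predicted = min(centroids, key=lambda lab: sq_dist(vec, centroids[lab]))
        results.append((label, predicted))
    return results


# Hypothetical source library: one "bird" sample resembles the "cow" profile.
library = [
    ("cow", [0.9, 0.1]), ("cow", [0.8, 0.2]), ("cow", [0.85, 0.15]),
    ("bird", [0.1, 0.9]), ("bird", [0.2, 0.8]), ("bird", [0.7, 0.3]),
]
results = leave_one_out(library)
misassigned = [(true, pred) for true, pred in results if true != pred]
error_rate = len(misassigned) / len(library)
print(misassigned, round(error_rate, 3))
```

In this toy library the atypical bird sample is assigned to the cow category when held out, which is the kind of sample the abstract's quality-control step would then decide to retain, remove, or regroup.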

Citations: 0

bamSliceR: a Bioconductor package for rapid, cross-cohort variant and allelic bias analysis.

IF 2.4

Bioinformatics advances Pub Date : 2025-04-28 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf098

Yizhou Peter Huang, Lauren Harmon, Eve Deering-Gardner, Xiaotu Ma, Josiah Harsh, Zhaoyu Xue, Hong Wen, Marcel Ramos, Sean Davis, Timothy J Triche

Motivation: The National Cancer Institute Genomic Data Commons (GDC) provides controlled access to sequencing data from thousands of subjects, enabling large-scale study of impactful genetic alterations such as simple and complex germline and structural variants. However, efficient analysis requires significant computational resources and expertise, especially when calling variants from raw sequence reads. To solve these problems, we developed bamSliceR, an R/Bioconductor package that builds upon the GenomicDataCommons package to extract aligned sequence reads from cross-GDC meta-cohorts, followed by targeted analysis of variants and effects (including transcript-aware variant annotation from transcriptome-aligned GDC RNA data).

Results: Here, we demonstrate population-scale genomic and transcriptomic analyses with minimal compute burden using bamSliceR, identifying recurrent, clinically relevant sequence, and structural variants in the TARGET acute myeloid leukemia (AML) and BEAT-AML cohorts. We then validate results in the (non-GDC) Leucegene cohort, demonstrating how the bamSliceR pipeline can be seamlessly applied to replicate findings in non-GDC cohorts. These variants directly yield clinically impactful and biologically testable hypotheses for mechanistic investigation.

Availability and implementation: bamSliceR has been submitted to the Bioconductor project, where it is presently under review, and is available on GitHub at https://github.com/trichelab/bamSliceR.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12089696/pdf/

Citations: 0

Grape-Pi: graph-based neural networks for enhanced protein identification in proteomics pipelines.

IF 2.4

Bioinformatics advances Pub Date : 2025-04-26 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf095

Chunhui Gu, Seyyed Mahmood Ghasemi, Yining Cai, Johannes F Fahrmann, James P Long, Hiroyuki Katayama, Chong Wu, Jody Vykoukal, Jennifer B Dennison, Samir Hanash, Kim-Anh Do, Ehsan Irajizad

Motivation: Protein identification via mass spectrometry (MS) is the primary method for untargeted protein detection. However, the identification process is challenging due to data complexity and the need to control false discovery rates (FDR) of protein identification. To address these challenges, we developed a graph neural network (GNN)-based model, Graph Neural Network using Protein-Protein Interaction for Enhancing Protein Identification (Grape-Pi), which is applicable to all proteomics pipelines. This model leverages protein-protein interaction (PPI) data and employs two types of message-passing layers to integrate evidence from both the target protein and its interactors, thereby improving identification accuracy.

Results: Grape-Pi achieved significant improvements in area under receiver-operating characteristic curve (AUC) in differentiating present and absent proteins: 18% and 7% in two yeast samples and 9% in gastric samples over traditional methods in the test dataset. Additionally, proteins identified via Grape-Pi in gastric samples demonstrated a high correlation with mRNA data and identified gastric cancer proteins, like MAP4K4, missed by conventional methods.

Availability and implementation: Grape-Pi is freely available at https://zenodo.org/records/11310518 and https://github.com/FDUguchunhui/GrapePi.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12096076/pdf/
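The notion of integrating evidence from a target protein and its interactors, as mentioned in the abstract above, can be pictured with one round of neighbour averaging on a PPI graph. This is a deliberately simplified sketch and not the Grape-Pi model: real message-passing layers use learned weights, whereas the fixed mixing weight `alpha`, the function name `propagate`, and the protein scores and edges below are all hypothetical.

```python
def propagate(scores, edges, alpha=0.5):
    """One round of neighbour averaging on an undirected graph.

    scores: dict mapping protein -> raw identification score in [0, 1]
    edges:  iterable of (protein_a, protein_b) interaction pairs
    alpha:  hypothetical fixed weight given to the neighbours' mean score
    """
    neighbors = {p: [] for p in scores}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    updated = {}
    for protein, own in scores.items():
        nbrs = neighbors[protein]
        if nbrs:
            nbr_mean = sum(scores[n] for n in nbrs) / len(nbrs)
            updated[protein] = (1 - alpha) * own + alpha * nbr_mean
        else:
            # Proteins with no known interactors keep their own evidence.
            updated[protein] = own
    return updated


# Hypothetical scores: P3 has weak direct evidence but two confident interactors.
scores = {"P1": 0.9, "P2": 0.8, "P3": 0.2, "P4": 0.1}
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3")]
updated = propagate(scores, edges)
print(updated)
```

Here the weakly supported protein P3 is pulled upward by its well-supported interactors, while the isolated P4 is unchanged. A trained GNN would learn how much to trust each source of evidence instead of using a fixed `alpha`.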

Citations: 0

Identifying core genes in keloid and investigating immune infiltration and pan-cancer associations using eQTL and machine learning.

IF 2.4

Bioinformatics advances Pub Date : 2025-04-26 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf076

Xiaoyuan He, Yang Song

Motivation: Keloid is a fibroproliferative skin disorder characterized by excessive fibroblast proliferation and abnormal extracellular matrix accumulation. It manifests as continuous growth, redness, itching, and pain, with a high recurrence rate. The pathogenesis of keloid is complex, with genetics and gene mutations increasingly recognized as critical risk factors. This condition exhibits familial predisposition and clustering, with individuals of darker skin tones at greater risk. To elucidate the genetic factors underlying keloid development, this study integrates bioinformatics and Mendelian randomization (MR) approaches to identify core genes associated with keloid, providing novel insights into its pathogenesis, treatment, and prognosis.

Results: Bioinformatics and Mendelian randomization analyses identified two intersecting genes, CCND2 and KLF4, as core genes associated with keloid. MR analysis revealed that CCND2 is causally associated with keloid [inverse variance weighted (IVW) odds ratio (OR): 1.410; 95% confidence interval (CI): 1.001-1.985, P = .049], indicating it is a risk factor, while KLF4 is inversely associated with keloid (IVW OR: 0.492; 95% CI: 0.290-0.835, P = .009). Both intersecting genes exhibit a causal relationship with keloid, identifying them as two core genes. Specifically, CCND2 is recognized as a risk factor for keloid, while KLF4 functions as a protective factor against keloid formation. Validation analyses were conducted on these two core genes, revealing significant differences in KLF4 expression within the validation cohort.

Availability and implementation: Firstly, bioinformatics analysis identified differentially expressed genes (DEGs) from the keloid GEO datasets. Secondly, MR was applied to eQTL and keloid GWAS datasets to identify candidate genes. Overlapping genes were derived by intersecting DEGs with MR candidate genes. Causal relationships between overlapping genes and keloids were analyzed using five MR methods, identifying core genes significantly associated with keloid pathogenesis. Cochran's Q test and MR-Egger intercept analysis evaluated heterogeneity and pleiotropy in MR results. GO, KEGG, and GSEA enrichment analyses were conducted to explore core gene functions. Finally, validation and TCGA pan-cancer analyses were conducted on the core genes.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12145169/pdf/
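The odds ratios, confidence intervals, and P values quoted in the abstract above can be cross-checked with a short calculation. The sketch assumes, as a simplification not stated in the source, that each 95% CI is a symmetric Wald interval on the log-odds scale and that the P value is a two-sided normal test; the helper name `p_from_or_ci` is hypothetical.

```python
import math


def p_from_or_ci(or_est, ci_low, ci_high, z_crit=1.959964):
    """Recover the z statistic and two-sided P value from an OR and its 95% CI.

    Assumes a symmetric Wald interval on the log-odds scale (an assumption
    made for this sketch, not something the abstract states).
    """
    log_or = math.log(or_est)
    # The CI width on the log scale equals 2 * z_crit * standard error.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability: P = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


z_ccnd2, p_ccnd2 = p_from_or_ci(1.410, 1.001, 1.985)
z_klf4, p_klf4 = p_from_or_ci(0.492, 0.290, 0.835)
print(f"CCND2: z = {z_ccnd2:.2f}, P = {p_ccnd2:.3f}")
print(f"KLF4:  z = {z_klf4:.2f}, P = {p_klf4:.3f}")
```

Under these assumptions the recovered P values round to the reported .049 and .009, and the signs of the z statistics match the stated directions of effect (risk for CCND2, protective for KLF4).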

Citations: 0

PPIFold: a tool for analysis of protein-protein interaction from AlphaPullDown.

IF 2.4

Bioinformatics advances Pub Date : 2025-04-24 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf090

Quentin Rouger, Emmanuel Giudice, Damien F Meyer, Kévin Macé

Citations: 0