gdGSE: An algorithm to evaluate pathway enrichment by discretizing gene expression values
Authors: Jiangti Luo, Qiqi Lu, Mengjiao He, Xiaobo Zhang, Xiang Yang, Xiaosheng Wang
Journal: Computational and Structural Biotechnology Journal, volume 27, pages 1772-1783
Published: 2025-05-01
DOI: https://doi.org/10.1016/j.csbj.2025.04.038
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12127574/pdf/
We propose gdGSE, a novel computational framework for gene set enrichment analysis. Unlike conventional methods that rely on continuous gene expression values, gdGSE uses discretized gene expression profiles to assess pathway activity, which mitigates discrepancies caused by differences in data distributions. The algorithm consists of two steps: (1) applying statistical thresholds to binarize the gene expression matrix, and (2) converting the binarized gene expression matrix into a gene set enrichment matrix. Our results demonstrate that gdGSE robustly extracts biological insights from a diverse array of simulated and real gene expression datasets, both bulk and single-cell. Notably, gene set enrichment scores from gdGSE showed enhanced utility in downstream applications: (1) precise quantification of cancer stemness with significant prognostic relevance; (2) improved clustering performance in stratifying tumor subtypes with distinct prognoses; and (3) more accurate identification of cell types. Remarkably, the pathway activity scores from gdGSE showed >90% concordance with experimentally validated drug mechanisms in patient-derived xenografts and estrogen receptor-positive breast cancer cell lines. Our results suggest that discretizing gene expression values provides an alternative method for evaluating pathway enrichment, applicable to both bulk and single-cell data analysis.
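The two-step structure described in the abstract (binarize, then aggregate per gene set) can be illustrated with a minimal sketch. The abstract does not state which statistical thresholds gdGSE applies or how the binarized matrix is converted into enrichment scores, so both rules below are assumptions chosen for illustration only: a per-gene median cutoff, and a score equal to the fraction of a set's genes that are "on" in each sample. The function names, gene names, and gene sets are hypothetical and are not taken from the gdGSE software.

```python
from statistics import median


def binarize(expr):
    """Step 1 (assumed rule): mark a gene as 1 in a sample when its value
    exceeds that gene's median across all samples, otherwise 0.

    expr maps gene name -> list of expression values, one per sample.
    """
    binary = {}
    for gene, values in expr.items():
        cutoff = median(values)
        binary[gene] = [1 if v > cutoff else 0 for v in values]
    return binary


def enrichment_matrix(binary, gene_sets):
    """Step 2 (assumed rule): score each gene set in each sample as the
    fraction of its measured genes that are 1 in the binarized matrix.

    Genes absent from the expression data are ignored; a set with no
    measured genes scores 0.0 in every sample.
    """
    n_samples = len(next(iter(binary.values()), []))
    scores = {}
    for name, genes in gene_sets.items():
        present = [g for g in genes if g in binary]
        if not present:
            scores[name] = [0.0] * n_samples
            continue
        scores[name] = [
            sum(binary[g][j] for g in present) / len(present)
            for j in range(n_samples)
        ]
    return scores


# Toy data: three hypothetical genes measured in three samples.
expr = {
    "GENE_A": [1.0, 5.0, 9.0],
    "GENE_B": [2.0, 2.5, 8.0],
    "GENE_C": [7.0, 1.0, 0.5],
}
# Hypothetical gene sets; GENE_X is not measured and is skipped.
gene_sets = {"SET_1": ["GENE_A", "GENE_B"], "SET_2": ["GENE_C", "GENE_X"]}

binary = binarize(expr)
scores = enrichment_matrix(binary, gene_sets)
print(scores)
```

Because the scores depend only on whether each value falls above or below its gene's cutoff, any monotone rescaling of a gene's values (for example, a log transform) leaves the output of this sketch unchanged, which is one way to read the abstract's claim about reduced sensitivity to data distributions.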
About the journal:
Computational and Structural Biotechnology Journal (CSBJ) is an online gold open access journal publishing research articles and reviews after full peer review. All articles are published, without barriers to access, immediately upon acceptance. The journal places a strong emphasis on functional and mechanistic understanding of how molecular components in a biological process work together through the application of computational methods. Structural data may provide such insights, but they are not a pre-requisite for publication in the journal. Specific areas of interest include, but are not limited to:
Structure and function of proteins, nucleic acids and other macromolecules
Structure and function of multi-component complexes
Protein folding, processing and degradation
Enzymology
Computational and structural studies of plant systems
Microbial Informatics
Genomics
Proteomics
Metabolomics
Algorithms and Hypothesis in Bioinformatics
Mathematical and Theoretical Biology
Computational Chemistry and Drug Discovery
Microscopy and Molecular Imaging
Nanotechnology
Systems and Synthetic Biology