PAGE-based transfer learning from single-cell to bulk sequencing enhances model generalization for sepsis diagnosis.
Authors: Nana Jin, Chuanchuan Nan, Wanyang Li, Peijing Lin, Yu Xin, Jun Wang, Yuelong Chen, Yuanhao Wang, Kaijiang Yu, Changsong Wang, Chunbo Chen, Qingshan Geng, Lixin Cheng
Journal: Briefings in Bioinformatics, 26(1). Published 2024-11-22. Publication type: Journal Article.
DOI: 10.1093/bib/bbae661 (https://doi.org/10.1093/bib/bbae661)
Impact factor: 6.8. JCR: Q1 (Biochemical Research Methods). Category: Biology, Region 2.
Sepsis is a dangerous bodily response triggered by infection. Transcriptional expression patterns of the host response aid in the diagnosis of sepsis, but models built on them generalize poorly. To facilitate sepsis diagnosis, we present scPAGE2, an updated version of single-cell Pair-wise Analysis of Gene Expression (scPAGE) that uses transfer learning to fuse single-cell and bulk transcriptome data. Compared with scPAGE, scPAGE2 uses refined Differentially Expressed Gene Pairs (DEPs) to pretrain a model on single-cell transcriptome data and then retrains it on bulk transcriptome data to construct a sepsis diagnostic model, effectively transferring cell-level information from the single-cell to the bulk transcriptome. Seven datasets across three transcriptome platforms, together with fluorescence-activated cell sorting (FACS), were used for performance validation. The model involves four DEPs and performs robustly across next-generation sequencing and microarray platforms, surpassing state-of-the-art models with an average AUROC of 0.947 and an average AUPRC of 0.987. Analysis of scRNA-seq data reveals a higher proportion of cells with JAM3-PIK3AP1 expression in sepsis monocytes and decreased ARG1-CCR7 in B and T cells. Elevated IRF6-HP in sepsis monocytes was confirmed by both scRNA-seq and an independent cohort using FACS. Both the superior performance of the model and the in vitro validation of IRF6-HP in monocytes underscore that scPAGE2 is effective and robust for constructing a sepsis diagnostic model. We additionally applied scPAGE2 to acute myeloid leukemia and demonstrated its superior classification performance. Overall, we provide a strategy for improving the generalizability of classification models that can be adapted to a broad range of clinical prediction scenarios.
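To make the gene-pair idea concrete, here is a minimal sketch and not the authors' implementation. It assumes that a gene pair is scored by the within-sample ordering of its two genes (1 if the first is expressed above the second), which the abstract does not spell out. The gene names, toy expression values, the equal-weight score, and the helper names `pair_features`, `pair_score`, and `auroc` are all hypothetical. The pretraining and retraining steps of scPAGE2 are not reproduced here.

```python
from typing import Dict, List, Tuple

# Hypothetical gene pairs; placeholders, not the DEPs reported in the paper.
PAIRS: List[Tuple[str, str]] = [("GENE_A", "GENE_B"), ("GENE_C", "GENE_D")]


def pair_features(sample: Dict[str, float],
                  pairs: List[Tuple[str, str]] = PAIRS) -> List[int]:
    """Return 1 for each pair whose first gene exceeds the second, else 0."""
    return [1 if sample[a] > sample[b] else 0 for a, b in pairs]


def pair_score(sample: Dict[str, float],
               pairs: List[Tuple[str, str]] = PAIRS) -> float:
    """Assumed scoring rule: fraction of pairs in the 'first > second' state."""
    feats = pair_features(sample, pairs)
    return sum(feats) / len(feats)


def auroc(scores: List[float], labels: List[int]) -> float:
    """AUROC as the probability that a positive outranks a negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy expression values (arbitrary units), invented for illustration only.
toy_samples = [
    {"GENE_A": 5.0, "GENE_B": 2.0, "GENE_C": 4.0, "GENE_D": 1.0},  # case
    {"GENE_A": 3.0, "GENE_B": 1.0, "GENE_C": 1.0, "GENE_D": 2.0},  # case
    {"GENE_A": 1.0, "GENE_B": 4.0, "GENE_C": 2.0, "GENE_D": 3.0},  # control
    {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 0.0, "GENE_D": 5.0},  # control
]
toy_labels = [1, 1, 0, 0]

scores = [pair_score(s) for s in toy_samples]
print(scores)                      # [1.0, 0.5, 0.0, 0.5]
print(auroc(scores, toy_labels))   # 0.875
```

Because each feature depends only on the ordering of two genes within one sample, it is unchanged by any increasing rescaling of that sample's values. That is one plausible reason pair-based features carry across sequencing and microarray platforms, though the abstract itself does not give this explanation.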
About the journal:
Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data.
The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.