Identification of key genes and development of an identifying machine learning model for sepsis
Zhonghao Li, Shengsong Chen, Nan Gao, Jie Chen, Ying Qin, Guoqiang Zhang
Inflammation Research, 74(1):100. Published 2025-06-30. DOI: https://doi.org/10.1007/s00011-025-02068-7
Journal impact factor: 5.4 (JCR Q2, Cell Biology)
Citations: 0
Abstract
Objective and design: This study aimed to identify key genes in sepsis and to construct a sepsis identification model by integrating multi-organ single-cell RNA sequencing (scRNA-seq) with machine learning.
Material or subjects: Datasets downloaded from the Gene Expression Omnibus (GSE207363, GSE207651, GSE185263, GSE69063 and GSE134347) were used.
Methods: The scRNA-seq data from heart (GSE207363) and lung (GSE207651) tissues of septic mice were processed and analyzed using the Seurat package in R. Key genes were defined as those present in both heart and lung tissues, obtained from the overlap of three analyses together with differential expression analyses. We then used support vector machine recursive feature elimination (SVM-RFE) to construct a sepsis identification model based on these key genes. The GSE185263 dataset was used for training, while GSE69063 and GSE134347 were used for testing. The model's ability to identify sepsis was assessed by the area under the receiver operating characteristic curve (AUROC) on the test datasets.
Results: Thirteen genes were initially identified as key genes; after mapping to their human homologs, ten genes remained. The optimal SVM-RFE model incorporated eight of these genes (CAMP, CD74, HLA-DQA1, HLA-DQB1, HLA-DMA, HLA-DRB5, and LYZ). In the two test datasets, the model's AUROC for identifying sepsis was 0.904 and 0.924, respectively.
Conclusions: We have identified several key genes and developed a machine learning model for sepsis identification. Further studies are needed to validate our findings.
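The modeling step described in the Methods (SVM-RFE on a training set, then AUROC on held-out data) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it uses Python with scikit-learn rather than R, a linear-kernel SVM, cross-validated recursive feature elimination (`RFECV`), and synthetic data standing in for the ten candidate genes. The sample sizes, the train/test split, and all variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for expression data (assumption): rows are samples,
# columns are ten candidate genes. No real GEO dataset is loaded here.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=6, n_redundant=2,
    class_sep=1.5, random_state=0,
)
X_train, y_train = X[:200], y[:200]   # plays the role of the training cohort
X_test, y_test = X[200:], y[200:]     # plays the role of a test cohort

# Scale using training statistics only, so the test set stays unseen.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Recursive feature elimination around a linear SVM: at each step the
# feature with the smallest absolute weight is dropped, and cross-validated
# AUROC decides how many features to keep.
selector = RFECV(
    estimator=SVC(kernel="linear"),
    step=1,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="roc_auc",
    min_features_to_select=1,
)
selector.fit(X_train_s, y_train)

# Indices of the retained features (the "selected genes" in this toy setup).
selected = np.flatnonzero(selector.support_)

# AUROC on the held-out set, computed from the SVM decision scores.
test_scores = selector.decision_function(X_test_s)
auroc = roc_auc_score(y_test, test_scores)

print("selected feature indices:", selected.tolist())
print("held-out AUROC:", round(auroc, 3))
```

With real data, `X_train` would hold the expression of the candidate genes in the training cohort and each test cohort would be scored separately, yielding one AUROC per dataset as in the Results.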
About the journal:
Inflammation Research (IR) publishes peer-reviewed papers on all aspects of inflammation and related fields, including histopathology, immunological mechanisms, gene expression, mediators, experimental models, clinical investigations and the effect of drugs. Related fields are broadly defined and include, for instance, allergy and asthma, shock, pain, joint damage and skin disease, as well as clinical trials of relevant drugs.