Identification of immune-associated biomarkers of diabetes nephropathy tubulointerstitial injury based on machine learning: a bioinformatics multi-chip integrated analysis.

IF 4.0, Zone 3 (Biology), JCR Q1: Mathematical & Computational Biology
Lin Wang, Jiaming Su, Zhongjie Liu, Shaowei Ding, Yaotan Li, Baoluo Hou, Yuxin Hu, Zhaoxi Dong, Jingyi Tang, Hongfang Liu, Weijing Liu
Biodata Mining. Published 2024-07-01. DOI: 10.1186/s13040-024-00369-x (https://doi.org/10.1186/s13040-024-00369-x). Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11218417/pdf/
Citations: 0

Abstract

Background: Diabetic nephropathy (DN) is a major microvascular complication of diabetes and has become the leading cause of end-stage renal disease worldwide. Because the disease is difficult to diagnose early, a considerable number of DN patients progress irreversibly to end-stage renal disease. Reliable biomarkers that support early diagnosis and treatment are therefore needed. The migration of immune cells to the kidney is considered a key step in the progression of DN-related vascular injury, so markers of this process may be particularly useful for early diagnosis and for predicting the progression of DN.

Methods: Gene chip data were retrieved from the GEO database using the search term 'diabetic nephropathy'. The 'limma' software package was used to identify differentially expressed genes (DEGs) between DN and control samples. Gene set enrichment analysis (GSEA) was performed using gene sets obtained from the Molecular Signatures Database (MSigDB). The R package 'WGCNA' was used to identify gene modules associated with tubulointerstitial injury in DN, and these modules were intersected with immune-related DEGs to identify target genes. Gene Ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were performed on differentially expressed genes using the 'ClusterProfiler' software package in R. Three methods, least absolute shrinkage and selection operator (LASSO), support vector machine recursive feature elimination (SVM-RFE) and random forest (RF), were used to select immune-related biomarkers for diagnosis. A tubulointerstitial dataset was retrieved from the Nephroseq database to construct an external validation dataset. Unsupervised clustering analysis of the expression levels of the immune-related biomarkers was performed using the 'ConsensusClusterPlus' R package. Urine was collected from patients who visited Dongzhimen Hospital of Beijing University of Chinese Medicine from September 2021 to March 2023, and ELISA was used to detect the mRNA expression level of immune-related biomarkers in urine. Pearson correlation analysis was used to assess the relationship between immune-related biomarker expression and renal function in DN patients.
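The gene-filtering logic described in the Methods can be illustrated with plain set operations. The following is a minimal Python sketch, not the authors' R pipeline. The gene names are hypothetical placeholders, and the rule that a final biomarker must be selected by all three methods (LASSO, SVM-RFE, RF) is an assumption made here for illustration, since the abstract does not state how the three results were combined.

```python
# Illustrative sketch only: all gene lists below are hypothetical placeholders,
# not data from the study.

def intersect_all(*gene_sets):
    """Return the genes present in every input collection, sorted by name."""
    sets = [set(g) for g in gene_sets]
    if not sets:
        return []
    return sorted(set.intersection(*sets))

# Step 1: immune-related DEGs crossed with the WGCNA injury-associated module.
immune_degs = {"GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"}
wgcna_module = {"GENE_B", "GENE_C", "GENE_D", "GENE_E", "GENE_F"}
target_genes = intersect_all(immune_degs, wgcna_module)

# Step 2 (assumed combination rule): keep genes chosen by all three selectors.
lasso_pick = {"GENE_B", "GENE_C", "GENE_D"}
svm_rfe_pick = {"GENE_C", "GENE_D", "GENE_E"}
rf_pick = {"GENE_B", "GENE_C", "GENE_D", "GENE_E"}
candidates = intersect_all(lasso_pick, svm_rfe_pick, rf_pick)

print(target_genes)
print(candidates)
```

Requiring agreement across all selectors is a conservative choice: it trades sensitivity for a shorter, more stable candidate list. A union or a majority vote would be equally easy to express with the same helper.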

Results: Four microarray datasets from the GEO database were included in the analysis: GSE30122, GSE47185, GSE99340 and GSE104954. These datasets comprised 63 DN patients and 55 healthy controls. A total of 9415 genes were detected in the dataset. We found 153 differentially expressed immune-related genes, of which 112 were up-regulated and 41 were down-regulated, and 119 overlapping genes were identified as target genes. GO analysis showed that they were involved in various biological processes, including leukocyte-mediated immunity. KEGG analysis showed that these target genes were mainly involved in the formation of phagosomes in Staphylococcus aureus infection. Among the 119 overlapping genes, machine learning identified AGR2, CCR2, CEBPD, CISH, CX3CR1, DEFB1 and FSTL1 as potential tubulointerstitial immune-related biomarkers. External validation suggested that these markers showed diagnostic efficacy in distinguishing DN patients from healthy controls. In the clinical samples, urinary expression of AGR2, CX3CR1 and FSTL1 in DN patients was negatively correlated with glomerular filtration rate (GFR); urinary expression of CX3CR1 and FSTL1 was positively correlated with serum creatinine, whereas urinary expression of DEFB1 was negatively correlated with serum creatinine. In addition, urinary expression of CX3CR1 was positively correlated with proteinuria, whereas urinary expression of DEFB1 was negatively correlated with proteinuria. Finally, DN patients were divided by level of proteinuria into a nephrotic proteinuria group (n = 24) and a sub-nephrotic proteinuria group. Urinary AGR2, CCR2 and DEFB1 differed significantly between the two groups by unpaired t test (P < 0.05).
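The two statistics reported for the urine samples, Pearson correlation and the unpaired t test, can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions rather than the authors' analysis: the numbers are made up, the function names are hypothetical, and the pooled-variance (Student) form of the t test is assumed because the abstract does not say whether a pooled or Welch test was used.

```python
# Illustrative sketch only: the numbers below are made-up values,
# not measurements from the study.
from math import sqrt
from statistics import mean, variance

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def unpaired_t(a, b):
    """Student's unpaired t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df

# Hypothetical marker level versus a hypothetical renal function measure.
marker = [1.0, 2.0, 3.0, 4.0, 5.0]
renal_function = [90.0, 80.0, 70.0, 60.0, 50.0]
r = pearson_r(marker, renal_function)

# Hypothetical marker levels in two proteinuria groups.
group_high = [5.0, 6.0, 7.0]
group_low = [1.0, 2.0, 3.0]
t_stat, dof = unpaired_t(group_high, group_low)

print(r, t_stat, dof)
```

Converting the t statistic into a P value requires the t distribution, which the standard library does not provide; in practice a statistics package such as SciPy (`scipy.stats.pearsonr`, `scipy.stats.ttest_ind`) would normally be used for both steps.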

Conclusions: Our study provides new insights into the role of immune-related biomarkers in DN tubulointerstitial injury and identifies potential targets for the early diagnosis and treatment of DN patients. Seven genes (AGR2, CCR2, CEBPD, CISH, CX3CR1, DEFB1, FSTL1) are promising sensitive biomarkers and may affect the progression of DN by regulating the immune-inflammatory response. However, further comprehensive studies are needed to fully understand their exact molecular mechanisms and functional pathways in DN.

Source journal: Biodata Mining (Mathematical & Computational Biology)
CiteScore: 7.90
Self-citation rate: 0.00%
Articles published: 28
Review time: 23 weeks
About the journal: BioData Mining is an open access, open peer-reviewed journal encompassing research on all aspects of data mining applied to high-dimensional biological and biomedical data, focusing on computational aspects of knowledge discovery from large-scale genetic, transcriptomic, genomic, proteomic, and metabolomic data. Topical areas include, but are not limited to:
- Development, evaluation, and application of novel data mining and machine learning algorithms.
- Adaptation, evaluation, and application of traditional data mining and machine learning algorithms.
- Open-source software for the application of data mining and machine learning algorithms.
- Design, development and integration of databases, software and web services for the storage, management, retrieval, and analysis of data from large-scale studies.
- Pre-processing, post-processing, modeling, and interpretation of data mining and machine learning results for biological interpretation and knowledge discovery.