Identification of hub gene and immune infiltration in Lyme disease revealed by weighted gene co-expression network analysis and machine learning.
Authors: Yan Dong, Meng Liu, Yanshuang Luo, Yantong Chen, Xuesong Chen, Xiaorong Liu, Xingbo Cai, Fusong Yang, Chao Song, Guozhong Zhou
Journal: European Journal of Medical Research, 30(1):860 (IF 3.4; JCR Q2, Medicine, Research & Experimental)
Publication date: 2025-09-26
Publication type: Journal Article
DOI: https://doi.org/10.1186/s40001-025-03108-y
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12465215/pdf/
Citation count: 0
Abstract
Introduction: Lyme disease (LD), caused by the spirochete Borrelia burgdorferi (Bb), is a multisystem disorder with early symptoms such as erythema migrans and late manifestations including arthritis and neuroborreliosis. The molecular mechanisms driving tissue damage and inflammatory dysregulation in LD remain incompletely characterized. Given the central role of peripheral blood mononuclear cells (PBMCs) in orchestrating immune responses, we aimed to identify optimal feature genes (OFGs) within PBMCs associated with LD pathogenesis and delineate their immune infiltration patterns using integrated bioinformatics.
Methods: Transcriptomic datasets (GSE42606, GSE68765, GSE103481) were retrieved from GEO. Differential expression analysis identified LD-related genes. Weighted gene co-expression network analysis (WGCNA) screened disease-associated modules. Feature selection was performed via support vector machine recursive feature elimination (SVM-RFE), least absolute shrinkage and selection operator (LASSO) regression, and random forest (RF) to pinpoint OFGs. Immune cell infiltration was quantified using CIBERSORT, followed by correlation analysis between OFGs and immune cell subsets. Single-gene gene set enrichment analysis (GSEA) was performed to explore the functional associations of OFGs. Biological pathways linked to OFGs were inferred by single-sample GSEA (ssGSEA). Diagnostic utility was assessed via ROC curves and nomogram modeling. Finally, RT-qPCR was used to confirm the bioinformatics results.
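The three-method feature selection described in the Methods can be sketched as a consensus step. This is a minimal illustration that assumes the OFGs are taken as the overlap of the gene sets returned by SVM-RFE, LASSO, and RF; the abstract does not spell out the combination rule. The function name `consensus_features` and the gene labels are hypothetical placeholders, not data from the study.

```python
from typing import Iterable, List


def consensus_features(*selections: Iterable[str]) -> List[str]:
    """Return the genes chosen by every selector, sorted for stable output."""
    if not selections:
        return []
    gene_sets = [set(selection) for selection in selections]
    # Keep only genes present in all selector outputs.
    common = set.intersection(*gene_sets)
    return sorted(common)


# Placeholder labels standing in for each selector's output (not study data).
svm_rfe_genes = ["GENE_A", "GENE_B", "GENE_C"]
lasso_genes = ["GENE_B", "GENE_C", "GENE_D"]
rf_genes = ["GENE_C", "GENE_E"]

ofgs = consensus_features(svm_rfe_genes, lasso_genes, rf_genes)
print(ofgs)  # ['GENE_C']
```

Requiring agreement across all selectors is a conservative choice: it tends to yield few candidates, which is consistent with a single gene surviving, but other rules (for example, majority vote) are also plausible.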
Results: We identified 174 differentially expressed genes (DEGs) in LD patients, 156 of which fell within the WGCNA "turquoise" module, the module most strongly correlated with clinical characteristics. Among these, KIAA1199 emerged as the sole OFG selected by all three machine learning methods and showed exceptional diagnostic potential. Single-gene GSEA showed that KIAA1199 was strongly correlated with multiple immune-related pathways. Furthermore, RT-qPCR validated the expression of the candidate gene in a THP-1 cellular model.
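As an illustration of how the diagnostic potential of a single gene can be summarized with a ROC curve, the sketch below computes the area under the curve (AUC) from its rank interpretation: the probability that a randomly chosen patient has a higher expression value than a randomly chosen control, with ties counted as one half. The expression values are invented for the example, the function name `roc_auc` is hypothetical, and the assumption that expression is higher in patients is ours, not a statement from the abstract.

```python
from typing import Sequence


def roc_auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """AUC as P(pos > neg) + 0.5 * P(pos == neg) over all case/control pairs."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Made-up expression values for illustration only (not study data).
patient_expr = [7.9, 8.4, 6.8, 9.1]
control_expr = [6.2, 7.0, 6.5, 7.9]

auc = roc_auc(patient_expr, control_expr)
print(f"AUC = {auc:.5f}")  # AUC = 0.84375
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation; the pairwise loop is quadratic in sample size, which is acceptable for small cohorts.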
Conclusion: This study integrated WGCNA and machine learning to identify one core gene associated with LD from PBMC gene expression data: KIAA1199. The predictive model built on this gene demonstrated robust diagnostic accuracy, providing a basis for further research on host immune responses and the development of new diagnostic methods.
About the journal:
European Journal of Medical Research publishes translational and clinical research of international interest across all medical disciplines, enabling clinicians and other researchers to learn about developments and innovations within these disciplines and across the boundaries between disciplines. The journal publishes high quality research and reviews and aims to ensure that the results of all well-conducted research are published, regardless of their outcome.