Integrating WGCNA and machine learning to distinguish active pulmonary tuberculosis from latent tuberculosis infection based on neutrophil extracellular trap-related genes.

IF 1.8 | Zone 4 (Medicine) | JCR Q3 | Infectious Diseases
Tao Wang, Tao Lu, Weili Lu, Jiahuan He, Zhiyu Wu, Ying Lei
Journal: Diagnostic Microbiology and Infectious Disease, vol. 113, no. 4, article 117053
Publication date: 2025-12-01 (Epub 2025-08-05)
Publication type: Journal Article
DOI: 10.1016/j.diagmicrobio.2025.117053 (https://doi.org/10.1016/j.diagmicrobio.2025.117053)
Citations: 0

Abstract

Background: Pulmonary tuberculosis (PTB) remains a major global public health challenge, and diagnostic delays are a key contributor to its high morbidity and mortality. Growing evidence suggests that neutrophil extracellular traps (NETs) are closely associated with PTB pathogenesis. This study aimed to elucidate the role of NETs in PTB and to identify diagnostic methods and potential biomarkers.

Methods: Weighted gene co-expression network analysis (WGCNA) was used to identify the three modules most strongly correlated with NETs. Differentially expressed genes (DEGs) from the GSE39939 dataset were intersected with the module genes to obtain NET-related DEGs. Four machine learning algorithms (LASSO, random forest, RFE, and Boruta) were applied to select feature genes and develop a PTB diagnostic model. Model performance was evaluated using support vector machine (SVM)-based receiver operating characteristic (ROC) and precision-recall (PR) curves, with validation in the GSE39940 dataset. The best-performing algorithm was then used to refine the feature genes and construct a miRNA-gene regulatory network.
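The ROC and PR evaluation named in the Methods can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code: the labels (1 = active PTB, 0 = LTBI) and the scores, which stand in for SVM decision values, are invented toy numbers, and the function names are hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    Equals the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative one; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise summary of the PR curve: mean precision at each positive hit.

    Assumes no tied scores; with ties the result depends on input order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    if hits == 0:
        raise ValueError("need at least one positive sample")
    return total / hits


# Toy data (assumption): 1 = active PTB, 0 = LTBI; scores are made up.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

print(f"ROC AUC: {roc_auc(toy_labels, toy_scores):.3f}")            # 8/9, about 0.889
print(f"Average precision: {average_precision(toy_labels, toy_scores):.3f}")  # 2.75/3, about 0.917
```

In practice these metrics would be computed on held-out samples, such as an independent validation cohort, rather than on the data used for feature selection.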

Results: ROC and PR curve analyses showed that the RFE and Boruta algorithms had superior diagnostic efficacy in distinguishing active PTB from latent TB infection (LTBI). Further analysis identified five high-ranking feature genes shared by the RFE and Boruta algorithms (GPR84, SIGLEC10, CCR2, TMEM167A, and GYG1). Five miRNAs (hsa-miR-1264, hsa-miR-664a-3p, hsa-miR-548e-5p, hsa-miR-4775, and hsa-miR-5056) were predicted to target these genes.
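As an illustration of how features shared by two selection methods might be extracted, the sketch below intersects the top-k entries of two ranked lists. The abstract does not state the cutoff or the actual rankings, so the value of k, the orderings, and the PLACEHOLDER_* names are all assumptions made for this example; only the five gene symbols come from the abstract.

```python
from typing import List, Sequence


def top_k_overlap(ranking_a: Sequence[str], ranking_b: Sequence[str], k: int) -> List[str]:
    """Return features in the top k of both rankings, in ranking_a order."""
    if k < 0:
        raise ValueError("k must be non-negative")
    top_b = set(ranking_b[:k])
    return [feature for feature in ranking_a[:k] if feature in top_b]


# Hypothetical rankings (best first). The orderings are invented for
# illustration and do not reproduce the study's actual output.
rfe_ranking = ["GPR84", "CCR2", "PLACEHOLDER_1", "SIGLEC10",
               "GYG1", "TMEM167A", "PLACEHOLDER_2"]
boruta_ranking = ["SIGLEC10", "GPR84", "TMEM167A", "PLACEHOLDER_3",
                  "CCR2", "GYG1", "PLACEHOLDER_1"]

shared = top_k_overlap(rfe_ranking, boruta_ranking, k=6)
print(shared)  # ['GPR84', 'CCR2', 'SIGLEC10', 'GYG1', 'TMEM167A']
```

Keeping the order of the first ranking makes the output deterministic, which a plain set intersection would not guarantee.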

Conclusion: The RFE algorithm achieves high diagnostic accuracy for PTB and identifies five potential biomarkers (GPR84, SIGLEC10, CCR2, TMEM167A, and GYG1). These findings may provide valuable tools for PTB diagnosis and treatment.

Source journal
CiteScore: 5.30
Self-citation rate: 3.40%
Articles published: 149
Review time: 56 days
Journal description: Diagnostic Microbiology and Infectious Disease keeps you informed of the latest developments in clinical microbiology and the diagnosis and treatment of infectious diseases. Packed with rigorously peer-reviewed articles and studies in bacteriology, immunology, immunoserology, infectious diseases, mycology, parasitology, and virology, the journal examines new procedures, unusual cases, controversial issues, and important new literature. The journal's distinguished independent editorial board, consisting of experts from many medical specialties, ensures extensive and authoritative coverage.