Bioinformatic Insights and XGBoost Identify Shared Genetics in Chronic Obstructive Pulmonary Disease and Type 2 Diabetes

Impact factor 2.3; Zone 4 (Medicine); JCR Q3 (Respiratory System)
Qianqian Ji, Yaxian Meng, Xiaojie Han, Chao Yi, Xiaoliang Chen, Yiqiang Zhan
Clinical Respiratory Journal, Volume 19, Issue 3. Published 2025-03-05. DOI: 10.1111/crj.70057
https://onlinelibrary.wiley.com/doi/10.1111/crj.70057

Bioinformatic Insights and XGBoost Identify Shared Genetics in Chronic Obstructive Pulmonary Disease and Type 2 Diabetes

Background

The correlation between chronic obstructive pulmonary disease (COPD) and Type 2 diabetes mellitus (T2DM) has long been recognized, but their shared molecular underpinnings remain elusive. This study aims to uncover common genetic markers and pathways in COPD and T2DM, providing insights into their molecular crosstalk.

Methods

Using the Gene Expression Omnibus (GEO) database, we analyzed gene expression datasets from six COPD and five T2DM studies. A bioinformatics workflow combining the limma R package, unified matrix analysis, and weighted gene co-expression network analysis (WGCNA) was used to identify differentially expressed genes (DEGs) and hub genes. Functional enrichment and protein–protein interaction (PPI) analyses were conducted, followed by cross-species validation in Mus musculus models. Random forest and LASSO regression were then applied for further validation, and a prognostic model was developed using XGBoost.
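As a rough illustration of the "shared DEG" step only, the sketch below filters per-gene statistics from two diseases and intersects the results. It is not the authors' pipeline (which used limma in R): the cutoffs (|log2 fold change| ≥ 1, adjusted p ≤ 0.05), the names `DegStat`, `select_degs`, and `shared_degs`, and all gene labels and numbers are assumptions made for the example.

```python
from typing import Dict, NamedTuple, Set


class DegStat(NamedTuple):
    """Per-gene summary statistics, as a differential expression tool might report."""
    log2_fc: float  # log2 fold change, disease vs. control
    adj_p: float    # multiple-testing-adjusted p-value


def select_degs(stats: Dict[str, DegStat],
                min_abs_log2_fc: float = 1.0,   # assumed cutoff, not from the paper
                max_adj_p: float = 0.05) -> Set[str]:  # assumed cutoff
    """Return genes passing both the effect-size and significance cutoffs."""
    return {
        gene for gene, s in stats.items()
        if abs(s.log2_fc) >= min_abs_log2_fc and s.adj_p <= max_adj_p
    }


def shared_degs(stats_a: Dict[str, DegStat],
                stats_b: Dict[str, DegStat]) -> Dict[str, str]:
    """Intersect two DEG sets and label each shared gene by direction agreement."""
    common = select_degs(stats_a) & select_degs(stats_b)
    result = {}
    for gene in sorted(common):
        same_sign = (stats_a[gene].log2_fc > 0) == (stats_b[gene].log2_fc > 0)
        result[gene] = "concordant" if same_sign else "discordant"
    return result


# Made-up numbers for placeholder genes; these are NOT results from the study.
copd = {
    "GENE_A": DegStat(1.8, 0.001),
    "GENE_B": DegStat(-1.2, 0.01),
    "GENE_C": DegStat(0.3, 0.2),
    "GENE_D": DegStat(2.0, 0.04),
}
t2dm = {
    "GENE_A": DegStat(1.1, 0.02),
    "GENE_B": DegStat(1.5, 0.03),
    "GENE_C": DegStat(1.4, 0.01),
    "GENE_D": DegStat(0.2, 0.5),
}

shared = shared_degs(copd, t2dm)
print(shared)
```

Here only `GENE_A` and `GENE_B` pass the cutoffs in both toy tables; labelling the direction of change is an optional extra that makes opposite-sign genes easy to spot.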

Results

Our analysis identified DEGs shared by COPD and T2DM, including KIF1C, CSTA, GMNN, and PHGDH. Cross-species comparison identified common genes, including PON1 and CD14, with varying expression patterns. Random forest and LASSO regression identified six critical genes, and our XGBoost model achieved high predictive accuracy for COPD (AUC = 0.996).
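For readers less familiar with the reported metric, the following minimal sketch shows one standard way to compute ROC AUC: the fraction of case-control pairs in which the case receives the higher model score, with ties counted as half. The function name `roc_auc` and the toy labels and scores are assumptions for illustration; they are unrelated to the study's data or its XGBoost model.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outscores a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels (1 = case, 0 = control) and model scores; illustrative only.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)  # 3 of the 4 case-control pairs are ranked correctly -> 0.75
```

Under this reading, an AUC of 0.996 would mean that almost every case-control pair is ranked correctly on the evaluated data; the pairwise loop is quadratic and meant for clarity rather than speed.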

Conclusions

This study identifies key genetic markers shared between COPD and T2DM, providing new insights into their molecular pathways. Our XGBoost model exhibited high predictive accuracy for COPD, highlighting the potential utility of these markers. These findings offer promising biomarkers for early detection and enhance our understanding of the diseases' interplay. Further validation in larger cohorts is recommended.

Source journal: Clinical Respiratory Journal (Medicine, Respiratory System)
CiteScore: 3.70; self-citation rate: 0.00%; articles published: 104; review time: >12 weeks