Bioinformatic Insights and XGBoost Identify Shared Genetics in Chronic Obstructive Pulmonary Disease and Type 2 Diabetes
Authors: Qianqian Ji, Yaxian Meng, Xiaojie Han, Chao Yi, Xiaoliang Chen, Yiqiang Zhan
Journal: Clinical Respiratory Journal, volume 19, issue 3 (published 2025-03-05)
DOI: 10.1111/crj.70057
URL: https://onlinelibrary.wiley.com/doi/10.1111/crj.70057
Background
The association between chronic obstructive pulmonary disease (COPD) and type 2 diabetes mellitus (T2DM) has long been recognized, but the molecular mechanisms they share remain poorly understood. This study aims to identify genetic markers and pathways common to COPD and T2DM and to clarify their molecular crosstalk.
Methods
Using the Gene Expression Omnibus (GEO) database, we analyzed gene expression datasets from six COPD and five T2DM studies. We combined the limma R package, unified matrix analysis, and weighted gene co-expression network analysis (WGCNA) to identify differentially expressed genes (DEGs) and hub genes. Functional enrichment and protein–protein interaction (PPI) analyses followed, with cross-species validation in Mus musculus models. Random forest and LASSO regression were applied for further validation, and a prognostic model was then developed using XGBoost.
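The step of finding DEGs shared by two conditions can be sketched in a few lines. This is a minimal illustration only, not the authors' pipeline: the gene names, fold changes, and adjusted p-values are made up, and the cutoffs (absolute log2 fold change of at least 1, adjusted p-value below 0.05) are common conventions assumed here, since the abstract does not state the thresholds used.

```python
# Toy differential-expression results per condition:
# gene -> (log2 fold change, adjusted p-value).
# All gene names and numbers are hypothetical, for illustration only.
copd_results = {
    "GENE_A": (1.8, 0.001),
    "GENE_B": (-1.4, 0.020),
    "GENE_C": (0.3, 0.400),
    "GENE_D": (2.1, 0.030),
}
t2dm_results = {
    "GENE_A": (1.2, 0.010),
    "GENE_B": (-0.2, 0.600),
    "GENE_D": (-1.6, 0.004),
    "GENE_E": (1.9, 0.002),
}


def select_degs(results, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Return genes passing both cutoffs, mapped to their direction of change.

    The default cutoffs are assumptions, not values taken from the study.
    """
    return {
        gene: ("up" if lfc > 0 else "down")
        for gene, (lfc, padj) in results.items()
        if abs(lfc) >= lfc_cutoff and padj < padj_cutoff
    }


def shared_degs(degs_a, degs_b):
    """Return genes called as DEGs in both conditions, with both directions."""
    common = sorted(degs_a.keys() & degs_b.keys())
    return {gene: (degs_a[gene], degs_b[gene]) for gene in common}


copd_degs = select_degs(copd_results)
t2dm_degs = select_degs(t2dm_results)
shared = shared_degs(copd_degs, t2dm_degs)
print(shared)  # {'GENE_A': ('up', 'up'), 'GENE_D': ('up', 'down')}
```

Keeping the direction of change alongside each shared gene makes it easy to see when a gene is differentially expressed in both conditions but moves in opposite directions, as the hypothetical GENE_D does here.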
Results
Our analysis revealed shared DEGs such as KIF1C, CSTA, GMNN, and PHGDH in both COPD and T2DM. Cross-species comparison identified common genes, including PON1 and CD14, whose expression patterns varied. Random forest and LASSO regression identified six critical genes, and our XGBoost model showed high predictive accuracy (AUC = 0.996 for COPD).
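For readers unfamiliar with the AUC figure reported above, the metric can be read as the probability that a randomly chosen case receives a higher model score than a randomly chosen control. The sketch below computes it directly from that definition. The labels and scores are made up and have no connection to the study's data or model; `roc_auc` is a hypothetical helper written for this illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative counts as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = case, 0 = control) and model scores.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.92, 0.81, 0.40, 0.55, 0.30, 0.22, 0.10]

auc = roc_auc(labels, scores)
print(round(auc, 3))  # 0.917 (11 of 12 case-control pairs ranked correctly)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in sample count, which is fine for a small illustration; rank-based formulas are the usual choice for larger datasets.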
Conclusions
This study identifies key genetic markers shared between COPD and T2DM, providing new insights into their molecular pathways. Our XGBoost model exhibited high predictive accuracy for COPD, highlighting the potential utility of these markers. These findings offer promising biomarkers for early detection and enhance our understanding of the diseases' interplay. Further validation in larger cohorts is recommended.
About the journal:
Overview
Effective with the 2016 volume, this journal will be published in an online-only format.
Aims and Scope
The Clinical Respiratory Journal (CRJ) provides a forum for clinical research in all areas of respiratory medicine from clinical lung disease to basic research relevant to the clinic.
We publish original research, review articles, case studies, editorials and book reviews in all areas of clinical lung disease including:
Asthma
Allergy
COPD
Non-invasive ventilation
Sleep related breathing disorders
Interstitial lung diseases
Lung cancer
Clinical genetics
Rhinitis
Airway and lung infection
Epidemiology
Pediatrics
CRJ provides a fast-track service for selected Phase II and Phase III trial studies.
Keywords
Clinical Respiratory Journal, respiratory, pulmonary, medicine, clinical, lung disease
Abstracting and Indexing Information
Academic Search (EBSCO Publishing)
Academic Search Alumni Edition (EBSCO Publishing)
Embase (Elsevier)
Health & Medical Collection (ProQuest)
Health Research Premium Collection (ProQuest)
HEED: Health Economic Evaluations Database (Wiley-Blackwell)
Hospital Premium Collection (ProQuest)
Journal Citation Reports/Science Edition (Clarivate Analytics)
MEDLINE/PubMed (NLM)
ProQuest Central (ProQuest)
Science Citation Index Expanded (Clarivate Analytics)
SCOPUS (Elsevier)