Identification of Novel Biomarkers for Crohn's Disease Through the Integration of Machine Learning, Colocalization, and SMR Analysis

Impact factor 4.2; Zone 2 (Biology); JCR Q2, Biochemistry & Molecular Biology
Liang Chen, Jie Hua
The FASEB Journal, volume 40(5). Published 2026-03-03. Journal article.
DOI: 10.1096/fj.202504792R
https://faseb.onlinelibrary.wiley.com/doi/10.1096/fj.202504792R

Abstract

Crohn's disease (CD) is a chronic inflammatory bowel disease whose prevalence is rising over time, creating a need for improved diagnostic and therapeutic strategies. This study aimed to identify candidate biomarkers for CD diagnosis and treatment. CD-related gene expression datasets from the Gene Expression Omnibus (GEO) were analyzed. Differential protein–protein interaction network and weighted gene co-expression network analyses were conducted to prioritize core candidate genes, and multiple machine learning algorithms were used to refine these candidates further. The feature importance of the best-performing model was explained using SHapley Additive exPlanations (SHAP). Single-sample gene set enrichment analysis (ssGSEA) was carried out to evaluate immune cell infiltration and its associations with the diagnostic markers. Causal biomarker genes were then identified using Bayesian colocalization and summary data-based Mendelian randomization (SMR) analysis. The combination of glmBoost and random forest identified five hub genes (CXCL5, SERPINB2, SOCS3, PF4, and IL1R1), which demonstrated robust diagnostic performance for CD. These biomarkers correlated with immune cell infiltration patterns indicative of heightened inflammation and Th1/Th17 adaptive immune responses. Colocalization and SMR analyses supported a causal association of IL1R1 with CD development. This integrative multiomics approach identified key biomarkers involved in the pathogenesis of CD. The eQTL-based SMR analysis suggested a significant association of IL1R1 with CD risk, highlighting its dual role as a diagnostic biomarker and a potential therapeutic target.
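
For readers unfamiliar with SMR, the following is a minimal Python sketch of the standard single-variant SMR calculation: the effect of gene expression on the trait is estimated as the ratio of the GWAS effect to the eQTL effect of the top cis-eQTL, with a delta-method standard error and an approximate 1-df chi-square test. It is not the authors' code or pipeline. The function name `smr_test` and all input values are hypothetical and chosen only for illustration; none of the numbers come from the paper.

```python
import math


def smr_test(b_zx, se_zx, b_zy, se_zy):
    """Single-variant SMR estimate from summary statistics (illustrative).

    b_zx, se_zx: effect of the top cis-eQTL on gene expression and its SE.
    b_zy, se_zy: effect of the same SNP on the trait (GWAS) and its SE.
    Returns (b_xy, se_xy, t_smr, p_value).
    """
    if b_zx == 0:
        raise ValueError("eQTL effect must be non-zero")

    z_zx = b_zx / se_zx
    z_zy = b_zy / se_zy

    # Ratio estimate of the expression -> trait effect.
    b_xy = b_zy / b_zx

    # Delta-method standard error of the ratio.
    se_xy = math.sqrt(se_zy**2 / b_zx**2 + (b_zy**2 * se_zx**2) / b_zx**4)

    # SMR test statistic, approximately chi-square with 1 df under the null.
    t_smr = (z_zx**2 * z_zy**2) / (z_zx**2 + z_zy**2)

    # Upper-tail p-value of a 1-df chi-square: P(X > t) = erfc(sqrt(t / 2)).
    p_value = math.erfc(math.sqrt(t_smr / 2.0))
    return b_xy, se_xy, t_smr, p_value


# Hypothetical inputs: eQTL z-score of 10, GWAS z-score of 5.
b_xy, se_xy, t_smr, p_value = smr_test(0.5, 0.05, 0.1, 0.02)
print(f"b_xy={b_xy:.3f} se={se_xy:.4f} T_SMR={t_smr:.2f} p={p_value:.2e}")
```

A significant SMR result alone cannot distinguish a shared causal variant from linkage between distinct variants, which is one reason it is commonly paired with colocalization, as in the analysis described above.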


Source journal: The FASEB Journal (Biology: Biochemistry & Molecular Biology)
CiteScore: 9.20
Self-citation rate: 2.10%
Articles published: 6243
Review time: 3 months
Journal description: The FASEB Journal publishes international, transdisciplinary research covering all fields of biology at every level of organization: atomic, molecular, cell, tissue, organ, organismic and population. While the journal strives to include research that cuts across the biological sciences, it also considers submissions that lie within one field but may have implications for other fields. The journal seeks to publish basic and translational research, and also welcomes reports of pre-clinical and early clinical research. In addition to research, review, and hypothesis submissions, The FASEB Journal seeks perspectives, commentaries, book reviews, and similar content related to the life sciences in its Up Front section.