Integration of transcriptomic analysis and multiple machine learning approaches identifies NAFLD progression-specific hub genes to reveal distinct genomic patterns and actionable targets

IF 8.6 · Zone 2, Computer Science · JCR Q1, Computer Science, Theory & Methods
Jing Sun, Run Shi, Yang Wu, Yan Lou, Lijuan Nie, Chun Zhang, Yutian Cao, Qianhua Yan, Lifang Ye, Shu Zhang, Xuanbin Wang, Qibiao Wu, Xuehua Jiao, Jiangyi Yu, Zhuyuan Fang, Xiqiao Zhou
Journal of Big Data. Published 2024-03-15. DOI: 10.1186/s40537-024-00899-5 (https://doi.org/10.1186/s40537-024-00899-5)
Citations: 0

Abstract

Background

Nonalcoholic fatty liver disease (NAFLD) is a leading public health problem worldwide. Approximately one fourth of patients with nonalcoholic fatty liver (NAFL) progress to nonalcoholic steatohepatitis (NASH), an advanced stage of NAFLD. Hence, there is an urgent need to better understand NAFLD heterogeneity and to facilitate personalized management of high-risk NAFLD patients, who may benefit from more intensive surveillance and preventive intervention.

Methods

In this study, a series of bioinformatic methods was applied to identify NAFLD progression-specific pathways and genes, and three machine learning approaches were combined to construct a risk-stratification gene signature that quantifies risk. In addition, bulk RNA-seq and single-cell RNA-seq (scRNA-seq) transcriptome profiling data, together with whole-exome sequencing (WES) data, were comprehensively analyzed to reveal the genomic alterations and altered pathways that distinguish the molecular subtypes.
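The abstract does not name the three machine learning approaches or say how their outputs were combined. As a rough illustration only, the sketch below assumes that each approach returns a list of selected genes, that the signature is the intersection of those lists, and that a sample's risk is a weighted sum of signature-gene expression split at the cohort median. The selector outputs, the gene names other than COL1A2, the weights, and the expression values are all hypothetical placeholders, not results from the study.

```python
from statistics import median


def consensus_genes(*selections):
    """Return the genes picked by every selector, sorted for reproducibility."""
    common = set(selections[0])
    for picked in selections[1:]:
        common &= set(picked)
    return sorted(common)


def risk_score(expression, weights):
    """Weighted sum of expression values over the signature genes."""
    return sum(weights[gene] * expression[gene] for gene in weights)


def stratify(scores):
    """Label a sample 'high' if its score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {s: ("high" if v > cutoff else "low") for s, v in scores.items()}


# Hypothetical outputs of three unnamed feature selectors (assumption).
selector_a = ["COL1A2", "GENE_A", "GENE_B"]
selector_b = ["COL1A2", "GENE_A", "GENE_C"]
selector_c = ["GENE_A", "COL1A2", "GENE_D"]

signature = consensus_genes(selector_a, selector_b, selector_c)

# Hypothetical per-gene weights and expression values (assumption).
weights = {"COL1A2": 0.8, "GENE_A": 0.5}
samples = {
    "S1": {"COL1A2": 2.0, "GENE_A": 1.0},
    "S2": {"COL1A2": 0.5, "GENE_A": 0.4},
    "S3": {"COL1A2": 1.0, "GENE_A": 2.0},
}

scores = {name: risk_score(expr, weights) for name, expr in samples.items()}
labels = stratify(scores)
print(signature, labels)
```

Intersection is the strictest way to combine selectors; a majority vote or a ranked aggregate would be equally plausible readings of "combined", and a real signature would take its weights from a fitted model rather than from hand-set values.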

Results

Two distinct subtypes of NAFL were identified using the NAFLD progression-specific genes, and one subtype closely resembles NASH in its inflammatory pattern and fibrotic potential. The established risk-stratification gene signature could discriminate advanced samples from the overall NAFLD cohort. COL1A2, a key gene closely related to NAFLD progression, is specifically expressed in fibroblasts involved in hepatocellular carcinoma (HCC) and is significantly correlated with epithelial-mesenchymal transition (EMT) and angiogenesis across cancer types. Moreover, the β-catenin/COL1A2 axis might play a critical role in fibrosis severity and inflammatory response during NAFLD-HCC progression.

Conclusion

In summary, our study provides evidence for the necessity of molecular classification and establishes a risk-stratification gene signature to quantify NAFLD risk, with the aim of identifying distinct risk subsets and guiding personalized treatment.


Source journal: Journal of Big Data (Computer Science, Information Systems)
CiteScore: 17.80
Self-citation rate: 3.70%
Articles published: 105
Review time: 13 weeks
About the journal: The Journal of Big Data publishes high-quality, scholarly research papers, methodologies, and case studies covering a broad spectrum of topics, from big data analytics to data-intensive computing and all applications of big data research. It addresses challenges facing big data today and in the future, including data capture and storage, search, sharing, analytics, technologies, visualization, architectures, data mining, machine learning, cloud computing, distributed systems, and scalable storage. The journal serves as a seminal source of innovative material for academic researchers and practitioners alike.