Risk Prediction in Patients With Metabolic Dysfunction–Associated Steatohepatitis Using Natural Language Processing

Jordan Guillot, Christopher Y.K. Williams, Shadera Azzam, Balu Bhasuran, Gail Fernandes, Boshu Ru, Joe Yang, Xiao Zhang, R. Ravi Shankar, Jin Ge, Vivek A. Rudrapatna
Gastro Hep Advances, volume 4, issue 9, Article 100701. Published 2025-01-01.
DOI: 10.1016/j.gastha.2025.100701
https://www.sciencedirect.com/science/article/pii/S2772572325000883

Abstract

Background and Aims

Metabolic dysfunction–associated steatohepatitis (MASH) is a highly heterogeneous condition and a leading cause of end-stage liver disease. Understanding disease progression in real-world settings remains a major unmet need. We sought to define a real-world MASH cohort using natural language processing (NLP) and to identify significant associations with all-cause mortality and with progression to cirrhosis and liver transplantation.

Methods

We developed, validated, and applied a novel NLP algorithm, “NASHDetection,” to identify patients at the University of California San Francisco who were diagnosed with MASH between 2012 and 2022. We used Cox regression with bidirectional stepwise variable selection to identify significant associations with outcomes.
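The selection loop described above can be sketched as follows. This is only a minimal illustration of bidirectional stepwise selection in general, not the authors' implementation: the abstract does not state which criterion was used, so the `score` callable (lower is better, for example the AIC of a fitted Cox model) is an assumption, and `toy_score`, `TOY_GAIN`, and the variable names `x1` to `x3` are hypothetical stand-ins that carry no clinical meaning.

```python
from typing import Callable, FrozenSet, Iterable

def stepwise_select(candidates: Iterable[str],
                    score: Callable[[FrozenSet[str]], float]) -> FrozenSet[str]:
    """Bidirectional stepwise selection that minimizes `score`.

    `score` is assumed to return a lower-is-better criterion for a model
    containing the given variables (e.g. AIC of a Cox fit; assumption).
    """
    pool = frozenset(candidates)
    current: FrozenSet[str] = frozenset()
    best = score(current)
    while True:
        # Forward moves: add one variable that is not yet in the model.
        moves = [current | {v} for v in pool - current]
        # Backward moves: drop one variable that is currently in the model.
        moves += [current - {v} for v in current]
        if not moves:
            return current
        trial = min(moves, key=score)
        trial_score = score(trial)
        # Stop as soon as no single add or drop improves the criterion.
        if trial_score >= best:
            return current
        current, best = trial, trial_score

# Hypothetical toy criterion: each variable lowers the score by its "gain"
# and costs a penalty of 2, mimicking an AIC-style trade-off.
TOY_GAIN = {"x1": 10.0, "x2": 6.0, "x3": 0.5}

def toy_score(model: FrozenSet[str]) -> float:
    return 100.0 - sum(TOY_GAIN[v] for v in model) + 2.0 * len(model)

selected = stepwise_select(TOY_GAIN, toy_score)
print(sorted(selected))
```

In a real analysis, `score` would refit the survival model for each candidate set; the loop itself is unchanged. Because the criterion must strictly decrease at every step and the number of variable subsets is finite, the loop always terminates.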

Results

NASHDetection identified 2695 MASH patients with 86% accuracy. At the time of diagnosis, the median age was 57 years; 55.4% had cirrhosis at baseline, with 34.0% having evidence of decompensation and 10.8% with hepatocellular carcinoma. The most common comorbidities were hypertension (61.9%), hyperlipidemia (47.4%), and type 2 diabetes mellitus (41.5%). Multiple comorbidities were associated with all-cause mortality, including type 2 diabetes mellitus (hazard ratio [HR]: 1.36; confidence interval [CI]: 1.07–1.73), heart failure (HR: 1.45; CI: 1.01–2.08), and peripheral artery disease (HR: 1.72; CI: 1.04–2.85). Significant laboratory-based predictors of mortality included high low-density lipoprotein cholesterol (HR: 1.49; CI: 1.20–1.84) and high alkaline phosphatase (HR: 1.94; CI: 1.58–2.38).
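As a worked reading of the reported intervals, the sketch below converts a hazard ratio and its confidence interval to the log scale, where Cox model estimates are approximately normal. It assumes the intervals are 95% Wald intervals, which the abstract does not state; the function name `log_hr_summary` and the `z_crit` parameter are illustrative choices, not part of the paper.

```python
import math

def log_hr_summary(hr: float, lower: float, upper: float,
                   z_crit: float = 1.959964):
    """Return (log HR, standard error, z statistic) from an HR and its CI.

    Assumes a symmetric Wald interval on the log scale; z_crit = 1.959964
    corresponds to a 95% interval (assumption, not stated in the abstract).
    """
    log_hr = math.log(hr)
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    return log_hr, se, log_hr / se

# Values for type 2 diabetes mellitus taken from the Results paragraph.
log_hr, se, z = log_hr_summary(1.36, 1.07, 1.73)

# Consistency check: for a log-symmetric interval the geometric mean of the
# bounds should reproduce the point estimate up to rounding.
geo_mean = math.sqrt(1.07 * 1.73)
print(round(log_hr, 3), round(se, 3), round(z, 2), round(geo_mean, 2))
```

Under the 95% assumption, a z statistic above roughly 1.96 corresponds to an interval that excludes 1, which matches how the abstract labels these associations as significant.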

Conclusion

We described a cohort of real-world MASH patients using a new NLP algorithm and found several potential predictors of all-cause mortality and of progression to cirrhosis and liver transplantation. The use of NLP to characterize these patients can help support the development of future interventional trials in MASH.