Prognostic Modeling for Liver Cirrhosis Mortality Prediction and Real-Time Health Monitoring from Electronic Health Data.

IF 2.6 | Zone 4, Computer Science | JCR Q2, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Big Data Pub Date : 2024-12-09 DOI:10.1089/big.2024.0071

Chengping Zhang, Muhammad Faisal Buland Iqbal, Imran Iqbal, Minghao Cheng, Nadia Sarhan, Emad Mahrous Awwad, Yazeed Yasin Ghadi


Citations: 0

Abstract

Liver cirrhosis stands as a prominent contributor to mortality, impacting millions across the United States. Enabling health care providers to predict early mortality among patients with cirrhosis holds the potential to enhance treatment efficacy significantly. Our hypothesis centers on the correlation between mortality and laboratory test results along with relevant diagnoses in this patient cohort. Additionally, we posit that a deep learning model could surpass the predictive capabilities of the existing Model for End-Stage Liver Disease score. This research seeks to advance prognostic accuracy and refine approaches to address the critical challenges posed by cirrhosis-related mortality. This study evaluates the performance of an artificial neural network model for liver disease classification using various training dataset sizes. Through meticulous experimentation, three distinct training proportions were analyzed: 70%, 80%, and 90%. The model's efficacy was assessed using precision, recall, F1-score, accuracy, and support metrics, alongside receiver operating characteristic (ROC) and precision-recall (PR) curves. The ROC curves were quantified using the area under the curve (AUC) metric. Results indicated that the model's performance improved with an increased size of the training dataset. Specifically, the 80% training data model achieved the highest AUC, suggesting superior classification ability over the models trained with 70% and 90% data. PR analysis revealed a steep trade-off between precision and recall across all datasets, with 80% training data again demonstrating a slightly better balance. This is indicative of the challenges faced in achieving high precision with a concurrently high recall, a common issue in imbalanced datasets such as those found in medical diagnostics.
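The evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the labels, scores, and the 0.5 decision threshold below are hypothetical toy values, and the function names `classification_metrics` and `roc_auc` are introduced here only for illustration. The sketch computes precision, recall, F1-score, accuracy, and support for the positive class, plus ROC AUC using its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case).

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Precision, recall, F1, accuracy, and support for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": (tp + tn) / len(pairs),
        "support": tp + fn,  # number of true positive-class samples
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = died, 0 = survived; scores are predicted risks.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7, 0.6, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Precision, recall, and F1 depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds. This is one reason a model can show a reasonable AUC yet a steep precision-recall trade-off on imbalanced data.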


Source journal

Big Data (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; COMPUTER SCIENCE, THEORY & METHODS)

CiteScore

9.10

Self-citation rate

2.20%

Journal description: Big Data is the leading peer-reviewed journal covering the challenges and opportunities in collecting, analyzing, and disseminating vast amounts of data. The Journal addresses questions surrounding this powerful and growing field of data science and facilitates the efforts of researchers, business managers, analysts, developers, data scientists, physicists, statisticians, infrastructure developers, academics, and policymakers to improve operations, profitability, and communications within their businesses and institutions. Spanning a broad array of disciplines focusing on novel big data technologies, policies, and innovations, the Journal brings together the community to address current challenges and enforce effective efforts to organize, store, disseminate, protect, manipulate, and, most importantly, find the most effective strategies to make this incredible amount of information work to benefit society, industry, academia, and government. Big Data coverage includes: big data industry standards; new technologies being developed specifically for big data; data acquisition, cleaning, distribution, and best practices; data protection, privacy, and policy; business interests from research to product; the changing role of business intelligence; visualization and design principles of big data infrastructures; physical interfaces and robotics; social networking advantages for Facebook, Twitter, Amazon, Google, etc.; and opportunities around big data and how companies can harness it to their advantage.