Machine learning based prediction tool of hospitalization cost

B. Abdelmoula, M. Khemakhem, N. Abdelmoula
2021 22nd International Arab Conference on Information Technology (ACIT), published 2021-12-21
DOI: 10.1109/acit53391.2021.9677110 (https://doi.org/10.1109/acit53391.2021.9677110)

Abstract

The rising cost of healthcare is a worldwide challenge. It has therefore become essential to understand the nature and weight of the factors that influence this cost and to anticipate its future changes, in order to ensure good governance, improve hospital management of material and financial resources, and be ready to face emergency situations such as the ongoing global pandemic. Using the Python programming language, different supervised machine learning algorithms were tested on a dataset extracted from the digital medical records of patients hospitalized in the infectious diseases department at Sfax university hospital (Tunisia). After the collected data had been processed and analyzed, different models for predicting a patient's hospitalization cost upon admission were created and evaluated. The dataset initially comprised 542 observations and 136 variables: 36 quantitative variables and 100 dummy variables. Two variable selection methods were applied, and subgroups of independent variables with different semantic meanings were also used. Despite a few shortcomings such as missing data, the most precise of the tested prediction models was a 15th-degree multiple linear regression. Its regressors were the season of hospitalization, the suspected diagnosis, and patient characteristics such as gender. Applied in practice, this tool would make it possible to predict the hospitalization cost and therefore to forecast precise budgets. However, technical improvements are still needed to optimize the quality of the tool, and other algorithms could be tested to broaden the study. Generalizing the implementation and use of well-developed digital medical records would allow more complete databases to be produced, from which better prediction models could be generated.
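The modelling steps named in the abstract (dummy-encoding categorical regressors, adding polynomial terms, fitting a linear regression, and comparing models on held-out data) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the data are synthetic, the `age` variable, the cost formula, the 80/20 split and all function names are hypothetical, and the degrees are kept low (1 to 3) rather than reproducing the 15th-degree model reported above.

```python
import numpy as np

# Hypothetical category levels; the paper does not list its encodings.
SEASONS = ["winter", "spring", "summer", "autumn"]
GENDERS = ["F", "M"]


def one_hot(values, categories):
    """Dummy-encode a categorical column, dropping the first category as baseline."""
    return np.array([[1.0 if v == c else 0.0 for c in categories[1:]] for v in values])


def polynomial_terms(x, degree):
    """Columns x, x**2, ..., x**degree for one quantitative variable."""
    return np.column_stack([x ** d for d in range(1, degree + 1)])


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients (intercept first) to a feature matrix."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def make_synthetic(n, seed=0):
    """Synthetic stand-in data; the real hospital dataset is not public here."""
    rng = np.random.default_rng(seed)
    season = rng.choice(SEASONS, size=n)
    gender = rng.choice(GENDERS, size=n)
    age = rng.uniform(0.0, 1.0, size=n)  # assumed quantitative variable, rescaled to [0, 1]
    # Invented cost formula: quadratic in age, a winter surcharge, plus noise.
    cost = 500 + 300 * age**2 + 80 * (season == "winter") + rng.normal(0, 20, size=n)
    return season, gender, age, cost


def build_features(season, gender, age, degree):
    """Concatenate dummy columns with polynomial terms of the quantitative variable."""
    return np.column_stack(
        [one_hot(season, SEASONS), one_hot(gender, GENDERS), polynomial_terms(age, degree)]
    )


def evaluate_degrees(degrees=(1, 2, 3), n=542, seed=0):
    """Fit one model per polynomial degree and report held-out RMSE for each."""
    season, gender, age, cost = make_synthetic(n, seed)
    split = int(0.8 * n)  # first 80% for fitting, remainder for evaluation
    results = {}
    for d in degrees:
        X = build_features(season, gender, age, d)
        coef = fit_ols(X[:split], cost[:split])
        results[d] = rmse(cost[split:], predict(coef, X[split:]))
    return results


if __name__ == "__main__":
    for d, err in evaluate_degrees().items():
        print(f"degree {d}: held-out RMSE = {err:.2f}")
```

Comparing models on held-out observations rather than on the fitting data matters most for high polynomial degrees, where a model can fit the training set closely while generalizing poorly.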