Risk adjustment for regional healthcare funding allocations with ensemble methods: an empirical study and interpretation.

IF 3.1; Zone 3 (Medicine); JCR Q1 (Economics)

European Journal of Health Economics Pub Date : 2024-09-01 Epub Date: 2024-01-03 DOI:10.1007/s10198-023-01656-w

Tuukka Holster, Shaoxiong Ji, Pekka Marttinen

Pages: 1117-1131. Publication type: Journal Article. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11377675/pdf/


Abstract

We experiment with recent ensemble machine learning methods for estimating healthcare costs, using Finnish data that combine rich individual-level information on healthcare costs, socioeconomic status and diagnoses from multiple registries. Our data are a random 10% sample (553,675 observations) of the Finnish population in 2017. Using annual healthcare cost in 2017 as the response variable, we compare the performance of random forest, Gradient Boosting Machine (GBM) and eXtreme Gradient Boosting (XGBoost) with that of linear regression. As machine learning methods are often seen as unsuitable for risk adjustment applications because of their relative opaqueness, we also introduce visualizations from the machine learning literature to help interpret the contribution of individual variables to the prediction. Our results show that ensemble machine learning methods can improve predictive performance, with all of them significantly outperforming linear regression, and that a certain level of interpretation can be provided for them. We also find that individual-level socioeconomic variables improve prediction accuracy and that their effect is larger for machine learning methods. However, we find that the predictions used for funding allocations are sensitive to model selection, highlighting the need for comprehensive robustness testing when estimating risk adjustment models used in applications.
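The sensitivity of funding allocations to model selection can be illustrated with a minimal sketch. The abstract does not state the allocation rule, so the code below assumes a simple one: a fixed budget is split across regions in proportion to the sum of individual predicted costs. The region names, the two sets of predictions (`pred_linear`, `pred_ensemble`), the budget and both function names are hypothetical toy values chosen for illustration, not figures or methods from the study.

```python
from collections import defaultdict


def regional_allocations(regions, predicted_costs, budget):
    """Split a fixed budget across regions in proportion to summed predicted costs.

    Assumption: proportional allocation; the study's actual rule may differ.
    """
    totals = defaultdict(float)
    for region, cost in zip(regions, predicted_costs):
        totals[region] += cost
    grand_total = sum(totals.values())
    return {region: budget * total / grand_total for region, total in totals.items()}


def allocation_shift(alloc_a, alloc_b):
    """Relative change in each region's allocation when moving from model A to model B."""
    return {region: (alloc_b[region] - alloc_a[region]) / alloc_a[region] for region in alloc_a}


# Hypothetical toy data: six individuals in three regions.
regions = ["North", "North", "South", "South", "East", "East"]
# Hypothetical predicted annual costs from two different models.
pred_linear = [1000.0, 3000.0, 2000.0, 2000.0, 500.0, 1500.0]
pred_ensemble = [800.0, 4200.0, 1500.0, 1500.0, 600.0, 1400.0]
budget = 1_000_000.0

alloc_linear = regional_allocations(regions, pred_linear, budget)
alloc_ensemble = regional_allocations(regions, pred_ensemble, budget)
shift = allocation_shift(alloc_linear, alloc_ensemble)

for region in alloc_linear:
    print(f"{region}: {alloc_linear[region]:.0f} -> {alloc_ensemble[region]:.0f} "
          f"({shift[region]:+.1%})")
```

In this toy example both models predict the same total cost, yet North's allocation rises by 25% and South's falls by 25% when the second model replaces the first, while East is unchanged. A robustness check of this kind, repeated across candidate models, is one way to see how much an allocation depends on the model behind it.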




Source journal

European Journal of Health Economics Multiple-

CiteScore

6.10

Self-citation rate

2.30%

Articles published

131

Journal description: The European Journal of Health Economics is a journal of health economics and associated disciplines. The growing demand for health economics and the introduction of new guidelines in various European countries motivated the creation of a highly scientific yet practice-oriented journal that considers the requirements of various health care systems in Europe. The international scientific board of opinion leaders guarantees high-quality, peer-reviewed publications as well as articles on pragmatic approaches in the field of health economics. We intend to cover all aspects of health economics:
• Basics of health economic approaches and methods
• Pharmacoeconomics
• Health Care Systems
• Pricing and Reimbursement Systems
• Quality-of-Life Studies
The editors reserve the right to reject manuscripts that do not comply with the above-mentioned requirements. The author will be held responsible for false statements or for failure to fulfill the above-mentioned requirements. Officially cited as: Eur J Health Econ