Health State Risk Categorization: A Machine Learning Clustering Approach Using Health and Retirement Study Data

F. Tan, D. Mehta
Journal: The Journal of Financial Data Science
Publication type: Journal Article
Published: 2022-04-30
DOI: 10.3905/jfds.2022.4.2.139 (https://doi.org/10.3905/jfds.2022.4.2.139)
Citations: 2

Abstract

For countries such as the United States, which lacks a universal health care system, future health care costs can create significant uncertainty that a retirement investment strategy must be built to manage. One of the most important factors determining health care costs is the individual’s health status. Hence, categorizing individuals into meaningful health risk types is an essential task. The conventional approach is to rely on individuals’ self-rated health state. In this work, the authors provide an objective, data-driven, machine learning (ML)–based approach to categorizing health state risk, using the most widely used US household survey of older Americans, the Health and Retirement Study (HRS). The authors propose using the K-modes clustering method to cluster individuals on an exhaustive list of categorical health-related variables in the HRS. The resulting clusters are shown to provide an objective, interpretable, and practical health state risk categorization. The authors then compare the ML-based and self-rated health state categorizations and discuss the implications of the differences. They also illustrate the difficulty of predicting out-of-pocket costs from self-rated health status and show how ML-based categorizations can generate more accurate health care cost estimates for personalized retirement planning. The results in this article open several avenues of research, including behavioral science analysis of health and retirement data.
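To make the clustering step concrete, the following is a minimal sketch of K-modes on integer-coded categorical data, written with NumPy only. It is not the authors' implementation: the abstract does not state the variable list, the number of clusters, or the initialization scheme, so the function names (`k_modes`, `matching_dissimilarity`, `column_modes`), the choice of `k=2`, and all data below are assumptions made for illustration. The data are synthetic and are not drawn from the HRS. The final cross-tabulation shows one simple way a cluster assignment could be set against a self-rated 1-5 health score.

```python
import numpy as np


def matching_dissimilarity(X, modes):
    """Number of mismatched attributes between each row of X and each mode.

    X has shape (n, d), modes has shape (k, d); the result has shape (n, k).
    """
    return (X[:, None, :] != modes[None, :, :]).sum(axis=2)


def column_modes(rows):
    """Most frequent category per column of a non-negative integer array."""
    return np.array([np.bincount(col).argmax() for col in rows.T])


def k_modes(X, k, n_iter=20, seed=0, init=None):
    """Illustrative K-modes: assign rows to the nearest mode, then update modes."""
    rng = np.random.default_rng(seed)
    if init is None:
        # Assumption: initialise from k randomly chosen rows (distinct indices).
        modes = X[rng.choice(len(X), size=k, replace=False)].copy()
    else:
        modes = np.array(init).copy()
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        new_labels = matching_dissimilarity(X, modes).argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable, so the modes will not change
        labels = new_labels
        for j in range(k):
            members = X[labels == j]
            if len(members):  # leave the mode unchanged if a cluster is empty
                modes[j] = column_modes(members)
    return labels, modes


# Synthetic stand-in for categorical survey answers (NOT HRS data):
# six hypothetical yes/no health indicators for 200 respondents.
rng = np.random.default_rng(42)
n, d = 200, 6
mostly_no = (rng.random((n // 2, d)) < 0.15).astype(int)
mostly_yes = (rng.random((n // 2, d)) < 0.85).astype(int)
X = np.vstack([mostly_no, mostly_yes])

labels, modes = k_modes(X, k=2)

# Hypothetical self-rated health on a 1-5 scale, also synthetic.
self_rated = rng.integers(1, 6, size=n)

# Cross-tabulate cluster membership (rows) against self-rated health (columns).
table = np.zeros((2, 5), dtype=int)
np.add.at(table, (labels, self_rated - 1), 1)

print("cluster modes:\n", modes)
print("cluster x self-rated counts:\n", table)
```

K-modes replaces the Euclidean distance and cluster means of K-means with a mismatch count and per-attribute modes, which is why it suits categorical survey variables. Each cluster's mode can be read directly as a typical answer profile, which is one way to understand the interpretability the abstract refers to.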