Weak-Supervision for Prolonged Hospital Length of Stay Prediction

2022 IEEE International Conference on E-health Networking, Application & Services (HealthCom). Pub Date: 2022-10-17. DOI: 10.1109/HealthCom54947.2022.9982748

Ariana J. Mann, N. Bambos

{"title":"Weak-Supervision for Prolonged Hospital Length of Stay Prediction","authors":"Ariana J. Mann, N. Bambos","doi":"10.1109/HealthCom54947.2022.9982748","DOIUrl":null,"url":null,"abstract":"Predicting whether a patient will have a prolonged length of stay (LoS) once admitted to a hospital can help ensure medical resources are allocated to where they are needed most. However, prior works on classifying prolonged-LoS patients define a prolonged-LoS as being greater than a single, flat number-of-days cutoff. Using a flat cutoff, means that the classification occurs without reference to a baseline LoS, fails to control for any covariates, and is generally only effective for a specific medical subgroup. Instead, in this work, we introduce an approach where the algorithm designer specifies a LoS percentile that should be used as the cutoff for prolonged-LoS. In a method known as weak-supervision, we use the LoS percentile cutoff to train a model to produce the actual labels for classification machine learning training. Contrary to a number-of-days cutoff, the LoS percentile cutoff coupled with weak-supervision, provides what we claim is a more principled and flexible approach to defining what constitutes a prolonged-LoS.Specifically, we train a quantile regression model to predict the designated LoS percentile value for each patient, which importantly allows us to control for covariates that access to medical care should be equalized across (such as primary medical condition, hospital facility, and admission time of day). The regression output is cast as a noisy binary label for prolonged-LoS, which is then used to train a machine learning model for prolonged-LoS classification. 
We empirically demonstrate that this weak-supervision based approach provides usable classification performance despite using noisy labels.","PeriodicalId":202664,"journal":{"name":"2022 IEEE International Conference on E-health Networking, Application & Services (HealthCom)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on E-health Networking, Application & Services (HealthCom)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HealthCom54947.2022.9982748","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Citations: 1

Abstract

Predicting whether a patient will have a prolonged length of stay (LoS) once admitted to a hospital can help ensure that medical resources are allocated where they are needed most. However, prior work on classifying prolonged-LoS patients defines a prolonged LoS as one exceeding a single, flat number-of-days cutoff. Using a flat cutoff means that the classification occurs without reference to a baseline LoS, fails to control for any covariates, and is generally effective only for a specific medical subgroup. Instead, in this work, we introduce an approach in which the algorithm designer specifies a LoS percentile to be used as the cutoff for prolonged LoS. In a method known as weak-supervision, we use the LoS percentile cutoff to train a model that produces the actual labels for training a machine learning classifier. In contrast to a number-of-days cutoff, the LoS percentile cutoff coupled with weak-supervision provides what we claim is a more principled and flexible approach to defining what constitutes a prolonged LoS. Specifically, we train a quantile regression model to predict the designated LoS percentile value for each patient, which importantly allows us to control for covariates across which access to medical care should be equalized (such as primary medical condition, hospital facility, and admission time of day). The regression output is cast as a noisy binary label for prolonged LoS, which is then used to train a machine learning model for prolonged-LoS classification. We empirically demonstrate that this weak-supervision-based approach provides usable classification performance despite using noisy labels.
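The labeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: instead of a fitted quantile regression model, it uses a per-stratum empirical quantile over a single categorical covariate, which is a deliberately simplified stand-in. The function names, the covariate values, and the toy LoS numbers are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def empirical_percentile(values, pct):
    """Nearest-rank percentile; pct is an integer in 1..100."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def fit_group_cutoffs(records, pct):
    """Estimate the pct-th LoS percentile separately for each covariate stratum.

    Simplified stand-in for quantile regression: one categorical covariate,
    one empirical percentile per category.
    """
    by_group = defaultdict(list)
    for group, los_days in records:
        by_group[group].append(los_days)
    return {group: empirical_percentile(v, pct) for group, v in by_group.items()}


def weak_labels(records, cutoffs):
    """Label a stay 1 (prolonged) if its LoS exceeds its own stratum's cutoff."""
    return [int(los_days > cutoffs[group]) for group, los_days in records]


def flat_labels(records, cutoff_days):
    """Baseline: one number-of-days cutoff applied to every patient."""
    return [int(los_days > cutoff_days) for _, los_days in records]


# Hypothetical toy data: (primary condition, length of stay in days).
records = [
    ("cardiac", 2), ("cardiac", 3), ("cardiac", 4), ("cardiac", 5), ("cardiac", 12),
    ("orthopedic", 1), ("orthopedic", 1), ("orthopedic", 2), ("orthopedic", 2),
    ("orthopedic", 6),
]

cutoffs = fit_group_cutoffs(records, pct=80)   # designer-chosen percentile
labels = weak_labels(records, cutoffs)         # noisy labels for a classifier
baseline = flat_labels(records, cutoff_days=7)

print(cutoffs)
print(labels)
print(baseline)
```

In this toy data the 6-day orthopedic stay is flagged as prolonged relative to its own stratum, while a flat 7-day cutoff misses it. In the approach the abstract describes, labels of this kind would then serve as training targets for a separate classifier, a step this sketch leaves out.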

