The Effects of Sampling Methods on Machine Learning Models for Predicting Long-term Length of Stay: A Case Study of Rhode Island Hospitals
Son Nguyen, Alicia T. Lamere, A. Olinsky, John T. Quinn
Int. J. Rough Sets Data Anal., published 2019-07-01
DOI: https://doi.org/10.4018/ijrsda.2019070103
Citation count: 22
Abstract
The ability to predict which patients will have a long-term length of stay (LOS) can aid a hospital's admission management, help maintain effective resource utilization, and support a high quality of inpatient care. Hospital discharge data from the Rhode Island Department of Health for 2010 to 2013 reveal that inpatients with long-term stays, i.e. two weeks or more, cost about six times more than those with short stays while accounting for only 4.7% of inpatients. Because long-stay patients are so heavily outnumbered by short-stay patients, predicting long-term LOS becomes an imbalanced classification problem. Sampling methods, which balance the data before a traditional classification model is fitted to it, offer a simple approach to the problem. In this work, the authors propose a new resampling method called RUBIES, which provides superior predictive ability when compared to other commonly used sampling techniques.
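To make the idea of balancing the data before fitting a classifier concrete, here is a minimal sketch of plain random undersampling, one of the commonly used sampling techniques. It is not RUBIES, whose procedure the abstract does not describe. The function name `random_undersample`, the synthetic rows, and the 1,000-patient sample size are assumptions made for illustration; only the 4.7% long-stay share comes from the abstract.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until every class
    has as many rows as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is cut down to the size of the smallest class.
    target = min(len(members) for members in by_class.values())

    balanced = []
    for label, members in by_class.items():
        for row in rng.sample(members, target):
            balanced.append((row, label))

    # Shuffle so the classes are not stored in contiguous blocks.
    rng.shuffle(balanced)
    new_rows = [row for row, _ in balanced]
    new_labels = [label for _, label in balanced]
    return new_rows, new_labels


# Synthetic stand-in data (an assumption): 1,000 stays, 4.7% long-term.
labels = [1] * 47 + [0] * 953      # 1 = long-term stay, 0 = short stay
rows = [[i] for i in range(1000)]  # placeholder feature vectors

bal_rows, bal_labels = random_undersample(rows, labels)
counts = Counter(bal_labels)
print(counts[0], counts[1], len(bal_rows))
```

The balanced rows and labels would then be passed to any standard classifier. Undersampling discards most of the short-stay records, which is the kind of trade-off that motivates comparing several sampling methods.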