Evaluation of classification methods for the prediction of hospital length of stay using Medicare claims data

Proceedings of the 7th International Conference on PErvasive Technologies Related to Assistive Environments Pub Date : 2014-05-27 DOI:10.1145/2674396.2674430

D. Zikos, K. Tsiakas, Fadiah Qudah, V. Athitsos, F. Makedon


Citations: 4

Abstract



In this paper, we investigate the performance of a series of classification methods for predicting hospital Length of Stay (LOS), based on two temporally sequential clinical scenarios. We used a 2012 Medicare Provider Analysis and Review (MedPar) dataset, which contains records of Medicare beneficiaries who used inpatient hospital services. Our subset included 300,000 randomly selected cases. During preprocessing we added new features and linked our data with external datasets using common key identifiers. In the first scenario, our goal was to predict the LOS using a subset of information that is readily available to the clinician upon patient admission, while the second scenario assumes that additional data are available (information on the patient's diagnosis and clinical procedures). For our experiments we used three different classifiers: Naïve Bayes, AdaBoost and the C4.5 decision tree, for two different LOS cut-off points (4-day and 12-day hospital stay). The overall performance of our classifiers ranged from fair to very good. On the other hand, the true positive rate, that is, the correct classification of long hospital stays, was low, with the exception of Naïve Bayes, which demonstrated significantly better performance in the second scenario. Our results indicate that Naïve Bayes may be used for the prediction of in-hospital LOS. Our analysis also indicates that MedPar data, combined with other data resources, has the potential to provide a good basis for robust prediction analytics in hospitals.
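The gap the abstract reports between overall performance and true positive rate can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the stay lengths are invented rather than taken from MedPar, a stay is treated as "long" when it exceeds the cut-off (the abstract does not say whether the cut-off day itself counts), and the always-short baseline is not one of the paper's classifiers.

```python
def binarize_los(los_days, cutoff):
    """Label a stay 1 (long) if it exceeds the cut-off, else 0 (short).

    Using a strict '>' is an assumption; the abstract does not specify it.
    """
    return [1 if days > cutoff else 0 for days in los_days]


def accuracy_and_tpr(y_true, y_pred):
    """Return (accuracy, true positive rate) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # TPR is undefined when there are no long stays; report 0.0 in that case.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, tpr


# Hypothetical lengths of stay in days (illustrative only).
los_days = [1, 2, 2, 3, 3, 4, 5, 6, 8, 10, 13, 15, 20, 2, 3, 1, 4, 7, 9, 30]

for cutoff in (4, 12):  # the two cut-off points named in the abstract
    y_true = binarize_los(los_days, cutoff)
    # Trivial baseline that never predicts a long stay.
    y_pred = [0] * len(y_true)
    acc, tpr = accuracy_and_tpr(y_true, y_pred)
    print(f"cut-off {cutoff} days: accuracy={acc:.2f}, TPR={tpr:.2f}")
```

With the 12-day cut-off only 4 of the 20 invented stays are long, so the baseline reaches 0.80 accuracy while detecting none of them. This is why the true positive rate has to be read alongside overall performance when long stays are rare.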
