Evaluating Model Predictive Performance: A Medicare Fraud Detection Case Study

Richard A. Bauder, Matthew Herland, T. Khoshgoftaar
{"title":"评估模型预测性能:医疗保险欺诈检测案例研究","authors":"Richard A. Bauder, Matthew Herland, T. Khoshgoftaar","doi":"10.1109/IRI.2019.00016","DOIUrl":null,"url":null,"abstract":"Evaluating a machine learning model's predictive performance is vital for establishing the practical usability in real-world applications. The use of separate training and test datasets, and cross-validation are common when evaluating machine learning models. The former uses two distinct datasets, whereas cross-validation splits a single dataset into smaller training and test subsets. In real-world production applications, it is critical to establish a model's usefulness by validating it on completely new input data, and not just using the crossvalidation results on a single historical dataset. In this paper, we present results for both evaluation methods, to include performance comparisons. In order to provide meaningful comparative analyses between methods, we perform real-world fraud detection experiments using 2013 to 2016 Medicare durable medical equipment claims data. This Medicare dataset is split into training (2013 to 2015 individual years) and test (2016 only). Using this Medicare case study, we assess the fraud detection performance, across three learners, for both model evaluation methods. We find that using the separate training and test sets generally outperforms cross-validation, indicating a better real-world model performance evaluation. Even so, cross-validation has comparable, but conservative, fraud detection results.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"27 34","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Evaluating Model Predictive Performance: A Medicare Fraud Detection Case Study\",\"authors\":\"Richard A. Bauder, Matthew Herland, T. Khoshgoftaar\",\"doi\":\"10.1109/IRI.2019.00016\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Evaluating a machine learning model's predictive performance is vital for establishing the practical usability in real-world applications. The use of separate training and test datasets, and cross-validation are common when evaluating machine learning models. The former uses two distinct datasets, whereas cross-validation splits a single dataset into smaller training and test subsets. In real-world production applications, it is critical to establish a model's usefulness by validating it on completely new input data, and not just using the crossvalidation results on a single historical dataset. In this paper, we present results for both evaluation methods, to include performance comparisons. In order to provide meaningful comparative analyses between methods, we perform real-world fraud detection experiments using 2013 to 2016 Medicare durable medical equipment claims data. This Medicare dataset is split into training (2013 to 2015 individual years) and test (2016 only). Using this Medicare case study, we assess the fraud detection performance, across three learners, for both model evaluation methods. We find that using the separate training and test sets generally outperforms cross-validation, indicating a better real-world model performance evaluation. 
Even so, cross-validation has comparable, but conservative, fraud detection results.\",\"PeriodicalId\":295028,\"journal\":{\"name\":\"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)\",\"volume\":\"27 34\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IRI.2019.00016\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IRI.2019.00016","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 6

Abstract

Evaluating a machine learning model's predictive performance is vital for establishing its practical usability in real-world applications. Separate training and test datasets, as well as cross-validation, are commonly used when evaluating machine learning models. The former uses two distinct datasets, whereas cross-validation splits a single dataset into smaller training and test subsets. In real-world production applications, it is critical to establish a model's usefulness by validating it on completely new input data, not just by using cross-validation results on a single historical dataset. In this paper, we present results for both evaluation methods, including performance comparisons. In order to provide meaningful comparative analyses between the methods, we perform real-world fraud detection experiments using 2013 to 2016 Medicare durable medical equipment claims data. This Medicare dataset is split into training (the individual years 2013 to 2015) and test (2016 only) sets. Using this Medicare case study, we assess the fraud detection performance of both model evaluation methods across three learners. We find that using separate training and test sets generally outperforms cross-validation, indicating a better real-world model performance evaluation. Even so, cross-validation has comparable, but conservative, fraud detection results.
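To make the contrast between the two evaluation schemes concrete, the sketch below compares k-fold cross-validation on historical data against a temporal train/test split, in the spirit of the paper's 2013-2015 train / 2016 test setup. It is a minimal illustration only: the synthetic data from make_classification, the single random forest learner, and the AUC metric are assumptions for demonstration, not the authors' actual pipeline (which used real Medicare claims and three learners).

```python
# Minimal sketch contrasting the two model-evaluation schemes discussed
# in the abstract: k-fold cross-validation on historical data versus a
# temporal train/test split (train on earlier years, test on a later one).
# Synthetic data and the random forest learner are illustrative
# assumptions, not the authors' exact pipeline.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score

rng = np.random.RandomState(42)

# Stand-in for Medicare claims: features X, (rare) fraud labels y, and a
# "year" column marking when each claim was filed.
X, y = make_classification(n_samples=4000, n_features=20,
                           weights=[0.95], random_state=42)
year = rng.choice([2013, 2014, 2015, 2016], size=len(y))

learner = RandomForestClassifier(n_estimators=100, random_state=42)

# Scheme 1: cross-validation restricted to the historical years (2013-2015).
hist = year < 2016
cv_auc = cross_val_score(learner, X[hist], y[hist], cv=5,
                         scoring="roc_auc").mean()

# Scheme 2: separate train/test sets -- fit on 2013-2015, score on 2016,
# mimicking validation on completely new input data.
learner.fit(X[hist], y[hist])
test = year == 2016
split_auc = roc_auc_score(y[test], learner.predict_proba(X[test])[:, 1])

print(f"5-fold CV AUC (2013-2015):         {cv_auc:.3f}")
print(f"Temporal split AUC (test on 2016): {split_auc:.3f}")
```

On i.i.d. synthetic data the two estimates will come out close by construction; the paper's finding concerns real claims data, where year-to-year distribution shift makes the temporal split the more realistic gauge of production performance.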