Joffrey L. Leevy, John T. Hancock, T. Khoshgoftaar
2023 IEEE 24th International Conference on Information Reuse and Integration for Data Science (IRI), published 2023-08-01. DOI: https://doi.org/10.1109/IRI58017.2023.00053
Assessing One-Class and Binary Classification Approaches for Identifying Medicare Fraud
Machine learning research on Medicare fraud detection is of national importance, primarily due to the extensive financial losses caused by this deceptive practice. Our big data study focuses on the Medicare Part D dataset, which we utilize to detect healthcare fraud perpetrated by physicians. In this paper, we compare and contrast One-Class Classification (OCC) and binary classification by examining eight different classifiers. The metrics applied in this analysis are Area Under the Receiver Operating Characteristic Curve (AUC) and Area Under the Precision-Recall Curve (AUPRC). Our findings indicate that binary classification outperforms OCC in Medicare fraud detection. Furthermore, we establish that the Decision Tree-based classifiers employed in the research are the most effective, with CatBoost delivering the best performance.
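The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic stand-ins rather than Medicare Part D records, and the two centroid-based scorers are hypothetical toy substitutes for the eight classifiers in the study (CatBoost is not used here). The names `make_rows`, `compare`, `centroid`, and `sq_dist` are assumptions introduced for illustration. AUPRC is approximated by average precision, a common step-wise estimate of the area under the precision-recall curve.

```python
import random
from statistics import fmean


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision at each true positive, ranked by descending score."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def make_rows(n_normal, n_fraud, rng):
    """Synthetic, highly imbalanced data (assumption): fraud is shifted on feature 1."""
    rows = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(n_normal)]
    rows += [((rng.gauss(2, 1), rng.gauss(0, 1)), 1) for _ in range(n_fraud)]
    rng.shuffle(rows)
    return rows


def centroid(points):
    return tuple(fmean(axis) for axis in zip(*points))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def compare(seed=0):
    rng = random.Random(seed)
    train, test = make_rows(2000, 40, rng), make_rows(2000, 40, rng)
    normal_c = centroid([x for x, y in train if y == 0])
    fraud_c = centroid([x for x, y in train if y == 1])
    labels = [y for _, y in test]
    # One-class scorer: fitted on the non-fraud class only; distance = anomaly score.
    occ = [sq_dist(x, normal_c) for x, _ in test]
    # Binary scorer: uses both classes (nearest-centroid margin toward fraud).
    binary = [sq_dist(x, normal_c) - sq_dist(x, fraud_c) for x, _ in test]
    return {
        "one-class": (roc_auc(labels, occ), average_precision(labels, occ)),
        "binary": (roc_auc(labels, binary), average_precision(labels, binary)),
    }


if __name__ == "__main__":
    for name, (auc, auprc) in compare().items():
        print(f"{name:9s} AUC={auc:.3f} AUPRC={auprc:.3f}")
```

The one-class scorer sees only non-fraud examples during fitting, while the binary scorer also uses the labeled fraud examples. That is the structural difference between the two approaches being compared. Whatever numbers this toy run prints reflect only the synthetic data and say nothing about the paper's reported results.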