Outlier Detection in Inpatient Claims Using DBSCAN and K-Means

Panca Oktavia Candra Sari, Suharjito Suharjito
Jurnal Sarjana Teknik Informatika, published 2022-06-24. DOI: https://doi.org/10.15408/jti.v15i1.25682
Citations: 0

Abstract

Health insurance helps people obtain quality, affordable health services. In the claim billing process, codes are entered into the system manually, which can introduce errors and lead to claims being suspected of fraud. Claims suspected of fraud are traced manually to find incorrect inputs. As the volume of claims grows, the accuracy of this tracing falls and it consumes more time and effort. To help prevent and reduce fraud, this study aims to identify the data patterns associated with fraud by forming data groupings. The data was prepared by combining inpatient bill claims and patient bills from hospitals in 2020. Two methods were used to form clusters: DBSCAN and K-Means. Local Outlier Factor (LOF) was added to identify outliers within the clusters. The experimental results show that both methods can detect outlier data and show how it is distributed across the formed clusters. The variables with the strongest effect on whether a record becomes an outlier are length of stay, claims code, and the patient's condition at discharge. K-Means achieved an accuracy of 0.391, a reported 0.003 higher than DBSCAN's 0.389.
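The combination of clustering and Local Outlier Factor described in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' pipeline. The toy records, the two numeric features (length of stay in days and a bill amount), the values `k=2` and `k=3`, the deterministic seeding, and the LOF cutoff of 1.5 are all assumptions made for illustration. Only K-Means is shown, and a DBSCAN labelling could take its place in the clustering step. LOF is scored over the whole dataset here, and flagged records are then grouped by cluster, which is one possible reading of the abstract.

```python
import math

def kmeans(points, k, iters=20):
    """Lloyd's algorithm with deterministic, evenly spaced seeds (an assumption)."""
    n = len(points)
    centroids = [list(points[i * n // k]) for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def lof_scores(points, k):
    """Simplified LOF: exactly k neighbours per point, no duplicate points."""
    n = len(points)
    dist = [[math.dist(a, b) for b in points] for a in points]
    # Indices of the k nearest neighbours of each point, excluding itself.
    knn = [sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
           for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbour.
    k_dist = [dist[i][knn[i][-1]] for i in range(n)]
    # Local reachability density: inverse of the mean reachability distance.
    lrd = [k / sum(max(k_dist[j], dist[i][j]) for j in knn[i]) for i in range(n)]
    # LOF: mean density of the neighbours divided by the point's own density.
    return [sum(lrd[j] for j in knn[i]) / (k * lrd[i]) for i in range(n)]

# Hypothetical toy records: (length_of_stay_days, bill_amount).
claims = [
    (2, 1.0), (3, 1.2), (2, 1.5), (3, 0.9), (4, 1.1),       # short stays
    (10, 5.0), (11, 5.3), (10, 5.5), (12, 5.1), (11, 4.8),  # long stays
    (25, 1.0),                                               # unusual record
]

labels = kmeans(claims, k=2)
scores = lof_scores(claims, k=3)

# Group records whose LOF exceeds the (assumed) cutoff by their cluster.
flagged = {}
for i, (lab, score) in enumerate(zip(labels, scores)):
    if score > 1.5:
        flagged.setdefault(lab, []).append(i)
print(flagged)
```

On this toy data only the last record, the 25-day stay, is flagged, and it falls in the long-stay cluster. In practice, scikit-learn's `KMeans`, `DBSCAN`, and `LocalOutlierFactor` classes would normally replace these hand-written functions, and features on different scales would need to be normalized first.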