Clustering based fuzzy classification with a noise cluster in detecting fraud in insurance

IF 7.2; Zone 1, Computer Science; Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Soft Computing. Pub Date: 2024-10-30. DOI: 10.1016/j.asoc.2024.112430

Oguz Koc, Furkan Baser, A. Sevtap Selcuk-Kestel

Applied Soft Computing, Volume 167, Article 112430. https://www.sciencedirect.com/science/article/pii/S1568494624012043

Citations: 0

Abstract

Fraud detection is one of the main issues in reducing unsystematic risk in the insurance business, since the cost of fraud can reach catastrophic amounts and lead to higher loadings on reserves and premiums. Because fraud is diverse in nature, detecting it may require a wide range of factors and variables to be considered. Scoring systems are an important aid for relating many factors logically, revealing their differences, estimating odds (or probabilities), and predicting fraud risk. In this paper, we introduce a clustering-based fuzzy classification with a noise cluster (CBFCN) to identify the true state of a fraud. The proposed approach is based on fuzzy k-means clustering with a noise cluster (FKMN) and is a novel method for identifying outliers by achieving robust clustering. We integrate fuzzy theory to boost the predictive ability of machine learning (ML) approaches and to determine the contributing features properly. The two critical features of the CBFCN method, which are membership values obtained from the FKMN clustering algorithm, are used to better capture the behavior of the existing structure and to detect noise (extremes) in the dataset. Extensive analyses on two real datasets whose variables exhibit different characteristics demonstrate how CBFCN performs in detecting fraud compared with conventional approaches. Additionally, we elaborate on how a fuzzy approach that includes noise clusters improves ML performance. The findings indicate that the proposed CBFCN models produce promising classification results for detecting fraud in insurance claims.
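The abstract does not give the FKMN update equations, so the following is only a minimal sketch of the general idea, assuming the classical noise-clustering formulation in which a virtual noise cluster lies at a fixed distance `delta` from every point. The function names, the fuzzifier `m = 2`, the value of `delta`, and the synthetic data are all illustrative assumptions and are not taken from the paper. The sketch computes cluster memberships and a noise membership, then appends them to the original variables as extra features that a downstream classifier could use.

```python
import numpy as np


def memberships(X, centers, delta, m=2.0):
    """Return (U, noise): memberships to each cluster and to the noise cluster."""
    # Squared Euclidean distance from every point to every center, shape (n, k).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)  # guard against division by zero
    power = 1.0 / (m - 1.0)
    inv = (1.0 / d2) ** power
    # Assumption: the noise cluster sits at the same distance delta from every point.
    inv_noise = (1.0 / delta**2) ** power
    denom = inv.sum(axis=1) + inv_noise
    return inv / denom[:, None], inv_noise / denom


def fuzzy_kmeans_noise(X, init_centers, delta, m=2.0, n_iter=100, tol=1e-8):
    """Alternate membership and center updates until the centers stop moving."""
    centers = np.asarray(init_centers, dtype=float).copy()
    for _ in range(n_iter):
        U = memberships(X, centers, delta, m)[0]
        W = U**m
        # Each center is the mean of the data weighted by memberships^m.
        new_centers = (W.T @ X) / W.sum(axis=0)[:, None]
        converged = np.abs(new_centers - centers).max() < tol
        centers = new_centers
        if converged:
            break
    U, noise = memberships(X, centers, delta, m)
    return centers, U, noise


# Synthetic data (not from the paper): two tight blobs plus one extreme point.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(20, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(20, 2))
outlier = np.array([[50.0, -50.0]])
X = np.vstack([blob_a, blob_b, outlier])

# Start from one point in each blob so the run is deterministic.
centers, U, noise = fuzzy_kmeans_noise(X, init_centers=X[[0, 20]], delta=2.0)

# Hypothetical feature augmentation: original variables plus membership values.
features = np.hstack([X, U, noise[:, None]])
```

In this toy setup the extreme point receives almost all of its membership from the noise cluster, while points inside the blobs receive little. How the paper selects `delta`, the number of clusters, and the classifiers that consume these features is not stated in the abstract.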


Source journal

Applied Soft Computing (Engineering & Technology, Computer Science: Interdisciplinary Applications)

CiteScore

15.80

Self-citation rate

6.90%

Articles published

874

Review time

10.9 months

Journal description: Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest quality research in application and convergence of the areas of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets and other similar techniques to address real-world complexities. Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site will continuously be updated with new articles and the publication time will be short.