Anomaly Detection of Credit Data Based on Sparse Subspace Clustering Undersampling
Authors: Ruyao Sun, Lingling Wang, Jinping Tang, Bo Bi
Published in: 2023 IEEE 6th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), 2023-02-24
DOI: https://doi.org/10.1109/ITNEC56291.2023.10082623
Digitalization has introduced many emerging risks in internet finance and other fields. Fraudulent behavior in credit data, for example, can be regarded as outliers, so detecting those outliers is of considerable practical importance. However, because real data sets are high-dimensional and contain few outliers, most anomaly detection algorithms based directly on clustering perform poorly, and a method is needed that can detect anomalies in imbalanced credit data sets in high-dimensional space. This paper detects outliers in credit data using undersampling based on sparse subspace clustering: it clusters the high-dimensional, imbalanced credit data set with sparse subspace clustering, uses the clustering results to undersample the data and construct a balanced data set, and then applies a classifier to detect outliers. Comparative experiments verify the effectiveness of the proposed algorithm for credit data outlier detection and show that it addresses shortcomings of traditional clustering and high-dimensional outlier detection algorithms.
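The undersampling step described in the abstract might look roughly like the following minimal sketch. It is not the authors' implementation. It assumes that a clustering step (sparse subspace clustering in the paper) has already assigned a cluster label to each majority-class (normal) sample, and that the majority class is reduced to the size of the minority (outlier) class by drawing from each cluster in proportion to its size. The function name `undersample_by_cluster`, the proportional quota rule, and the toy labels are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def undersample_by_cluster(cluster_labels, n_keep, seed=0):
    """Return sorted indices of n_keep majority samples, drawn from each
    cluster in proportion to its size (largest-remainder rounding)."""
    total = len(cluster_labels)
    if not 0 <= n_keep <= total:
        raise ValueError("n_keep must be between 0 and the number of samples")
    if total == 0:
        return []
    rng = random.Random(seed)

    # Group sample indices by the cluster they were assigned to.
    members = defaultdict(list)
    for idx, label in enumerate(cluster_labels):
        members[label].append(idx)

    # Proportional quota per cluster, computed with integer arithmetic.
    labels = sorted(members)
    quota = {c: n_keep * len(members[c]) // total for c in labels}

    # Hand out the slots lost to rounding down, largest remainder first.
    leftover = n_keep - sum(quota.values())
    by_remainder = sorted(
        labels,
        key=lambda c: (n_keep * len(members[c])) % total,
        reverse=True,
    )
    for c in by_remainder[:leftover]:
        quota[c] += 1

    # Draw the quota from each cluster without replacement.
    kept = []
    for c in labels:
        kept.extend(rng.sample(members[c], quota[c]))
    return sorted(kept)


# Hypothetical toy data: 10 majority samples in two clusters, 4 minority samples.
majority_clusters = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
n_minority = 4
kept = undersample_by_cluster(majority_clusters, n_keep=n_minority)
```

The retained majority indices in `kept`, together with all minority samples, would form the balanced data set passed to the classifier. Sampling per cluster rather than uniformly is one way to keep every cluster of normal behavior represented after undersampling; the paper may use a different selection rule.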