Anomaly detection based on improved k-nearest neighbor rough sets

IF 3.2, Region 3 (Computer Science), JCR Q2 (Computer Science, Artificial Intelligence)

International Journal of Approximate Reasoning, vol. 176, Article 109323. Pub Date: 2024-11-13. DOI: 10.1016/j.ijar.2024.109323

Xiwen Chen, Zhong Yuan, Shan Feng


Citations: 0

Abstract


Anomaly detection based on improved k-nearest neighbor rough sets

The neighborhood rough set model is an effective model for processing incomplete, imprecise, and otherwise uncertain data. It has been applied in several fields, such as anomaly detection and data classification. However, most current neighborhood rough set models determine the neighborhood radius unreasonably and adapt poorly to the data. To obtain an adaptive neighborhood radius and make granulation results more reasonable, this paper proposes an improved k-nearest neighbor rough set model that uses the kth-distance as the k-nearest neighborhood radius, and constructs an anomaly detection model on top of it. In the method, the k-nearest neighborhood radius is first used to calculate the k-nearest neighbor relation. Then, the anomaly degree of granule (GAD) is defined to measure the anomaly degree of k-nearest neighbor granules by combining approximation accuracy with local density. Next, the GADs of an object's k-nearest neighbor granules generated by different attribute subsets are calculated, and the anomaly score (AS) is constructed from them. Finally, an anomaly detection algorithm is designed. The proposed method is compared with several mainstream anomaly detection methods on public datasets, and the results indicate that it detects anomalies better than the compared methods.
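The abstract's first step, using the kth-distance as an adaptive neighborhood radius, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Euclidean metric, the convention that a granule contains the object itself, and the optional `attrs` argument for restricting the distance to an attribute subset are all assumptions made for illustration. The abstract does not give the GAD or AS formulas, so they are not sketched here.

```python
import math
from typing import List, Optional, Sequence

Point = Sequence[float]


def distance(a: Point, b: Point, attrs: Optional[Sequence[int]] = None) -> float:
    """Euclidean distance over the attribute indices in `attrs` (all if None).

    The choice of Euclidean distance is an assumption of this sketch.
    """
    idx = range(len(a)) if attrs is None else attrs
    return math.sqrt(sum((a[t] - b[t]) ** 2 for t in idx))


def kth_distance(data: Sequence[Point], i: int, k: int,
                 attrs: Optional[Sequence[int]] = None) -> float:
    """Distance from object i to its k-th nearest other object."""
    others = sorted(
        distance(data[i], data[j], attrs) for j in range(len(data)) if j != i
    )
    return others[k - 1]


def knn_granule(data: Sequence[Point], i: int, k: int,
                attrs: Optional[Sequence[int]] = None) -> List[int]:
    """Indices of all objects within the kth-distance of object i.

    The radius is computed per object, so it adapts to local density.
    Object i is included in its own granule (an assumption of this sketch).
    """
    radius = kth_distance(data, i, k, attrs)
    return [
        j for j in range(len(data))
        if distance(data[i], data[j], attrs) <= radius
    ]


if __name__ == "__main__":
    # Three clustered objects and one isolated object on a single attribute.
    data = [(0.0,), (1.0,), (2.0,), (10.0,)]
    for i in range(len(data)):
        print(i, kth_distance(data, i, 2), knn_granule(data, i, 2))
```

With k = 2 on this toy data, object 0 gets a radius of 2.0 while the isolated object 3 gets a radius of 9.0. The radius grows in sparse regions without any hand-tuned global threshold, which is the adaptivity the abstract refers to.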


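Source journal

International Journal of Approximate Reasoning (Engineering and Technology, Computer Science: Artificial Intelligence)

CiteScore: 6.90

Self-citation rate: 12.80%

Articles published: 170

Review time: 67 days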

About the journal: The International Journal of Approximate Reasoning is intended to serve as a forum for the treatment of imprecision and uncertainty in Artificial and Computational Intelligence, covering both the foundations of uncertainty theories and the design of intelligent systems for scientific and engineering applications. It publishes high-quality research papers describing theoretical developments or innovative applications, as well as review articles on topics of general interest. Relevant topics include, but are not limited to, probabilistic reasoning and Bayesian networks, imprecise probabilities, random sets, belief functions (Dempster-Shafer theory), possibility theory, fuzzy sets, rough sets, decision theory, non-additive measures and integrals, qualitative reasoning about uncertainty, comparative probability orderings, game-theoretic probability, default reasoning, nonstandard logics, argumentation systems, inconsistency-tolerant reasoning, elicitation techniques, and the philosophical foundations and psychological models of uncertain reasoning. Domains of application for uncertain reasoning systems include risk analysis and assessment, information retrieval and database design, information fusion, machine learning, data and web mining, computer vision, image and signal processing, intelligent data analysis, statistics, and multi-agent systems.