基于k -均值算法的数据流异常点检测技术综述

Prashant Chauhan, M. Shukla
{"title":"基于k -均值算法的数据流异常点检测技术综述","authors":"Prashant Chauhan, M. Shukla","doi":"10.1109/ICACEA.2015.7164758","DOIUrl":null,"url":null,"abstract":"Data Stream mining has gained attraction from many researchers as there is need to mine large dataset which pose different challenges for researchers. Stream data is different compared to normal data as they are continuously produced from different applications which impose different challenges like massive, infinite, concept drift for processing. An object that does not obey the behavior of normal data object is called outliers. Outlier detection is used in different applications like fraud detection, intrusion detection, track environmental changes, medical diagnosis so there is need to detect outliers from data streams. Various approaches are used for outlier detection. Some of them use K-Means algorithm for outlier detection in data streams which help to create a similar group or cluster of data points. Data stream clustering techniques are highly helpful to cluster similar data items in data streams and also to detect the outliers from them, so they are called cluster based outlier detection. K-means algorithm is partition based algorithm which is used for clustering datasets into number of clusters. It is most common and popular algorithm for clustering due to its simplicity and efficiency. Purpose of this paper is to review of different approaches of outlier detection which is used for K-Means algorithm for clustering dataset with some other methods. Different application areas of outlier detection are discussed in this paper.","PeriodicalId":202893,"journal":{"name":"2015 International Conference on Advances in Computer Engineering and Applications","volume":"15 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"31","resultStr":"{\"title\":\"A review on outlier detection techniques on data stream by using different approaches of K-Means algorithm\",\"authors\":\"Prashant Chauhan, M. Shukla\",\"doi\":\"10.1109/ICACEA.2015.7164758\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Data Stream mining has gained attraction from many researchers as there is need to mine large dataset which pose different challenges for researchers. Stream data is different compared to normal data as they are continuously produced from different applications which impose different challenges like massive, infinite, concept drift for processing. An object that does not obey the behavior of normal data object is called outliers. Outlier detection is used in different applications like fraud detection, intrusion detection, track environmental changes, medical diagnosis so there is need to detect outliers from data streams. Various approaches are used for outlier detection. Some of them use K-Means algorithm for outlier detection in data streams which help to create a similar group or cluster of data points. Data stream clustering techniques are highly helpful to cluster similar data items in data streams and also to detect the outliers from them, so they are called cluster based outlier detection. K-means algorithm is partition based algorithm which is used for clustering datasets into number of clusters. It is most common and popular algorithm for clustering due to its simplicity and efficiency. Purpose of this paper is to review of different approaches of outlier detection which is used for K-Means algorithm for clustering dataset with some other methods. Different application areas of outlier detection are discussed in this paper.\",\"PeriodicalId\":202893,\"journal\":{\"name\":\"2015 International Conference on Advances in Computer Engineering and Applications\",\"volume\":\"15 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-03-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"31\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 International Conference on Advances in Computer Engineering and Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICACEA.2015.7164758\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 International Conference on Advances in Computer Engineering and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICACEA.2015.7164758","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 31

摘要

由于数据流挖掘需要挖掘大量数据,这给研究人员带来了不同的挑战,因此数据流挖掘受到了许多研究人员的关注。流数据与普通数据不同,因为它们是由不同的应用程序连续产生的,这些应用程序对处理过程提出了不同的挑战,比如海量的、无限的、概念漂移。不遵循正常数据对象行为的对象称为离群值。异常值检测用于欺诈检测、入侵检测、跟踪环境变化、医疗诊断等不同的应用中,因此需要从数据流中检测异常值。不同的方法被用于异常值检测。其中一些使用K-Means算法在数据流中进行离群值检测,这有助于创建类似的数据点组或簇。数据流聚类技术有助于对数据流中相似的数据项进行聚类,并从中检测出异常值,因此被称为基于聚类的异常值检测。K-means算法是一种基于分区的算法,用于将数据集聚成若干个簇。由于它的简单和高效,它是最常见和流行的聚类算法。本文综述了K-Means聚类算法中不同的离群点检测方法和其他方法。本文讨论了离群值检测的不同应用领域。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
A review on outlier detection techniques on data stream by using different approaches of K-Means algorithm
Data Stream mining has gained attraction from many researchers as there is need to mine large dataset which pose different challenges for researchers. Stream data is different compared to normal data as they are continuously produced from different applications which impose different challenges like massive, infinite, concept drift for processing. An object that does not obey the behavior of normal data object is called outliers. Outlier detection is used in different applications like fraud detection, intrusion detection, track environmental changes, medical diagnosis so there is need to detect outliers from data streams. Various approaches are used for outlier detection. Some of them use K-Means algorithm for outlier detection in data streams which help to create a similar group or cluster of data points. Data stream clustering techniques are highly helpful to cluster similar data items in data streams and also to detect the outliers from them, so they are called cluster based outlier detection. K-means algorithm is partition based algorithm which is used for clustering datasets into number of clusters. It is most common and popular algorithm for clustering due to its simplicity and efficiency. Purpose of this paper is to review of different approaches of outlier detection which is used for K-Means algorithm for clustering dataset with some other methods. Different application areas of outlier detection are discussed in this paper.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信