An Evolutionary Stream Clustering Technique for Outlier Detection

Nadilah Ayu Supardi, S. J. Abdulkadir, Norshakirah Aziz
2020 International Conference on Computational Intelligence (ICCI), published 2020-10-08
DOI: 10.1109/ICCI51257.2020.9247832 (https://doi.org/10.1109/ICCI51257.2020.9247832)
Citations: 2

Abstract

Data stream clustering has become a popular research topic as the field continues to develop. Data streams pose several challenges for clustering, such as limited time, limited memory, and the need to cluster in a single scan. In addition, identifying clusters of arbitrary shape is important in data stream applications. A data stream is an unbounded sequence of elements that evolves over time, and the number of clusters is not known in advance. Factors such as occasional noise can also negatively affect the data stream environment. Density-based techniques have proven effective for clustering data streams: they are computationally efficient, can produce clusters of arbitrary shape, and detect noise immediately. In general, they do not require the number of clusters to be specified in advance. Most traditional density-based clustering algorithms are not directly applicable to data streams because of the characteristics of data streams, but nearly all of them can be extended into variants suited to streams. This work focuses on density-based techniques in the clustering process to overcome the constraints imposed by the nature of data streams. The paper presents preliminary results for a density-based clustering algorithm (evoStream), investigating outlier detection on three real data sets: KDDCup99, sensor, and power supply. In future work, the algorithm will be extended to optimize the model for detecting outliers in data streams.
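The density-based idea summarized in the abstract can be illustrated with a minimal single-pass sketch. This is not the evoStream algorithm and not the authors' implementation; the class `StreamOutlierSketch`, its parameters `radius`, `decay`, and `min_weight`, and the flagging rule are all assumptions chosen for illustration. Each arriving point is either absorbed by the nearest micro-cluster within `radius` or starts a new one, micro-cluster weights fade exponentially over time, and a point is flagged as a candidate outlier when it lands in a region that is still sparse.

```python
import math
from dataclasses import dataclass


@dataclass
class MicroCluster:
    center: list        # running weighted mean of absorbed points
    weight: float       # decayed count of absorbed points
    last_update: int    # time step of the most recent absorption


class StreamOutlierSketch:
    """Hypothetical single-pass, density-based outlier flagging (illustration only)."""

    def __init__(self, radius, decay=0.01, min_weight=2.0):
        self.radius = radius          # assumed neighbourhood threshold
        self.decay = decay            # assumed fading rate per time step
        self.min_weight = min_weight  # assumed density threshold
        self.clusters = []
        self.t = 0

    def _faded_weight(self, mc):
        # Exponential fading: older micro-clusters lose weight over time.
        return mc.weight * 2.0 ** (-self.decay * (self.t - mc.last_update))

    def process(self, point):
        """Absorb one point; return True if it is a candidate outlier."""
        self.t += 1
        nearest, best = None, math.inf
        for mc in self.clusters:
            d = math.dist(mc.center, point)
            if d < best:
                nearest, best = mc, d

        # No micro-cluster within the radius: start a new one and flag the point.
        if nearest is None or best > self.radius:
            self.clusters.append(MicroCluster(list(point), 1.0, self.t))
            return True

        w = self._faded_weight(nearest)
        sparse = w < self.min_weight  # region was not yet dense before this point
        # Incremental weighted-mean update of the centre.
        nearest.center = [(c * w + x) / (w + 1.0)
                          for c, x in zip(nearest.center, point)]
        nearest.weight = w + 1.0
        nearest.last_update = self.t
        return sparse
```

Under these assumptions the first few points of any new region are flagged until its micro-cluster accumulates enough weight, so in practice a warm-up period or a delayed decision would be needed before treating flags as outliers. Memory use depends on the number of micro-clusters rather than the number of points seen, which matches the single-scan, limited-memory constraints the abstract describes.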