Evolving stream classification using change detection

A. M. Mustafa, Ahsanul Haque, L. Khan, M. Baron, B. Thuraisingham
Published in: 10th IEEE International Conference on Collaborative Computing: Networking, Applications and Worksharing
Publication date: 2014-11-11
DOI: 10.4108/ICST.COLLABORATECOM.2014.257769 (https://doi.org/10.4108/ICST.COLLABORATECOM.2014.257769)
Citations: 5

Abstract

Classifying instances in an evolving data stream is a challenging task because of the stream's properties, e.g., infinite length, concept drift, and concept evolution. Most currently available approaches divide the stream into fixed-size chunks so that the data fit in memory, and process the chunks one after another. However, a fixed chunk size may fail to capture concept drift immediately. We instead try to determine the chunk size dynamically by applying change point detection (CPD) techniques to the stream data. Because the distribution families before and after a change point are generally unknown over the stream, non-parametric CPD algorithms are used. We propose a multi-dimensional non-parametric CPD technique that determines chunk boundaries over data streams dynamically, which leads to better models for classifying instances of evolving data streams. Experimental results show that our approach detects the change points and classifies instances of evolving data streams with high accuracy compared to baseline approaches.
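
The idea of letting a change detector place chunk boundaries can be illustrated with a minimal sketch. The abstract does not specify the authors' multi-dimensional CPD technique, so the code below swaps in a simpler stand-in: a per-feature two-sample Kolmogorov-Smirnov test with a Bonferroni correction and the asymptotic critical value. The names `ks_statistic`, `ks_threshold`, `dynamic_chunks`, `demo_stream`, and the parameters `window` and `alpha` are all assumptions made for illustration and do not come from the paper.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def ks_threshold(n, m, alpha):
    """Asymptotic two-sample KS critical value at significance level alpha."""
    return math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))


def dynamic_chunks(stream, window=50, alpha=0.01):
    """Yield variable-size chunks, closing a chunk when drift is flagged.

    Hypothetical scheme (not the paper's): compare the first `window`
    instances of the current chunk with its last `window` instances,
    one feature at a time.
    """
    chunk = []
    for x in stream:
        chunk.append(x)
        if len(chunk) < 2 * window:
            continue  # not enough data yet for two disjoint windows
        d = len(x)
        limit = ks_threshold(window, window, alpha / d)  # Bonferroni over d features
        ref, recent = chunk[:window], chunk[-window:]
        drift = any(
            ks_statistic([r[j] for r in ref], [r[j] for r in recent]) > limit
            for j in range(d)
        )
        if drift:
            yield chunk[:-window]    # boundary at the start of the recent window
            chunk = chunk[-window:]  # the recent window seeds the next chunk
    if chunk:
        yield chunk


def demo_stream(n=400, change_at=200):
    """Deterministic 2-D toy stream; the first feature shifts by +5 at change_at."""
    return [
        ((7 * i % 10) / 10 + (5.0 if i >= change_at else 0.0), (3 * i % 10) / 10)
        for i in range(n)
    ]


if __name__ == "__main__":
    sizes = [len(c) for c in dynamic_chunks(demo_stream())]
    print(sizes)  # chunk sizes vary with where drift is flagged
```

Each yielded chunk could then be used to train or update a classifier, in place of a fixed-size chunk. Two limitations of this sketch are worth noting: the boundary is placed at the start of the recent window, so it can precede the true change by up to `window` instances and may produce a short transitional chunk, and running the test after every instance inflates the false-alarm rate beyond the nominal `alpha`.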