Unsupervised Drift Detector Ensembles for Data Stream Mining

Lukasz Korycki, B. Krawczyk
Published in: 2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2019-10-01
DOI: 10.1109/DSAA.2019.00047 (https://doi.org/10.1109/DSAA.2019.00047)
Citations: 12

Abstract

Data stream mining is among the most contemporary branches of machine learning. Potentially infinite data sources offer many opportunities and, at the same time, pose new challenges. To handle streaming data properly, we need to improve our well-established methods so that they can work with dynamic data and under strict constraints. Supervised streaming machine learning algorithms require a certain number of labeled instances in order to stay up to date. Since high budgets dedicated to this purpose are usually infeasible, we have to limit supervision as much as we can. One possible approach is to trigger labeling only if a change is explicitly indicated by a detector. While there are several supervised algorithms dedicated to this purpose, the more practical unsupervised ones still lack proper attention. In this paper, we propose a novel unsupervised ensemble drift detector that recognizes local changes in feature subspaces (EDFS) without additional supervision, using specialized committees of incremental Kolmogorov-Smirnov tests. We combine it with an adaptive classifier, which is updated only if the drift detector signals a change. Our experiments show that the framework is able to adapt efficiently to various concept drifts and outperforms other unsupervised algorithms.
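
The idea of checking feature subspaces with Kolmogorov-Smirnov tests can be illustrated with a minimal sketch. This is not the authors' EDFS implementation: it uses a plain batch two-sample KS test with the standard asymptotic critical value instead of an incremental one, and the subspace layout, the "any feature rejects" rule, and all function names (`ks_statistic`, `ks_rejects`, `subspace_drift`) are assumptions made for illustration only.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)  # empirical CDF of a at x
        fb = bisect_right(sb, x) / len(sb)  # empirical CDF of b at x
        d = max(d, abs(fa - fb))
    return d


def ks_rejects(a, b, alpha=0.01):
    """Reject 'same distribution' using the asymptotic critical value."""
    n, m = len(a), len(b)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(a, b) > critical


def subspace_drift(reference, window, subspaces, alpha=0.01):
    """Return the subspaces whose committee reports a local change.

    Assumption: a subspace reports drift if any of its features fails
    the KS test between the reference data and the recent window.
    No labels are used at any point.
    """
    drifted = []
    for subspace in subspaces:
        for f in subspace:
            ref_col = [row[f] for row in reference]
            win_col = [row[f] for row in window]
            if ks_rejects(ref_col, win_col, alpha):
                drifted.append(subspace)
                break
    return drifted


# Hypothetical data: three features, only feature 2 changes in the window.
reference = [[float(i), float(i % 10), float(-i)] for i in range(200)]
window = [[r[0], r[1], r[2] + 500.0] for r in reference]
print(subspace_drift(reference, window, [(0, 1), (1, 2)]))
```

In a labeling-on-demand setup like the one described above, a non-empty result would be the moment to request labels and update the classifier; an empty result means no labels are requested.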