Diversity in Ensemble Model for Classification of Data Streams with Concept Drift

Michal Kolárik, M. Sarnovský, Ján Paralič
Published in: 2021 IEEE 19th World Symposium on Applied Machine Intelligence and Informatics (SAMI), 2021-01-21
DOI: 10.1109/SAMI50585.2021.9378625 (https://doi.org/10.1109/SAMI50585.2021.9378625)
Citations: 2

Abstract

A data stream is a continuous flow of data, in many forms, arriving from different sources. Data streams are usually non-stationary: their underlying structure changes continually. Predictive and classification tasks on such data must take this into account. Traditional machine learning models applied to drifting data may become invalid when a concept change occurs. Tackling this problem requires adaptive learning models equipped with mechanisms that can track the drifting data. Adaptive ensembles are among the most popular groups of such methods. This paper describes the design and implementation of a novel adaptive ensemble learning model built as a robust ensemble of heterogeneous members. We used k-NN, Naive Bayes and Hoeffding trees as base learners and implemented an update mechanism that uses dynamic class weighting and a Q-statistic diversity calculation to maintain the diversity of the ensemble. The model was experimentally evaluated on streaming datasets, and the effects of the diversity calculation were analyzed.
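
The abstract does not spell out how the Q statistic is computed inside the model. As a rough illustration only, the sketch below uses the standard pairwise definition (Yule's Q) for two classifiers scored on the same examples: Q = (N11·N00 − N01·N10) / (N11·N00 + N01·N10), where N11 counts examples both classifiers got right, N00 examples both got wrong, and N10/N01 examples only one got right. Values near 1 mean the two members tend to err on the same examples (low diversity); values near 0 or below indicate more diverse members. The function names `q_statistic` and `mean_pairwise_q`, the choice to return 0.0 when the denominator is zero, and the averaging over all pairs are assumptions made for this example, not the authors' implementation.

```python
from itertools import combinations
from typing import Sequence


def q_statistic(correct_a: Sequence[bool], correct_b: Sequence[bool]) -> float:
    """Yule's Q for two classifiers, given per-example correctness flags."""
    if len(correct_a) != len(correct_b):
        raise ValueError("both classifiers must be scored on the same examples")

    n11 = n00 = n10 = n01 = 0
    for a, b in zip(correct_a, correct_b):
        if a and b:
            n11 += 1      # both correct
        elif not a and not b:
            n00 += 1      # both wrong
        elif a:
            n10 += 1      # only the first classifier correct
        else:
            n01 += 1      # only the second classifier correct

    denominator = n11 * n00 + n01 * n10
    if denominator == 0:
        # Assumption for this sketch: treat the undefined case as "no association".
        return 0.0
    return (n11 * n00 - n01 * n10) / denominator


def mean_pairwise_q(correctness: Sequence[Sequence[bool]]) -> float:
    """Average Q over all pairs of ensemble members (hypothetical helper)."""
    pairs = list(combinations(correctness, 2))
    if not pairs:
        raise ValueError("need at least two ensemble members")
    return sum(q_statistic(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Correctness of three hypothetical members on the same four stream examples.
    knn = [True, True, False, False]
    naive_bayes = [True, False, True, False]
    hoeffding_tree = [False, False, True, True]
    print(q_statistic(knn, naive_bayes))
    print(mean_pairwise_q([knn, naive_bayes, hoeffding_tree]))
```

In a streaming setting, the correctness flags would typically come from a recent window or chunk of labeled examples, so that the diversity estimate follows the current concept rather than the whole history.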