{"title":"Diversity in Ensemble Model for Classification of Data Streams with Concept Drift","authors":"Michal Kolárik, M. Sarnovský, Ján Paralič","doi":"10.1109/SAMI50585.2021.9378625","DOIUrl":null,"url":null,"abstract":"Data streams can be defined as the continuous stream of data in many forms coming from different sources. Data streams are usually non-stationary with continually changing their underlying structure. Solving of predictive or classification tasks on such data must consider this aspect. Traditional machine learning models applied on the drifting data may become invalid in the case when a concept change appears. To tackle this problem, we must utilize special adaptive learning models, which utilize various tools able to reflect the drifting data. One of the most popular groups of such methods are adaptive ensembles. This paper describes the work focused on the design and implementation of a novel adaptive ensemble learning model, which is based on the construction of a robust ensemble consisting of a heterogeneous set of its members. We used k-NN, Naive Bayes and Hoeffding trees as base learners and implemented an update mechanism, which considers dynamic class-weighting and Q statistics diversity calculation to ensure the diversity of the ensemble. 
The model was experimentally evaluated on the streaming datasets, and the effects of the diversity calculation were analyzed.","PeriodicalId":402414,"journal":{"name":"2021 IEEE 19th World Symposium on Applied Machine Intelligence and Informatics (SAMI)","volume":"125 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE 19th World Symposium on Applied Machine Intelligence and Informatics (SAMI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SAMI50585.2021.9378625","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
Data streams can be defined as continuous flows of data, in many forms, arriving from different sources. They are usually non-stationary: their underlying structure changes continually. Predictive and classification tasks on such data must take this into account. Traditional machine learning models applied to drifting data may become invalid when a concept change occurs. To tackle this problem, we must use adaptive learning models equipped with tools that can reflect the drift. Adaptive ensembles are one of the most popular groups of such methods. This paper describes the design and implementation of a novel adaptive ensemble learning model based on constructing a robust ensemble from a heterogeneous set of members. We used k-NN, Naive Bayes, and Hoeffding trees as base learners and implemented an update mechanism that uses dynamic class weighting and a Q-statistic diversity calculation to ensure the diversity of the ensemble. The model was experimentally evaluated on streaming datasets, and the effects of the diversity calculation were analyzed.
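The abstract does not give the formula behind the Q-statistic diversity calculation. As an illustration only, the sketch below assumes the standard pairwise Q statistic (Yule's Q), computed from per-sample correctness flags of two classifiers: Q = (N11·N00 − N01·N10) / (N11·N00 + N01·N10), where N11 counts samples both classify correctly, N00 samples both misclassify, and N10, N01 samples only one classifies correctly. The function names `q_statistic` and `mean_pairwise_q`, and the choice to return 0.0 when the denominator is zero, are assumptions of this sketch and not taken from the paper.

```python
from itertools import combinations


def q_statistic(correct_a, correct_b):
    """Pairwise Q statistic from per-sample correctness flags (truthy = correct)."""
    if len(correct_a) != len(correct_b):
        raise ValueError("both classifiers must be scored on the same samples")
    n11 = n00 = n10 = n01 = 0
    for a, b in zip(correct_a, correct_b):
        if a and b:
            n11 += 1      # both correct
        elif not a and not b:
            n00 += 1      # both wrong
        elif a:
            n10 += 1      # only the first classifier correct
        else:
            n01 += 1      # only the second classifier correct
    denominator = n11 * n00 + n01 * n10
    if denominator == 0:
        # Assumption of this sketch: treat the undefined case as "no association".
        return 0.0
    return (n11 * n00 - n01 * n10) / denominator


def mean_pairwise_q(correct_by_member):
    """Average Q over all pairs of ensemble members (lower means more diverse)."""
    pairs = list(combinations(correct_by_member, 2))
    if not pairs:
        raise ValueError("need at least two ensemble members")
    return sum(q_statistic(a, b) for a, b in pairs) / len(pairs)
```

Q ranges from -1 to 1. Values near 1 indicate that two members tend to be right and wrong on the same samples, values near 0 indicate roughly independent errors, and negative values indicate that the members err on different samples. An update mechanism that favors diversity would, under this reading, prefer member sets with a lower average Q.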