Incremental learning of concept drift in nonstationary environments.

IEEE Transactions on Neural Networks · Pub Date: 2011-10-01 · Epub Date: 2011-08-04 · DOI: 10.1109/TNN.2011.2160459
Ryan Elwell, Robi Polikar
{"title":"Incremental learning of concept drift in nonstationary environments.","authors":"Ryan Elwell,&nbsp;Robi Polikar","doi":"10.1109/TNN.2011.2160459","DOIUrl":null,"url":null,"abstract":"<p><p>We introduce an ensemble of classifiers-based approach for incremental learning of concept drift, characterized by nonstationary environments (NSEs), where the underlying data distributions change over time. The proposed algorithm, named Learn(++). NSE, learns from consecutive batches of data without making any assumptions on the nature or rate of drift; it can learn from such environments that experience constant or variable rate of drift, addition or deletion of concept classes, as well as cyclical drift. The algorithm learns incrementally, as other members of the Learn(++) family of algorithms, that is, without requiring access to previously seen data. Learn(++). NSE trains one new classifier for each batch of data it receives, and combines these classifiers using a dynamically weighted majority voting. The novelty of the approach is in determining the voting weights, based on each classifier's time-adjusted accuracy on current and past environments. This approach allows the algorithm to recognize, and act accordingly, to the changes in underlying data distributions, as well as to a possible reoccurrence of an earlier distribution. We evaluate the algorithm on several synthetic datasets designed to simulate a variety of nonstationary environments, as well as a real-world weather prediction dataset. Comparisons with several other approaches are also included. Results indicate that Learn(++). NSE can track the changing environments very closely, regardless of the type of concept drift. To allow future use, comparison and benchmarking by interested researchers, we also release our data used in this paper.</p>","PeriodicalId":13434,"journal":{"name":"IEEE transactions on neural networks","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2011-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TNN.2011.2160459","citationCount":"765","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on neural networks","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TNN.2011.2160459","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2011/8/4 0:00:00","PubModel":"Epub","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 765

Abstract

We introduce an ensemble-of-classifiers-based approach for incremental learning of concept drift, characterized by nonstationary environments (NSEs), where the underlying data distributions change over time. The proposed algorithm, named Learn++.NSE, learns from consecutive batches of data without making any assumptions on the nature or rate of drift; it can learn in environments that experience a constant or variable rate of drift, addition or deletion of concept classes, as well as cyclical drift. The algorithm learns incrementally, as do other members of the Learn++ family of algorithms, that is, without requiring access to previously seen data. Learn++.NSE trains one new classifier for each batch of data it receives and combines these classifiers using dynamically weighted majority voting. The novelty of the approach lies in determining the voting weights based on each classifier's time-adjusted accuracy on current and past environments. This allows the algorithm to recognize, and act accordingly to, changes in the underlying data distributions, as well as a possible recurrence of an earlier distribution. We evaluate the algorithm on several synthetic datasets designed to simulate a variety of nonstationary environments, as well as on a real-world weather prediction dataset. Comparisons with several other approaches are also included. Results indicate that Learn++.NSE can track the changing environments very closely, regardless of the type of concept drift. To allow future use, comparison, and benchmarking by interested researchers, we also release the data used in this paper.
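Below is a minimal sketch of the two ideas highlighted in the abstract: training one new classifier per batch and combining all classifiers with a dynamically weighted majority vote whose weights come from sigmoid-averaged (time-adjusted) errors on recent batches. The class name DriftingEnsemble, the depth-limited scikit-learn decision tree used as the base learner, and the sigmoid parameters slope and crossing are illustrative assumptions, not details taken from the paper; this is not the authors' reference implementation of Learn++.NSE.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier  # assumed base learner for illustration


class DriftingEnsemble:
    """Illustrative batch-incremental ensemble in the spirit of Learn++.NSE.

    One new classifier is trained per batch; all classifiers vote with weights
    derived from sigmoid-averaged (time-adjusted) errors on the batches seen so far.
    """

    def __init__(self, slope=0.5, crossing=10):
        self.classifiers = []     # one classifier per batch received
        self.error_history = []   # error_history[k][j] = error of classifier k on its j-th evaluated batch
        self.slope = slope        # sigmoid slope (assumed value)
        self.crossing = crossing  # sigmoid crossing point (assumed value)

    def partial_fit(self, X, y):
        # Train one new classifier on the current batch only (no access to past data).
        self.classifiers.append(DecisionTreeClassifier(max_depth=5).fit(X, y))
        self.error_history.append([])

        # Re-evaluate every classifier on the current batch and record its error.
        for k, clf in enumerate(self.classifiers):
            err = float(np.mean(clf.predict(X) != y))
            self.error_history[k].append(min(max(err, 1e-6), 0.5))  # clip to (0, 0.5]
        return self

    def _voting_weights(self):
        weights = []
        for errs in self.error_history:
            errs = np.asarray(errs)
            age = np.arange(len(errs))
            # Sigmoid averaging: errors on the most recent batches count the most.
            omega = 1.0 / (1.0 + np.exp(-self.slope * (age - self.crossing)))
            omega /= omega.sum()
            avg_err = float(omega @ errs)
            beta = avg_err / (1.0 - avg_err)       # in (0, 1] because avg_err <= 0.5
            weights.append(np.log(1.0 / beta))     # log(1/beta) voting weight
        return np.asarray(weights)

    def predict(self, X):
        # Dynamically weighted majority vote over all classifiers trained so far.
        weights = self._voting_weights()
        votes = np.stack([clf.predict(X) for clf in self.classifiers])  # (n_classifiers, n_samples)
        classes = np.unique(votes)
        scores = np.array([[weights[votes[:, i] == c].sum() for c in classes]
                           for i in range(votes.shape[1])])
        return classes[scores.argmax(axis=1)]
```

Usage follows the batch protocol described in the abstract: call partial_fit(X_t, y_t) on each incoming batch and predict(X) between batches. The full algorithm also reweights training instances according to the existing ensemble's errors before fitting the new classifier, a step this sketch omits for brevity.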

Source journal
IEEE Transactions on Neural Networks (Engineering Technology / Engineering: Electrical & Electronic)
Self-citation rate: 0.00%
Articles published: 2
Review time: 8.7 months