{"title":"Active learning over evolving data streams using paired ensemble framework","authors":"Wenhua Xu, Fengfei Zhao, Zhengcai Lu","doi":"10.1109/ICACI.2016.7449823","DOIUrl":null,"url":null,"abstract":"Stream data is considered as one of the main sources of big data. The inherent scarcity of labeled instances and the underlying concept drift have posed significant challenges on stream data classification in practice. A paired ensemble active learning framework is proposed to tackle the challenges. First, an ensemble model consists of two base classifiers is exploited to detect the changes over time, as well as to make prediction on new instances. Second, two active learning strategies work alternatively to find out the most informative instances without missing the potential changes happened anywhere in the instance space. Third, the informativeness of an instance is measured by a margin based metric, and it can effectively capture uncertain instances. Experimental results on real-world datasets demonstrate that the proposed approach can achieve good predictive accuracy on data streams.","PeriodicalId":211040,"journal":{"name":"2016 Eighth International Conference on Advanced Computational Intelligence (ICACI)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 Eighth International Conference on Advanced Computational Intelligence (ICACI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICACI.2016.7449823","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15
Abstract
Stream data is considered as one of the main sources of big data. The inherent scarcity of labeled instances and the underlying concept drift have posed significant challenges on stream data classification in practice. A paired ensemble active learning framework is proposed to tackle the challenges. First, an ensemble model consists of two base classifiers is exploited to detect the changes over time, as well as to make prediction on new instances. Second, two active learning strategies work alternatively to find out the most informative instances without missing the potential changes happened anywhere in the instance space. Third, the informativeness of an instance is measured by a margin based metric, and it can effectively capture uncertain instances. Experimental results on real-world datasets demonstrate that the proposed approach can achieve good predictive accuracy on data streams.