A. M. Mustafa, Ahsanul Haque, L. Khan, M. Baron, B. Thuraisingham
10th IEEE International Conference on Collaborative Computing: Networking, Applications and Worksharing, 2014-11-11
DOI: 10.4108/ICST.COLLABORATECOM.2014.257769 (https://doi.org/10.4108/ICST.COLLABORATECOM.2014.257769)
Evolving stream classification using change detection
Classifying instances in an evolving data stream is challenging because of the stream's properties, e.g., infinite length, concept drift, and concept evolution. Most existing stream classification approaches divide the stream into fixed-size chunks so that the data fit in memory, and process the chunks one after another. However, a fixed chunk size may fail to capture concept drift as soon as it occurs. We aim to determine the chunk size dynamically by applying change point detection (CPD) techniques to the stream. Because the distribution families before and after a change point are generally unknown, we use non-parametric CPD algorithms. We propose a multi-dimensional non-parametric CPD technique that determines chunk boundaries over data streams dynamically, which leads to better models for classifying instances of evolving data streams. Experimental results show that our approach detects change points and classifies instances of evolving data streams with high accuracy compared to baseline approaches.
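The abstract does not specify the detection statistic, so the following is only a minimal sketch of the general idea of cutting chunks at detected change points. It assumes a stand-in test that is not the authors' technique: a per-dimension two-sample Kolmogorov-Smirnov statistic comparing the first points of the current chunk with its most recent points. The names `ks_statistic` and `dynamic_chunks`, and the `window` and `threshold` values, are hypothetical choices for illustration.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    gap = 0.0
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)  # empirical CDF of a at x
        fb = bisect_right(sb, x) / len(sb)  # empirical CDF of b at x
        gap = max(gap, abs(fa - fb))
    return gap


def dynamic_chunks(stream, window=50, threshold=0.6):
    """Yield variable-size chunks of a multi-dimensional stream.

    Assumption (not from the paper): a chunk is closed when, in any single
    dimension, the KS statistic between the chunk's first `window` points and
    its latest `window` points exceeds `threshold`.
    """
    chunk = []
    for point in stream:
        chunk.append(point)
        if len(chunk) < 2 * window:
            continue  # not enough data yet to compare two windows
        ref, recent = chunk[:window], chunk[-window:]
        score = max(
            ks_statistic([p[d] for p in ref], [p[d] for p in recent])
            for d in range(len(point))
        )
        if score > threshold:
            yield chunk[:-window]     # close the chunk before the recent window
            chunk = chunk[-window:]   # recent points start the next chunk
    if chunk:
        yield chunk


# Synthetic two-dimensional stream whose first dimension shifts at index 300.
random.seed(0)
stream = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(300)]
stream += [(random.gauss(4, 1), random.gauss(0, 1)) for _ in range(300)]

chunks = list(dynamic_chunks(stream))
print([len(c) for c in chunks])
```

Each yielded chunk could then be used to train or update a classifier, in place of a fixed-size chunk. Taking the maximum over per-dimension tests ignores dependence between dimensions, so it is a simplification of a genuinely multi-dimensional detector.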