Effective data stream mining using ensemble on cloud with load balancing (E2CL)
Authors: Jagadheeswaran Kathirvel, E. Parasuraman
Published in: 2015 International Conference on Computing and Communications Technologies (ICCCT), 2015-02-01
DOI: 10.1109/ICCCT2.2015.7292780 (https://doi.org/10.1109/ICCCT2.2015.7292780)
Citations: 2
Abstract
Data streams are generated everywhere at ever-increasing speed. Efficient stream processing systems and algorithms are needed to mine all items of these streams and to predict knowledge accurately in limited time. Existing approaches rely on techniques such as one-pass processing, sampling, and load shedding, all of which sacrifice accuracy. Other approaches use distributed, grid, and cloud computing technologies to deal with these challenges. This paper proposes a new approach that reduces the overhead of reprocessing items that have already been processed. In this approach, a central system called the model aggregator pulls the learnt knowledge from all the stream processing systems, combines it, and pushes the combined knowledge to all the cloud processing systems at a set time interval. With this combined knowledge, the overhead on the participating stream processing systems is reduced, which increases their availability to handle additional streams. In addition, because cloud systems can be provisioned in advance or on demand when peak streaming occurs, window dropping can be avoided.
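
The pull, combine, and push cycle described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class names `StreamWorker` and `ModelAggregator`, the `sync` method, and the use of simple item counts as the "learnt knowledge" are all assumptions made for illustration, and the abstract does not specify how knowledge is represented or combined. In a real deployment `sync` would be triggered on a timer; here it is called once by hand.

```python
from collections import Counter


class StreamWorker:
    """Hypothetical stream processing system that learns item counts from its own stream."""

    def __init__(self, name):
        self.name = name
        self.local = Counter()   # knowledge learnt since the last pull
        self.shared = Counter()  # combined knowledge pushed by the aggregator

    def process(self, items):
        # Learn from newly arrived stream items.
        for item in items:
            self.local[item] += 1

    def pull_knowledge(self):
        # Hand over local knowledge and reset it so it is not counted twice.
        learnt, self.local = self.local, Counter()
        return learnt

    def receive(self, combined):
        # Store a private copy of the combined knowledge.
        self.shared = Counter(combined)

    def estimate(self, item):
        # Combined knowledge plus anything learnt locally since the last sync.
        return self.shared[item] + self.local[item]


class ModelAggregator:
    """Hypothetical central system that merges knowledge from all workers."""

    def __init__(self, workers):
        self.workers = workers
        self.combined = Counter()

    def sync(self):
        # Pull: collect what each worker has learnt since the last sync.
        for worker in self.workers:
            self.combined.update(worker.pull_knowledge())
        # Push: send the combined knowledge back to every worker.
        for worker in self.workers:
            worker.receive(self.combined)


workers = [StreamWorker("a"), StreamWorker("b")]
workers[0].process(["x", "y", "x"])
workers[1].process(["y", "z"])

aggregator = ModelAggregator(workers)
aggregator.sync()
```

After `sync`, worker `b` can answer questions about item `x` even though it never saw that item in its own stream, which is the sense in which shared knowledge spares each system from reprocessing what another has already handled.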