Zhen'an Zhang, Dongjie Zhang, Xiaopeng Yu, Jing Wang, Chunjiang He, Pingpeng Yuan, Hai Jin
Alovera: A Fast Stream Processing System for Large-Scale Data
2013 8th ChinaGrid Annual Conference, published 2013-08-22
DOI: 10.1109/CHINAGRID.2013.9 (https://doi.org/10.1109/CHINAGRID.2013.9)
Citation count: 1
Abstract
Growing data volumes pose challenges for data processing systems. This paper presents Alovera, a fast stream processing system for large-scale data. By combining a columnar data layout with stream processing, it pipelines data processing efficiently: each operation can process part of the data instead of waiting for all of the data to be ready before the next operation starts. This reduces query time dramatically. Experimental results indicate significant performance improvements across a variety of tasks. In the experiments, we also compare our methods against other systems, including HadoopDB and Hive. The extensive experiments confirm the efficiency and better performance of our system.
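
The pipelining idea described in the abstract can be illustrated with a minimal Python sketch. This is not Alovera's implementation, which the abstract does not detail; the function names (scan, filter_gt, running_sum), the batch size, and the sample table are all hypothetical. It only shows how operators over a columnar layout can each consume a chunk as soon as the previous operator produces it, rather than waiting for the full input.

```python
from typing import Dict, Iterator, List

# A batch is a columnar chunk: column name -> values for a few rows.
Batch = Dict[str, List[int]]


def scan(columns: Batch, batch_size: int) -> Iterator[Batch]:
    """Yield the column-oriented table one chunk of rows at a time."""
    n_rows = len(next(iter(columns.values())))
    for start in range(0, n_rows, batch_size):
        yield {name: vals[start:start + batch_size]
               for name, vals in columns.items()}


def filter_gt(batches: Iterator[Batch], column: str,
              threshold: int) -> Iterator[Batch]:
    """Keep rows whose `column` value exceeds `threshold`, chunk by chunk."""
    for batch in batches:
        keep = [i for i, v in enumerate(batch[column]) if v > threshold]
        yield {name: [vals[i] for i in keep] for name, vals in batch.items()}


def running_sum(batches: Iterator[Batch], column: str) -> Iterator[int]:
    """Emit a partial aggregate after every chunk instead of only at the end."""
    total = 0
    for batch in batches:
        total += sum(batch[column])
        yield total


# Hypothetical sample data stored column by column.
table = {"id": [1, 2, 3, 4, 5, 6], "price": [10, 25, 5, 40, 15, 30]}

# Generators chain the operators, so each chunk flows through the whole
# pipeline before the next chunk is scanned.
pipeline = running_sum(filter_gt(scan(table, 2), "price", 12), "price")
partials = list(pipeline)
print(partials)  # [25, 65, 110]
```

Because every stage is a generator, the first partial result is available after only the first chunk has been read, which is the property the abstract credits for the reduced query time.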