Fault tolerant data flow using curator — Storm
Authors: Lavanya Sainik, Dheeraj Khajuria
Published in: 2014 IEEE 5th International Conference on Software Engineering and Service Science, pp. 472-475
Publication date: 2014-06-27
DOI: 10.1109/ICSESS.2014.6933608 (https://doi.org/10.1109/ICSESS.2014.6933608)
Citations: 3
Abstract
Driven by the evolving 3GPP (3rd Generation Partnership Project) standards and the advent of Big Data technology, industries such as telecommunications, warehousing and storage, and finance must adapt to handle data of large volume, velocity, and variety. There is strong demand for processing both real-time and stored data. In this paper we analyze Storm, an open source real-time distributed processing engine, and suggest an improvement to its fault tolerance mechanism so that it can be used reliably for any data processing use case. Vanilla Storm guarantees message processing, but only at the “at least once” level. A fully fault-tolerant system requires “exactly once” processing, and to achieve this Storm provides Trident, a framework built on top of it. Trident provides a transactional spout whose transactional metadata <transaction id, data> is stored in ZooKeeper, which provides distributed coordination, so data can be replayed across nodes and hardware after a failure, timeout, or retry. Trident coordinates this transactional information in ZooKeeper through the Apache Curator framework. However, while the current Trident framework makes a commit at the level of each activity (aggregator/reducer) easy to obtain, it has no direct implementation of a transaction commit at the level of a single chain. This paper describes an approach in which chain-level commit is obtained by modifying the existing transactional Trident using Curator recipes.
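The difference between “at least once” and “exactly once” processing described above can be illustrated with a minimal sketch. It is not the chain-level commit proposed in the paper, and it does not use the Storm, Trident, or Curator APIs. It assumes that each batch is stored under its transaction id, that a replay delivers the same data under the same id, and that batches are applied in increasing transaction id order. The names `TxStore`, `CountState`, `naive_apply`, and `run_demo` are hypothetical, and an in-memory dictionary stands in for ZooKeeper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TxStore:
    """In-memory stand-in for the coordination store (ZooKeeper in the paper)."""
    batches: Dict[int, List[int]] = field(default_factory=dict)

    def save_batch(self, txid: int, data: List[int]) -> None:
        # Keep the first batch recorded for a txid so replays see identical data.
        self.batches.setdefault(txid, list(data))

    def load_batch(self, txid: int) -> List[int]:
        return self.batches[txid]


@dataclass
class CountState:
    """Running sum stamped with the id of the last transaction applied."""
    total: int = 0
    last_txid: int = -1

    def apply(self, txid: int, batch: List[int]) -> bool:
        # A replayed batch carries a txid that was already applied: skip it.
        if txid <= self.last_txid:
            return False
        self.total += sum(batch)
        self.last_txid = txid
        return True


def naive_apply(total: int, batch: List[int]) -> int:
    """At-least-once update: every delivery, including replays, is counted."""
    return total + sum(batch)


def run_demo() -> Tuple[int, int]:
    store = TxStore()
    store.save_batch(1, [1, 2, 3])
    store.save_batch(2, [10])

    # Delivery order with batch 1 replayed, for example after a timeout.
    deliveries = [1, 1, 2]

    naive_total = 0
    state = CountState()
    for txid in deliveries:
        batch = store.load_batch(txid)
        naive_total = naive_apply(naive_total, batch)
        state.apply(txid, batch)
    return naive_total, state.total


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the replayed batch is counted twice by the naive update and once by the state that records the transaction id, which is the effect that storing <transaction id, data> is meant to provide.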