Comparative Study of Big Data Frameworks
H. K. Gupta, Dr. Rafat Parveen
2019 International Conference on Issues and Challenges in Intelligent Computing Techniques (ICICT), 2019-09-01
DOI: 10.1109/ICICT46931.2019.8977680
https://doi.org/10.1109/ICICT46931.2019.8977680
Citations: 22
Abstract
We live in an era of ever-growing data production. Huge volumes of data, on the order of terabytes and petabytes, are generated in the real world, and it is challenging to access, store, and analyze structured, unstructured, and semi-structured data that is heterogeneous and complex; moreover, traditional tools are not suited to distributed and real-time processing. We need an efficient framework for processing such heterogeneous data and transforming it into optimized, meaningful information. Many distributed computing frameworks have been developed to process huge amounts of data. Hadoop MapReduce is an extensively used framework because of its scalability, security, latency, efficiency, and reliability. The intention of this paper is a comparative study of common frameworks such as Hadoop, Spark, Flink, Samza, and Storm.