{"title":"基于Persistent MapReduce技术的高效批量处理相关大数据任务","authors":"R. K. Sidhu, Charanjiv Singh Saroa","doi":"10.1145/2983402.2983431","DOIUrl":null,"url":null,"abstract":"The data generated by today's enterprises has been increasing at exponential rates in size from most recent couple of years. Also, the need to process and break down the substantial volumes of data has likewise expanded. In order to handle this enormous amount of data and to analyze the same, an open-source usage of Apache system, Hadoop is utilized now-a-days. Hadoop presented a utility computing model which offer replacement of traditional databases and processing techniques. Scalability and high availability of MapReduce makes it the first choice for big data analysis. This paper provides a brief introduction to HDFS and MapReduce. After studying them in detail, it later made to work on related tasks and store the cached result of mapper function which can be used as an input for general reducers. By this additional triggering agent, we were able to achieve the analysis result in approximately half the actual time.","PeriodicalId":283626,"journal":{"name":"Proceedings of the Third International Symposium on Computer Vision and the Internet","volume":"28 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Efficient Batch Processing of Related Big Data Tasks using Persistent MapReduce Technique\",\"authors\":\"R. K. Sidhu, Charanjiv Singh Saroa\",\"doi\":\"10.1145/2983402.2983431\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The data generated by today's enterprises has been increasing at exponential rates in size from most recent couple of years. Also, the need to process and break down the substantial volumes of data has likewise expanded. In order to handle this enormous amount of data and to analyze the same, an open-source usage of Apache system, Hadoop is utilized now-a-days. Hadoop presented a utility computing model which offer replacement of traditional databases and processing techniques. Scalability and high availability of MapReduce makes it the first choice for big data analysis. This paper provides a brief introduction to HDFS and MapReduce. After studying them in detail, it later made to work on related tasks and store the cached result of mapper function which can be used as an input for general reducers. 
By this additional triggering agent, we were able to achieve the analysis result in approximately half the actual time.\",\"PeriodicalId\":283626,\"journal\":{\"name\":\"Proceedings of the Third International Symposium on Computer Vision and the Internet\",\"volume\":\"28 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-09-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Third International Symposium on Computer Vision and the Internet\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2983402.2983431\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Third International Symposium on Computer Vision and the Internet","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2983402.2983431","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Efficient Batch Processing of Related Big Data Tasks using Persistent MapReduce Technique
The volume of data generated by today's enterprises has been growing at an exponential rate over the past few years, and the need to process and analyze these substantial volumes of data has expanded along with it. To handle and analyze this enormous amount of data, Apache Hadoop, an open-source framework, is widely used today. Hadoop introduced a utility computing model that offers a replacement for traditional databases and processing techniques. The scalability and high availability of MapReduce make it the first choice for big data analysis. This paper provides a brief introduction to HDFS and MapReduce. Building on them, it extends the framework to handle related tasks by caching the mapper output so that it can be reused as input for multiple general reducers. With this additional triggering agent, we were able to obtain the analysis results in approximately half the original time.
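To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of how a shared mapper's output can be persisted to HDFS by a map-only job and then reused as the input of a separate reduce-only job, written against the standard Hadoop MapReduce Java API. The class names and path arguments are illustrative assumptions.

// Sketch: persist mapper output once, reuse it for later reduce-only jobs.
// Class names and paths are hypothetical; only standard Hadoop APIs are used.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CachedMapperPipeline {

  /** Shared map step: tokenize each input line and emit (word, 1). */
  public static class TokenMapper
      extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context ctx)
        throws IOException, InterruptedException {
      for (String token : value.toString().split("\\s+")) {
        if (!token.isEmpty()) {
          word.set(token);
          ctx.write(word, ONE);
        }
      }
    }
  }

  /** One of potentially many reducers that consume the cached map output. */
  public static class SumReducer
      extends Reducer<Text, Text, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context ctx)
        throws IOException, InterruptedException {
      int sum = 0;
      for (Text v : values) {
        sum += Integer.parseInt(v.toString());
      }
      ctx.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path rawInput  = new Path(args[0]);   // original data set
    Path cachedMap = new Path(args[1]);   // persisted mapper output
    Path result    = new Path(args[2]);   // final reducer output

    // Job 1: map-only job that writes the mapper output to HDFS as-is.
    Job mapOnly = Job.getInstance(conf, "persist-mapper-output");
    mapOnly.setJarByClass(CachedMapperPipeline.class);
    mapOnly.setMapperClass(TokenMapper.class);
    mapOnly.setNumReduceTasks(0);         // skip reduce: cache raw map output
    mapOnly.setOutputKeyClass(Text.class);
    mapOnly.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(mapOnly, rawInput);
    FileOutputFormat.setOutputPath(mapOnly, cachedMap);
    if (!mapOnly.waitForCompletion(true)) System.exit(1);

    // Job 2 (and any later related task): read the cached "word<TAB>1" lines
    // back as key/value text and run only the reduce phase on them.
    Job reduceOnly = Job.getInstance(conf, "sum-from-cached-map");
    reduceOnly.setJarByClass(CachedMapperPipeline.class);
    reduceOnly.setInputFormatClass(KeyValueTextInputFormat.class);
    reduceOnly.setMapperClass(Mapper.class);   // identity mapper
    reduceOnly.setReducerClass(SumReducer.class);
    reduceOnly.setMapOutputKeyClass(Text.class);
    reduceOnly.setMapOutputValueClass(Text.class);
    reduceOnly.setOutputKeyClass(Text.class);
    reduceOnly.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(reduceOnly, cachedMap);
    FileOutputFormat.setOutputPath(reduceOnly, result);
    System.exit(reduceOnly.waitForCompletion(true) ? 0 : 1);
  }
}

Any further related task can point another reduce-only job at the same cached path, so the map phase over the raw data is paid only once across the batch of related analyses, which is the intuition behind the roughly halved processing time reported in the abstract.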