Title: Hadoop Approach to Cluster Based Cache Oblivious Peano Curves
Authors: Gurpinder Kaur, Sachin Bagga, K. Mann
Published in: 2017 IEEE 7th International Advance Computing Conference (IACC)
DOI: https://doi.org/10.1109/IACC.2017.0037
Citations: 1
Abstract
Hadoop is one of the most popular technologies in the big data landscape, processing data through the Hadoop Distributed File System (HDFS) and MapReduce. Large problems are increasingly difficult to handle on a single system because their execution time on such a platform is very high. Processing tasks in parallel through MapReduce, rather than sequentially, can be expected to give better efficiency. In the present method, the Map task first decomposes the input into intermediate keys, which are then sent to the Reduce function for processing. The algorithm used for matrix multiplication is cache oblivious, so that it makes better use of the memory hierarchy. The cache-oblivious approach increases the reuse of matrix elements and thus decreases the overall execution time. The proposed matrix multiplication is fault tolerant because the data are replicated on three different data nodes.
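
The map and reduce roles described in the abstract can be illustrated with a small in-memory sketch. It is not the authors' implementation and does not use the Hadoop API: the function names (`co_matmul`, `map_phase`, `reduce_phase`, `mapreduce_matmul`) are hypothetical, the shuffle is simulated with a dictionary, and the cache-oblivious step is a plain quadrant recursion rather than the Peano-curve ordering named in the title. It assumes square matrices whose size is a multiple of the block size `b`, with `b` a power of two.

```python
from collections import defaultdict


def co_matmul(A, B, C, n, ar=0, ac=0, br=0, bc=0, cr=0, cc=0):
    """Accumulate the n x n product of a sub-block of A and a sub-block of B into C.

    Quadrant recursion: no cache size appears anywhere, which is what
    "cache oblivious" means. Assumes n is a power of two.
    """
    if n == 1:
        C[cr][cc] += A[ar][ac] * B[br][bc]
        return
    h = n // 2
    for i in (0, h):
        for j in (0, h):
            for k in (0, h):
                # C[i][j] quadrant += A[i][k] quadrant * B[k][j] quadrant
                co_matmul(A, B, C, h, ar + i, ac + k, br + k, bc + j, cr + i, cc + j)


def split_blocks(M, b):
    """Yield (block_row, block_col, block) for every b x b block of M."""
    n = len(M)
    for r in range(0, n, b):
        for c in range(0, n, b):
            yield r // b, c // b, [row[c:c + b] for row in M[r:r + b]]


def map_phase(A, B, b):
    """Emit intermediate (key, value) pairs keyed by the output block (I, J)."""
    nb = len(A) // b
    for I, K, blk in split_blocks(A, b):
        for J in range(nb):
            yield (I, J), ("A", K, blk)
    for K, J, blk in split_blocks(B, b):
        for I in range(nb):
            yield (I, J), ("B", K, blk)


def reduce_phase(values, b):
    """Multiply matching A and B blocks and sum them into one output block."""
    a_blocks = {k: blk for tag, k, blk in values if tag == "A"}
    b_blocks = {k: blk for tag, k, blk in values if tag == "B"}
    out = [[0] * b for _ in range(b)]
    for k in a_blocks:
        co_matmul(a_blocks[k], b_blocks[k], out, b)
    return out


def mapreduce_matmul(A, B, b):
    """Simulate map, shuffle and reduce in memory and assemble the result."""
    n = len(A)
    groups = defaultdict(list)  # stands in for the shuffle step
    for key, value in map_phase(A, B, b):
        groups[key].append(value)
    C = [[0] * n for _ in range(n)]
    for (I, J), values in groups.items():
        blk = reduce_phase(values, b)
        for r in range(b):
            C[I * b + r][J * b:(J + 1) * b] = blk[r]
    return C
```

Calling `mapreduce_matmul(A, B, 2)` on two 4 x 4 lists of lists returns their product as a new list of lists. On a real cluster each reduce key would be handled by a separate task, and the replication mentioned in the abstract would be provided by HDFS rather than by this code.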