Big Data retrieval techniques based on Hash Indexing and MapReduce approach with NoSQL Database
N. Gayathiri, D. D. Jaspher, A. Natarajan
2019 International Conference on Advances in Computing and Communication Engineering (ICACCE), published 2019-04-01
DOI: 10.1109/ICACCE46606.2019.9079964 (https://doi.org/10.1109/ICACCE46606.2019.9079964)
Citations: 1
Abstract
As data volumes grow day by day, storing, sorting, and quickly accessing the data become challenging. To overcome these challenges, indexing of Big Data has become predominant, so that the data can be ordered, addressed, and located easily. Although there are many techniques to index and map data, each has its own advantages and performance issues across various kinds of data. Two techniques for Big Data retrieval are compared using a NoSQL database: MapReduce, a way of condensing a huge collection into useful aggregate values, and hash indexing, a method of generating a key and storing the values of the tuples so that the data are addressed by the generated key. The analysis examines retrieval efficiency both when data of varying size are retrieved from the whole dataset and when the data to be retrieved are limited by predicates in search queries. The comparison is made using both singleton and distributed deployments of the NoSQL database MongoDB.
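The two retrieval styles named in the abstract can be illustrated with a minimal in-memory sketch. This is not the authors' implementation and does not use MongoDB. The sample records, field names, and the helpers `build_hash_index`, `lookup`, and `map_reduce` are all hypothetical, chosen only to show how a key-addressed lookup differs from a map-and-reduce aggregation.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical sample records; field names are illustrative, not from the paper.
records = [
    {"id": 1, "city": "Chennai", "amount": 120},
    {"id": 2, "city": "Madurai", "amount": 80},
    {"id": 3, "city": "Chennai", "amount": 50},
    {"id": 4, "city": "Salem", "amount": 200},
]


def build_hash_index(rows, field):
    """Map each value of `field` to the positions of the rows holding it."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[field]].append(pos)
    return index


def lookup(rows, index, value):
    """Answer an equality predicate through the index, without a full scan."""
    return [rows[pos] for pos in index.get(value, [])]


def map_reduce(rows, mapper, reducer):
    """Group the (key, value) pairs emitted by `mapper`, then fold each group."""
    groups = defaultdict(list)
    for row in rows:
        for key, value in mapper(row):
            groups[key].append(value)
    return {key: reduce(reducer, values) for key, values in groups.items()}


# Hash-index retrieval: fetch only the rows matching a predicate on `city`.
city_index = build_hash_index(records, "city")
chennai_rows = lookup(records, city_index, "Chennai")

# MapReduce retrieval: condense the whole collection into per-city totals.
totals = map_reduce(
    records,
    mapper=lambda r: [(r["city"], r["amount"])],
    reducer=lambda a, b: a + b,
)
```

In this sketch the index lookup touches only the matching rows, while the map-reduce pass visits every record once to produce aggregate values. That is roughly the trade-off the abstract sets out to measure, although real costs in MongoDB depend on the deployment and the query.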