MDMP: A new algorithm to create inverted index files in BigData, using MapReduce
Ahmad Arab, S. Abrishami
2017 7th International Conference on Computer and Knowledge Engineering (ICCKE), 2017-10-01
DOI: 10.1109/ICCKE.2017.8167907
https://doi.org/10.1109/ICCKE.2017.8167907
Citations: 3
Abstract
Generating inverted index files has always been a fundamental problem in information retrieval (IR), and it has now become the most challenging problem in the field. Search engines must produce these files continually in order to retrieve and search data accurately. The method used to produce inverted index files is therefore one of the main challenges in IR, and in fact the first one. The primary concern is generation time, followed by the volume of generated data. As the sources and the volume of data keep growing, big data concepts and their associated technologies become relevant to this problem. In this study, we present a new algorithm based on MapReduce, one approach to processing big data. Its main task is to generate inverted index files with reduced generation time and increased execution speed while using resources effectively.
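For readers unfamiliar with the setting, the following is a minimal single-process sketch of how an inverted index is commonly built in the MapReduce style. It is not the MDMP algorithm, whose details the abstract does not give. The function names (`map_phase`, `shuffle`, `reduce_phase`, `build_inverted_index`), the whitespace tokenizer, and the sample documents are all assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Emit one (term, doc_id) pair per token occurrence (hypothetical tokenizer)."""
    for token in text.lower().split():
        yield token, doc_id


def shuffle(pairs):
    """Group emitted doc ids by term, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for term, doc_id in pairs:
        grouped[term].append(doc_id)
    return grouped


def reduce_phase(term, doc_ids):
    """Collapse the doc ids for one term into a sorted, de-duplicated posting list."""
    return term, sorted(set(doc_ids))


def build_inverted_index(docs):
    """Run map, shuffle, and reduce in sequence over a dict of doc_id -> text."""
    pairs = (pair for doc_id, text in docs.items() for pair in map_phase(doc_id, text))
    return dict(reduce_phase(term, ids) for term, ids in shuffle(pairs).items())


# Illustrative input only; not data from the paper.
docs = {1: "big data index", 2: "inverted index files", 3: "big index"}
index = build_inverted_index(docs)
```

In a real cluster the map and reduce steps run in parallel across machines and the shuffle is handled by the framework; here all three run in one process so that the data flow is easy to follow.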