MDMP: A new algorithm to create inverted index files in BigData, using MapReduce

Ahmad Arab, S. Abrishami
2017 7th International Conference on Computer and Knowledge Engineering (ICCKE), published 2017-10-01. DOI: 10.1109/ICCKE.2017.8167907 (https://doi.org/10.1109/ICCKE.2017.8167907)
Citations: 3

Abstract

Generating inverted index files has always been a fundamental problem in information retrieval (IR), and it has now become the most challenging problem in the field. Search engines must regenerate these files continually in order to retrieve results accurately and search data effectively. The method used to produce inverted index files is therefore one of the main challenges in IR and, in fact, the first one. The primary concern is generation time, followed by the volume of generated data. Because the sources and the volume of data keep growing, big data concepts and their associated technologies become relevant here. In this study, we present a new algorithm based on MapReduce, which is one way to process big data. Its main task is to generate inverted index files with reduced generation time and higher execution speed while using resources effectively.
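
The abstract does not describe how MDMP works internally, so the sketch below is not the authors' algorithm. It only illustrates the general pattern the abstract builds on: constructing an inverted index with a map step, a shuffle step, and a reduce step. Everything runs in memory in a single process, and all function names (`map_phase`, `shuffle`, `reduce_phase`, `build_inverted_index`) are hypothetical, chosen for illustration only.

```python
import re
from collections import defaultdict


def map_phase(doc_id, text):
    """Map step: emit one (term, doc_id) pair per distinct term in a document."""
    terms = set(re.findall(r"[a-z0-9]+", text.lower()))
    return [(term, doc_id) for term in terms]


def shuffle(pairs):
    """Shuffle step: group the emitted document ids by term."""
    grouped = defaultdict(list)
    for term, doc_id in pairs:
        grouped[term].append(doc_id)
    return grouped


def reduce_phase(term, doc_ids):
    """Reduce step: turn the grouped ids into a sorted, duplicate-free posting list."""
    return term, sorted(set(doc_ids))


def build_inverted_index(docs):
    """Run map, shuffle, and reduce over a {doc_id: text} mapping."""
    pairs = []
    for doc_id, text in docs.items():
        pairs.extend(map_phase(doc_id, text))
    return dict(reduce_phase(term, ids) for term, ids in shuffle(pairs).items())


if __name__ == "__main__":
    sample_docs = {1: "big data index", 2: "inverted index files"}
    for term, postings in sorted(build_inverted_index(sample_docs).items()):
        print(term, postings)
```

In a real MapReduce deployment the framework would perform the shuffle and distribute the map and reduce calls across machines. Generation time and output volume, the two concerns the abstract names, depend heavily on how that distribution is done.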