Bloom filter and its variants for the optimization of MapReduce’s algorithms: A review

F. Ezzaki, N. Abghour, A. Elomri, K. Moussaid, M. Rida
Published in: 2020 5th International Conference on Cloud Computing and Artificial Intelligence: Technologies and Applications (CloudTech)
Publication date: 2020-11-24
DOI: 10.1109/CloudTech49835.2020.9365876 (https://doi.org/10.1109/CloudTech49835.2020.9365876)
Citations: 0

Abstract

Bloom filter and its variants for the optimization of MapReduce’s algorithms: A review
The Bloom filter is a probabilistic data structure used to test whether an element belongs to a set: for any given item, the filter answers a membership query. Its simplicity and efficiency make it a space- and time-efficient randomized data structure that represents data compactly in many fields, filters out redundant data, and reduces memory consumption. However, standard Bloom filters are limited to membership tests and do not support the deletion of elements. Because they are probabilistic, they also produce false positives: an element that does not belong to the set may be reported as a member. Our goal is to compare a number of existing algorithms related to the Bloom filter, in preparation for future work on optimizing join algorithms in MapReduce. This paper provides an overview of the different variants of the Bloom filter and analyses the studies carried out in this area of research.
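
The behaviour described in the abstract (membership tests only, no deletion, occasional false positives) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `BloomFilter`, the parameters `m` (number of bits) and `k` (number of hash functions), the use of salted SHA-256 as the hash family, and the helper `expected_fp_rate` are all assumptions chosen for illustration. The helper uses the standard textbook approximation for the false positive rate after `n` insertions.

```python
import hashlib
import math


class BloomFilter:
    """Minimal Bloom filter with m bits and k hash functions (illustrative only)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)  # bit array packed into bytes

    def _positions(self, item: str):
        # Assumption: derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        # Insertion only sets bits; there is no way to unset them safely,
        # which is why the basic structure cannot delete elements.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def expected_fp_rate(m: int, k: int, n: int) -> float:
    """Standard approximation (1 - e^(-k*n/m))^k after n insertions."""
    return (1.0 - math.exp(-k * n / m)) ** k


bf = BloomFilter(m=1024, k=3)
for key in ("alice", "bob", "carol"):
    bf.add(key)

print("alice" in bf)  # True: inserted keys are always reported present
print("dave" in bf)   # usually False, but a false positive is possible
print(expected_fp_rate(m=1024, k=3, n=3))
```

Because a lookup can return True for an element that was never inserted but never returns False for one that was, a filter like this is typically used as a cheap pre-check before a more expensive exact lookup.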