Learned Bloom Filters in Adversarial Environments: A Malicious URL Detection Use-Case
P. Reviriego, José Alberto Hernández, Zhenwei Dai, Anshumali Shrivastava
2021 IEEE 22nd International Conference on High Performance Switching and Routing (HPSR)
Published: 2021-06-07
DOI: 10.1109/HPSR52026.2021.9481857
Citations: 2
Abstract
Learned Bloom Filters (LBFs) have recently been proposed as an alternative to traditional Bloom filters that can reduce the amount of memory needed to achieve a target false positive probability when representing a given set of elements. LBFs combine Machine Learning models with traditional Bloom filters. However, if LBFs are to be used as an alternative to Bloom filters, their security must also be considered. In this paper, the security of LBFs is studied for the first time and a vulnerability different from those of traditional Bloom filters is uncovered. In more detail, an attacker can easily construct a set of elements that are not in the filter but have a much larger false positive probability than the target for which the filter was designed. The constructed attack set can then be used, for example, to launch a denial of service attack against the system that uses the LBF. A malicious URL case study illustrates the proposed attacks and shows their effectiveness in increasing the false positive probability of LBFs. The dataset under consideration includes nearly 485K URLs, of which 16.47% are malicious. Unfortunately, it seems that mitigating this vulnerability is not straightforward.
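The structure the abstract describes can be sketched in a few lines of Python. This is a minimal illustrative model, not the paper's implementation: the `BloomFilter`, `LearnedBloomFilter`, and `toy_model` names are hypothetical, and the toy scoring rule (flagging any URL containing the substring "evil") merely stands in for a trained classifier. It shows the standard LBF layout (model score checked against a threshold, with a backup Bloom filter holding the model's false negatives) and why the model path creates the vulnerability: any non-member input the model scores above the threshold is a guaranteed false positive, regardless of the backup filter.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: k hash positions over an m-slot array."""

    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = bytearray(m)

    def _positions(self, item):
        # Derive k positions by hashing the item with k different salts.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


class LearnedBloomFilter:
    """Hypothetical LBF sketch: if the model scores an item at or above
    the threshold, report positive; otherwise fall back to a backup
    Bloom filter that stores the members the model would have missed."""

    def __init__(self, model, threshold, keys, m=4096, k=3):
        self.model, self.threshold = model, threshold
        self.backup = BloomFilter(m, k)
        for key in keys:
            if model(key) < threshold:  # model false negative on a member
                self.backup.add(key)

    def __contains__(self, item):
        if self.model(item) >= self.threshold:
            return True  # model path: may be a false positive
        return item in self.backup  # backup path: Bloom-filter guarantees


# Toy stand-in for a trained malicious-URL classifier (assumption).
def toy_model(url):
    return 0.9 if "evil" in url else 0.1


malicious = ["evil.example/a", "benign-looking.example/b"]
lbf = LearnedBloomFilter(toy_model, threshold=0.5, keys=malicious)

assert "evil.example/a" in lbf            # accepted via the model path
assert "benign-looking.example/b" in lbf  # caught by the backup filter

# The attack surface: any non-member the model scores highly is a
# guaranteed false positive, so an attacker who can probe the model
# can mass-produce such elements for a denial-of-service attack.
assert "evil.example/not-in-set" in lbf
```

Unlike a traditional Bloom filter, where false positives depend on hash collisions the attacker cannot easily steer, here the false positives are correlated with the model's decision boundary, which is exactly what the abstract's attack exploits.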