Alexandru Iacob, L. Itu, L. Sasu, F. Moldoveanu, C. Suciu
2015 19th International Conference on System Theory, Control and Computing (ICSTCC). Published 2015-11-09. DOI: 10.1109/ICSTCC.2015.7321404 (https://doi.org/10.1109/ICSTCC.2015.7321404)
GPU accelerated information retrieval using Bloom filters
Information retrieval is a technique used in search engines, advertisement placement and cognitive databases. With increasing amounts of data and stringent response time requirements, improving the underlying implementation of document retrieval becomes critical. To this end, we consider the Bloom filter, a simple randomized data structure that answers membership queries with no false negatives and a customizable false positive probability. We focus mainly on speeding up the algorithm through a Graphics Processing Unit (GPU) based implementation. Starting from a regular CPU implementation of the Bloom filter algorithm, we apply different optimization techniques to the two basic Bloom filter operations: mapping and querying. A substantial speed-up is achieved for both operations: over 300x for mapping and over 20x for querying. Furthermore, we show that the number of hash functions used during the mapping operation, the number of files, and the number of query words have a significant effect on the execution time and the speed-up.
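To make the two operations concrete, the following is a minimal CPU-side sketch in Python, not the GPU implementation evaluated in the paper. Several details are assumptions made only for illustration: one filter per file, hash positions derived by double hashing from a SHA-256 digest, the names `BloomFilter`, `map_word`, `query`, `search` and `expected_false_positive_rate`, and the toy corpus. The false positive estimate uses the standard approximation (1 - e^(-kn/m))^k for m bits, k hash functions and n mapped words.

```python
import hashlib
import math


class BloomFilter:
    """CPU reference sketch of a Bloom filter over words (illustrative only)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, word):
        # Assumption: double hashing over one SHA-256 digest stands in for
        # the k hash functions; the paper's hash family is not specified here.
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def map_word(self, word):
        """Mapping: set the k bits selected for this word."""
        for pos in self._positions(word):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def query(self, word):
        """Querying: True = possibly present, False = definitely absent."""
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(word)
        )


def expected_false_positive_rate(num_bits, num_hashes, num_items):
    """Standard approximation (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


# Hypothetical toy corpus: one filter per file (an assumption of this sketch).
documents = {
    "a.txt": "bloom filters answer membership queries",
    "b.txt": "graphics processing units run many threads",
}

filters = {}
for name, text in documents.items():
    bf = BloomFilter(num_bits=1024, num_hashes=4)
    for word in text.split():
        bf.map_word(word)
    filters[name] = bf


def search(query_words):
    """Return the files whose filter reports every query word as possibly present."""
    return sorted(
        name
        for name, bf in filters.items()
        if all(bf.query(w) for w in query_words)
    )


if __name__ == "__main__":
    print(search(["bloom", "membership"]))
    print(expected_false_positive_rate(1024, 4, 5))
```

In this sketch every word and every file is processed independently, which is the property a GPU version can exploit by assigning words or files to separate threads. The cost of both operations grows with the number of hash functions, files and query words, matching the parameters the abstract identifies as significant.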