Improving the Querying Efficiency of the PLWAH Bitmap Algorithm
Benjamin Taufen, Jason Sawin, David Chiu
Proceedings of the 21st International Database Engineering & Applications Symposium, 2017-07-12
DOI: 10.1145/3105831.3105868 (https://doi.org/10.1145/3105831.3105868)
Bitmap indices are commonly used for accessing large, read-only data. A bitmap is a simplified model of the underlying data in secondary storage. Its coarse representation enables the use of fast CPU operations to answer common database queries. Additionally, bitmaps are highly compressible. Several known compression algorithms allow the compressed form of the bitmap to be queried directly, one of which is Position List Word-Aligned Hybrid (PLWAH). PLWAH is a modified hybrid run-length encoding scheme that can achieve better compression than traditional schemes such as Word-Aligned Hybrid (WAH). This improved compression comes at an increased query processing cost, which we address in this paper. We present a technique that uses metadata to allow PLWAH's query algorithm to exploit logical short-circuiting opportunities, reducing the cost of certain queries. In our empirical study, we found that our approach achieved an average speedup of 1.41x over PLWAH for real scientific data sets. For specific queries, our approach realized speedups as high as 8000x.
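The idea of metadata-driven short-circuiting can be illustrated with a minimal Python sketch. This is not the paper's implementation: the abstract does not say which metadata are stored, so the `all_zeros` and `all_ones` flags, the `Bitmap` class, the `build` and `bitmap_and` helpers, and the use of uncompressed 32-bit words are all assumptions made purely for illustration. The sketch only shows why knowing a little about an operand in advance can let an AND query skip the word-by-word scan entirely.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

WORD_BITS = 32                      # assumed word size for this toy example
FULL = (1 << WORD_BITS) - 1         # a word with every bit set


@dataclass(frozen=True)
class Bitmap:
    words: Tuple[int, ...]          # uncompressed words (a real index would be compressed)
    all_zeros: bool                 # hypothetical metadata: no bit is set
    all_ones: bool                  # hypothetical metadata: every bit is set


def build(words: Iterable[int]) -> Bitmap:
    """Create a bitmap and compute its metadata once, at index-build time."""
    ws = tuple(words)
    return Bitmap(ws, all(w == 0 for w in ws), all(w == FULL for w in ws))


def bitmap_and(a: Bitmap, b: Bitmap) -> Tuple[Bitmap, int]:
    """AND two bitmaps; return the result and the number of word pairs scanned."""
    assert len(a.words) == len(b.words), "bitmaps must cover the same rows"
    # Short-circuit: x AND 0 = 0, so an all-zero operand is already the answer.
    if a.all_zeros:
        return a, 0
    if b.all_zeros:
        return b, 0
    # Short-circuit: x AND 1 = x, so the other operand is already the answer.
    if a.all_ones:
        return b, 0
    if b.all_ones:
        return a, 0
    # General case: fall back to the full word-by-word scan.
    return build(x & y for x, y in zip(a.words, b.words)), len(a.words)


if __name__ == "__main__":
    empty = build([0, 0, 0])
    dense = build([5, FULL, 7])
    other = build([3, 0, 6])
    print(bitmap_and(empty, dense))   # no words scanned
    print(bitmap_and(dense, other))   # full scan of three word pairs
```

In this toy version the saving is only the skipped loop; with a compressed format such as PLWAH, the skipped work would also include decoding fill and literal words, which is presumably where the larger speedups for specific queries come from.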