Sangchul Kim, Seoung Gook Sohn, Taehoon Kim, Jinseon Yu, Bogyeong Kim, Bongki Moon
Selective Scan for Filter Operator of SciDB. Proceedings of the 28th International Conference on Scientific and Statistical Database Management, 2016-07-18. https://doi.org/10.1145/2949689.2949707
Interest in analyzing scientific data generated by observations and experiments has been growing. To manage such data efficiently, SciDB, a multi-dimensional array-based DBMS, has been proposed. When SciDB processes a query with WHERE predicates, it internally uses the filter operator to produce a result array that matches the predicates. Most queries for scientific data analysis use spatial information. However, the filter operator of SciDB reads all data, without taking into account the features of array-based DBMSs or the spatial information in the query. In this demo, we present an efficient query processing scheme that exploits the characteristics of array-based data by using coordinates. It uses a selective scan that retrieves only the data in the coordinate range that satisfies the given conditions. In our experiments, the selective scan is up to 30x faster than the original scan. We demonstrate that our implementation of the filter operator significantly reduces the processing time of a selection query and enables SciDB to handle massive amounts of scientific data in a more scalable manner.
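The difference between a full scan and a selective scan can be illustrated with a small sketch. This is not SciDB code and not the authors' implementation: the chunk size `CHUNK`, the dictionary-based chunk store, and the function names `build_chunks`, `full_scan`, and `selective_scan` are all assumptions made for illustration. It only shows the general idea that a coordinate range predicate lets the scan skip chunks whose extent cannot overlap the range.

```python
from itertools import product

CHUNK = 4  # hypothetical chunk edge length, chosen for illustration


def in_range(cell, lo, hi):
    """True if the cell's (row, col) lies inside the inclusive box [lo, hi]."""
    r, c, _ = cell
    return lo[0] <= r <= hi[0] and lo[1] <= c <= hi[1]


def build_chunks(n):
    """Split an n x n array (value = row * n + col) into CHUNK x CHUNK chunks."""
    chunks = {}
    for r, c in product(range(n), repeat=2):
        chunks.setdefault((r // CHUNK, c // CHUNK), []).append((r, c, r * n + c))
    return chunks


def full_scan(chunks, lo, hi):
    """Read every chunk, then test each cell against the coordinate range."""
    chunks_read = 0
    result = []
    for cells in chunks.values():
        chunks_read += 1
        result.extend(cell for cell in cells if in_range(cell, lo, hi))
    return sorted(result), chunks_read


def selective_scan(chunks, lo, hi):
    """Read only the chunks whose coordinate extent overlaps the range."""
    chunks_read = 0
    result = []
    chunk_rows = range(lo[0] // CHUNK, hi[0] // CHUNK + 1)
    chunk_cols = range(lo[1] // CHUNK, hi[1] // CHUNK + 1)
    for key in product(chunk_rows, chunk_cols):
        cells = chunks.get(key)
        if cells is None:  # range extends past the stored array
            continue
        chunks_read += 1
        # Boundary chunks still need a per-cell check.
        result.extend(cell for cell in cells if in_range(cell, lo, hi))
    return sorted(result), chunks_read


if __name__ == "__main__":
    chunks = build_chunks(16)
    lo, hi = (2, 2), (5, 5)
    full, n_full = full_scan(chunks, lo, hi)
    sel, n_sel = selective_scan(chunks, lo, hi)
    print("same result:", full == sel)
    print("chunks read (full, selective):", n_full, n_sel)
```

Both functions return the same cells; they differ only in how many chunks they touch. In this toy setting the saving grows as the query range shrinks relative to the array, which is the effect the abstract attributes to the selective scan.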