Delayed Insertion Strategies in Dynamic Metric Indexes
Edgar Chávez, Nora Reyes, Patricia Roggero
2009 International Conference of the Chilean Computer Science Society, published 2009-11-10
DOI: 10.1109/SCCC.2009.23 (https://doi.org/10.1109/SCCC.2009.23)
Dynamic data structures, particularly tree-based ones, are sensitive to insertion order. In this paper we present a buffering heuristic for hierarchical indexes that delays root selection until enough data has arrived to yield valid statistics. Initially, while fewer than $M$ objects have been inserted, queries are answered from the buffer itself using an online-friendly algorithm, which can be simulated by AESA (Approximating and Eliminating Search Algorithm) or implemented with the dynamic data structure being optimized. When the buffer is full, the tree root can be selected in a more informed way using the distances between the $M$ objects in the buffer. Buffering has an additional use: multiple routing strategies can be designed depending on statistics of the query. A complete picture of the technique includes a recursive best-root selection with many more parameters. We focus on the Dynamic Spatial Approximation Tree (DSAT), investigating the improvement obtained in the first level of the tree (the root and its children). Note that if the buffering strategy is repeated recursively, performance can be boosted once the data structure reaches a stable state. For this reason, even a very small improvement in performance is significant. Using our buffering strategies, we show a systematic improvement in query complexity on several real time, publicly available data sets from the SISAP repository.
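
The buffering idea described above can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not state which root-selection criterion is used, so the sketch assumes, purely for illustration, that the root is the medoid of the buffer (the object with the smallest total distance to the others). Queries during the buffering phase are answered by a plain linear scan, standing in for AESA or the index itself. The names `DelayedRootBuffer`, `select_root`, and `range_query` are hypothetical.

```python
from typing import Callable, Generic, List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")
Distance = Callable[[T, T], float]


def select_root(objects: Sequence[T], dist: Distance) -> int:
    """Return the index of the medoid of `objects`.

    ASSUMPTION: the medoid criterion is illustrative only; the abstract
    says the root is chosen from the pairwise distances but not how.
    """
    n = len(objects)
    # Pairwise distance matrix; the metric is symmetric, so each pair
    # is evaluated once.
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = dist(objects[i], objects[j])
    # Ties are broken in favour of the earliest inserted object.
    return min(range(n), key=lambda i: sum(d[i]))


class DelayedRootBuffer(Generic[T]):
    """Holds the first `capacity` (M) objects before a root is chosen."""

    def __init__(self, capacity: int, dist: Distance) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.dist = dist
        self.buffer: List[T] = []
        self.root: Optional[T] = None

    def insert(self, obj: T) -> Optional[Tuple[T, List[T]]]:
        """Buffer `obj`. Once M objects have arrived, return
        (root, remaining objects in arrival order); otherwise None."""
        if self.root is not None:
            raise RuntimeError("root already selected; insert into the tree")
        self.buffer.append(obj)
        if len(self.buffer) < self.capacity:
            return None
        idx = select_root(self.buffer, self.dist)
        self.root = self.buffer[idx]
        rest = self.buffer[:idx] + self.buffer[idx + 1:]
        self.buffer = []
        return self.root, rest

    def range_query(self, q: T, radius: float) -> List[T]:
        """Answer a range query from the buffer by linear scan
        (a simple stand-in for AESA, not AESA itself)."""
        return [o for o in self.buffer if self.dist(q, o) <= radius]


if __name__ == "__main__":
    buf = DelayedRootBuffer(5, lambda a, b: abs(a - b))
    for x in (10, 1, 5):
        buf.insert(x)
    print(buf.range_query(4, 1.5))   # [5]
    buf.insert(4)
    print(buf.insert(6))             # (5, [10, 1, 4, 6])
```

In this toy run on one-dimensional points the first object to arrive (10) would have become the root under plain insertion, whereas the buffered choice picks 5. The caller would then build the tree from the returned root and insert the remaining objects into it.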