A cost model for estimating the performance of spatial joins using R-trees
Authors: Yun-Wu Huang, N. Jing, Elke A. Rundensteiner
Published in: Proceedings. Ninth International Conference on Scientific and Statistical Database Management (Cat. No.97TB100150)
Publication date: 1997-08-11
DOI: 10.1109/SSDM.1997.621148 (https://doi.org/10.1109/SSDM.1997.621148)
Citations: 53
Abstract
The development of a cost model for predicting the performance of spatial joins has been identified in the literature as an important and difficult problem. The authors present the first cost model that can predict the performance of spatial joins using R-trees. Given two existing R-trees (the join targets), the model first estimates the expected number of I/Os for the join process under the assumption of a zero buffer size. This estimation extends the cost model for R-tree window queries, developed by Kamel and Faloutsos (1993) and by Pagel et al. (1993), to handle the more complex case of spatial joins. For spatial join processing, however, the zero-buffer expected I/O count alone is not a practical predictor of performance in a buffered environment. To model the impact of the buffer, the authors use an exponential distribution function to measure the probability that a bufferless I/O would cause a page fault in a buffered environment. From this probability and the zero-buffer expected I/O cost, the estimated number of I/Os for an R-tree join can then be computed. Comparisons between the cost model's predictions and actual results from experiments on real GIS maps show an average relative error of about 10%, with a maximum of about 20%, over a wide range of buffer sizes. The model is therefore a useful tool for optimizing spatial join queries.
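The two-stage structure described in the abstract (a zero-buffer I/O estimate, scaled by a page-fault probability) can be illustrated with a minimal Python sketch. The abstract does not give the actual formulas, so everything specific here is an assumption: the exponential form `exp(-decay * buffer_pages / total_pages)`, the `decay` parameter, and all function names are hypothetical placeholders chosen only so that a zero buffer reproduces the zero-buffer cost. The relative error helper uses the usual definition, |predicted - actual| / actual, which is also assumed rather than taken from the paper.

```python
import math


def page_fault_probability(buffer_pages: int, total_pages: int, decay: float = 1.0) -> float:
    """Probability that a bufferless I/O still causes a page fault with a buffer.

    ASSUMPTION: this exponential form and the `decay` parameter are
    illustrative placeholders, not the formula from the paper.
    """
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    if buffer_pages < 0:
        raise ValueError("buffer_pages must be non-negative")
    return math.exp(-decay * buffer_pages / total_pages)


def estimate_join_ios(zero_buffer_ios: float, buffer_pages: int,
                      total_pages: int, decay: float = 1.0) -> float:
    """Scale the zero-buffer expected I/O count by the page-fault probability."""
    return zero_buffer_ios * page_fault_probability(buffer_pages, total_pages, decay)


def relative_error(predicted: float, actual: float) -> float:
    """Relative error of a prediction against a measured value (assumed definition)."""
    if actual == 0:
        raise ValueError("actual must be non-zero")
    return abs(predicted - actual) / actual


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    zero_buffer = 1000.0   # expected I/Os with no buffer
    tree_pages = 500       # total pages across both R-trees
    for buf in (0, 50, 250):
        print(buf, estimate_join_ios(zero_buffer, buf, tree_pages))
```

With a buffer of zero pages the probability is 1, so the estimate equals the zero-buffer cost; larger buffers reduce the estimate monotonically. A faithful implementation would replace `page_fault_probability` with the distribution function defined in the paper itself.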