A new R-tree node splitting algorithm using MBR partition policy

2009 17th International Conference on Geoinformatics Pub Date : 2009-10-23 DOI:10.1109/GEOINFORMATICS.2009.5293260

Yan Liu, Jinyun Fang, Chengde Han


Citations: 12

Abstract

In this paper we introduce a new R-tree node splitting algorithm. As an indexing technique for multi-dimensional data, the R-tree is widely used in geographical information systems, CAD systems and spatial databases. An R-tree consists of nodes, which in turn consist of records. Each node must contain a limited number of records so that it can be stored within one disk block; a node splitting algorithm is therefore invoked when a new record is inserted into a full node. The node splitting algorithm is one of the crucial factors in the query performance of an R-tree, since bad splits result in an inefficient tree structure. We first give an efficient, linear node splitting algorithm that constructs an R-tree quickly by partitioning the minimum bounding rectangle (MBR) of the node according to the shape of the node's MBR and the shapes of the MBRs of the node's records. We found that this algorithm sometimes generates uneven splits: one of the two resulting nodes may contain fewer records than the required minimum. We therefore developed an algorithm that balances such uneven splits to satisfy the R-tree definition. Finally, we improved the node splitting algorithm by considering the siblings of the splitting node. The siblings are collected during the phase that chooses the node into which the record is inserted, and are processed together with the splitting node to generate a better result. We performed several experiments comparing our node splitting algorithms with other node splitting algorithms on both synthetic and real-world data. These experiments measured the tree-construction cost and the query performance of the resulting R-trees. The results showed that the tree-construction cost of our algorithm was lower than that of the others and that, although the balancing procedure degraded performance, our algorithm outperformed the other node splitting algorithms in queries on the resulting R-trees.
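The abstract does not spell out the partition rule, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes two-dimensional records, cuts the node's MBR at the midpoint of its longer side, assigns each record to the half containing its center, and then moves the records nearest the cut into an underfull group until both groups reach the minimum. The names `Rect`, `mbr`, `split_node` and `min_fill` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle (an MBR) in two dimensions."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def center(self, axis: int) -> float:
        """Center coordinate along axis 0 (x) or axis 1 (y)."""
        lo, hi = ((self.xmin, self.xmax), (self.ymin, self.ymax))[axis]
        return (lo + hi) / 2.0


def mbr(rects: List[Rect]) -> Rect:
    """Minimum bounding rectangle enclosing all given rectangles."""
    return Rect(
        min(r.xmin for r in rects),
        min(r.ymin for r in rects),
        max(r.xmax for r in rects),
        max(r.ymax for r in rects),
    )


def split_node(records: List[Rect], min_fill: int) -> Tuple[List[Rect], List[Rect]]:
    """Split an overflowing node by partitioning its MBR, then rebalance.

    Assumption (not from the paper): the cut is the midpoint of the longer
    side of the node MBR, and a record belongs to the half holding its center.
    """
    if len(records) < 2 * min_fill:
        raise ValueError("need at least 2 * min_fill records to split")

    box = mbr(records)
    # Cut across the longer side of the node MBR (axis 0 = x, axis 1 = y).
    axis = 0 if (box.xmax - box.xmin) >= (box.ymax - box.ymin) else 1
    cut = box.center(axis)

    # Single linear pass over the records.
    low = [r for r in records if r.center(axis) <= cut]
    high = [r for r in records if r.center(axis) > cut]

    # Balancing step: hand the records closest to the cut to the underfull
    # group. Sorting keeps the sketch short but is not linear time.
    if len(low) < min_fill:
        high.sort(key=lambda r: r.center(axis))
        need = min_fill - len(low)
        low, high = low + high[:need], high[need:]
    elif len(high) < min_fill:
        low.sort(key=lambda r: r.center(axis))
        need = min_fill - len(high)
        low, high = low[:-need], low[-need:] + high
    return low, high
```

With four unit squares clustered near the origin, one far to the right, and `min_fill=2`, the initial partition leaves the right group with a single record, so the balancing step moves one of the clustered squares across the cut. A selection step could replace the sort if the balancing pass also needs to stay linear.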
