Comparative Evaluation of Various Indexing Techniques of Geospatial Vector Data for Processing in Distributed Computing Environment

Abdul Jhummarwala, Mazin Alkathiri, Miren Karamta, M. Potdar
{"title":"Comparative Evaluation of Various Indexing Techniques of Geospatial Vector Data for Processing in Distributed Computing Environment","authors":"Abdul Jhummarwala, Mazin Alkathiri, Miren Karamta, M. Potdar","doi":"10.1145/2998476.2998493","DOIUrl":null,"url":null,"abstract":"The explosion of ever increasing geospatial data is today met with the challenge of maintaining it in spatial databases and utilization of traditional methods of spatial data processing. The sheer volume and complexity of spatial databases makes them an ideal candidate for use with parallel and distributed processing architectures. There is a lot of enthusiasm toward using MapReduce paradigm and distributed computing for processing of large volumes of vector data. As spatial data cannot be indexed using traditional B-tree structures used by R/DBMS, several libraries such as JSI (Java Spatial Index), libspatialindex and SpatiaLite depend upon advanced data structures such as R/R*-tree, Quad-tree and their variants for spatial indexing. These indexing mechanisms have also been natively incorporated in frameworks such as Spatial Hadoop, Hadoop GIS SATO and GeoSpark. Additionally, most widely used open source RDBMS such as MySQL, Postgres and SQLite incorporate spatial indexing using extensions/add-ons. In this paper, we benchmark and compare the performance of various spatial indexing mechanisms in addition to evaluating the performance of distributed frameworks for planet sized datasets. We conclude by highlighting the characteristics of spatial tools and frameworks for better selection and implementation of R-tree indexing in a big geo-spatial processing system.","PeriodicalId":171399,"journal":{"name":"Proceedings of the 9th Annual ACM India Conference","volume":"112 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 9th Annual ACM India Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2998476.2998493","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

The explosion of ever increasing geospatial data is today met with the challenge of maintaining it in spatial databases and utilization of traditional methods of spatial data processing. The sheer volume and complexity of spatial databases makes them an ideal candidate for use with parallel and distributed processing architectures. There is a lot of enthusiasm toward using MapReduce paradigm and distributed computing for processing of large volumes of vector data. As spatial data cannot be indexed using traditional B-tree structures used by R/DBMS, several libraries such as JSI (Java Spatial Index), libspatialindex and SpatiaLite depend upon advanced data structures such as R/R*-tree, Quad-tree and their variants for spatial indexing. These indexing mechanisms have also been natively incorporated in frameworks such as Spatial Hadoop, Hadoop GIS SATO and GeoSpark. Additionally, most widely used open source RDBMS such as MySQL, Postgres and SQLite incorporate spatial indexing using extensions/add-ons. In this paper, we benchmark and compare the performance of various spatial indexing mechanisms in addition to evaluating the performance of distributed frameworks for planet sized datasets. We conclude by highlighting the characteristics of spatial tools and frameworks for better selection and implementation of R-tree indexing in a big geo-spatial processing system.
分布式计算环境下各种地理空间矢量数据索引处理技术的比较评价
随着地理空间数据的爆炸式增长,在空间数据库中维护地理空间数据和利用传统的空间数据处理方法面临着挑战。空间数据库的庞大容量和复杂性使其成为并行和分布式处理体系结构的理想选择。对于使用MapReduce范式和分布式计算来处理大量向量数据,人们有着很大的热情。由于空间数据不能使用R/DBMS使用的传统b树结构进行索引,一些库,如JSI (Java空间索引),libspatialindex和SpatiaLite依赖于高级数据结构,如R/R*-tree,四叉树及其变体来进行空间索引。这些索引机制也已经被整合到诸如Spatial Hadoop、Hadoop GIS SATO和GeoSpark等框架中。此外,大多数广泛使用的开源RDBMS(如MySQL、Postgres和SQLite)都使用扩展/附加组件合并了空间索引。在本文中,除了评估分布式框架在行星大小数据集上的性能外,我们还对各种空间索引机制的性能进行了基准测试和比较。最后,我们强调了空间工具和框架的特点,以便在大型地理空间处理系统中更好地选择和实施r树索引。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信