A Topology-concerned Spatial Vector Data Model for Column-oriented Databases
Kun Zheng, M. Kwan, Falin Fang, Junjun Yin, D. Gu, Yanli Fu
International Journal of Database Theory and Application, pp. 33-46, published 2017-05-31
DOI: 10.14257/IJDTA.2017.10.5.04 (https://doi.org/10.14257/IJDTA.2017.10.5.04)
Citations: 0
Abstract
In today’s “Big Data” era, the volume of spatial data is growing rapidly, and efficient storage and management of spatial Big Data has become an urgent challenge. Conventional row-based spatial databases have many limitations, such as low data I/O efficiency, low data retrieval performance, poor scalability, and high maintenance costs, and they are no longer suitable for today’s spatial Big Data. Column-oriented databases, by contrast, offer high reliability, scalability, and fault tolerance and, more importantly, better I/O efficiency for query processing. This paper presents a topology-concerned spatial vector data model for column-oriented databases and designs its physical storage model, a unified model for storing and managing the geometry, attributes, and topology of spatial objects. To suit the storage characteristics of column-oriented databases, the model introduces a new Rowkey encoding schema based on the Z-order space-filling curve. Because this encoding takes spatial proximity into account, spatial objects that are near each other also tend to be stored close together in physical storage, which improves the efficiency of spatial data storage and enables spatial queries in column-oriented databases. Three experiments (data storage, range query, and K-NN query) were conducted to analyze the efficiency and spatial query capability of the data model. The results show that the model scales well and is efficient for vector data storage and spatial queries, making it suitable for large-scale spatial vector data storage and management in column-oriented databases.
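The abstract does not spell out the Rowkey layout, so the following is only a minimal sketch of the general Z-order (Morton) idea it refers to, not the authors' schema. It assumes point coordinates in longitude/latitude, a 16-bit grid per axis, and a hex-string key; the function names `quantize`, `interleave_bits`, and `z_order_rowkey` are hypothetical and chosen for illustration.

```python
def quantize(value: float, lo: float, hi: float, bits: int = 16) -> int:
    """Map a coordinate in [lo, hi] onto an integer grid cell in [0, 2**bits - 1]."""
    max_cell = (1 << bits) - 1
    ratio = (value - lo) / (hi - lo)
    # Clamp so that out-of-range inputs still yield a valid cell index.
    return min(max_cell, max(0, int(ratio * max_cell)))


def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Build a Morton code: bits of x go to even positions, bits of y to odd ones."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def z_order_rowkey(lon: float, lat: float, bits: int = 16) -> str:
    """Return a fixed-width hex key (an assumed format) for a lon/lat point."""
    x = quantize(lon, -180.0, 180.0, bits)
    y = quantize(lat, -90.0, 90.0, bits)
    width = (2 * bits + 3) // 4  # number of hex digits needed for 2*bits bits
    return format(interleave_bits(x, y, bits), f"0{width}x")


if __name__ == "__main__":
    # Sorting by key mimics the lexicographic row order of a column-oriented store.
    points = {"a": (114.30, 30.60), "b": (114.31, 30.61), "c": (-73.97, 40.78)}
    for name, (lon, lat) in sorted(points.items(), key=lambda kv: z_order_rowkey(*kv[1])):
        print(name, z_order_rowkey(lon, lat))
```

Because such a store keeps rows sorted by key, points that fall in the same grid region share a key prefix and tend to sit in adjacent rows, which is the locality property the abstract describes. The Z-order curve does have discontinuities, so two neighbouring cells can occasionally end up far apart in key order; a range or K-NN query built on keys like these generally needs to scan several key intervals and filter the candidates.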