Authors: Mayumbo Nyirenda, David Zulu
Published in: 2017 International Symposium on Networks, Computers and Communications (ISNCC), 2017-05-01
DOI: 10.1109/ISNCC.2017.8072032
Speeding up construction of distributed quadtrees for big-data analytics applications using dilated integers and hashmaps
Fast retrieval and insertion of data is critical to spatial big data analytics applications, yet this access is one of the bottlenecks in large-scale, spatial data-centric applications. Distributed spatial indexing structures such as quadtrees have been proposed to alleviate this bottleneck. Some of the proposed solutions use a static sample of the data to build a quadtree that serves as a directory structure for locating distributed data servers. In this paper, we take into account query redirection both during construction of the distributed quadtree and during data retrieval. We propose exploiting the static nature of the data sample, together with hashmaps and dilated integers, to speed up traversal of the directory. We conduct experiments on construction and data querying and show that both construction and querying performance improve threefold compared with the previously proposed approach. Further experiments show that the new approach is much less sensitive to data skewness. Overall, our results show that dilated integers coupled with hashmaps can improve the performance of distributed spatial indexing structures used to alleviate the data access bottleneck in big data spatial analytics.
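The abstract does not spell out the data layout, so the following is only a minimal sketch of how dilated integers and a hashmap might replace pointer-chasing in a quadtree directory. It assumes 16-bit integer grid coordinates, a Morton (bit-interleaved) code built from two dilated integers, and a Python `dict` keyed by `(depth, code)` that maps each leaf cell to a server name. The names `dilate`, `morton`, `cell_key`, `locate`, `MAX_DEPTH`, and the server labels are all hypothetical and are not taken from the paper.

```python
MAX_DEPTH = 16  # assumed grid resolution: coordinates lie in [0, 2**16)


def dilate(x: int) -> int:
    """Spread the low 16 bits of x so a zero bit sits between each pair of bits."""
    x &= 0x0000FFFF
    x = (x | (x << 8)) & 0x00FF00FF
    x = (x | (x << 4)) & 0x0F0F0F0F
    x = (x | (x << 2)) & 0x33333333
    x = (x | (x << 1)) & 0x55555555
    return x


def morton(ix: int, iy: int) -> int:
    """Interleave two dilated integers: x bits in even positions, y bits in odd."""
    return dilate(ix) | (dilate(iy) << 1)


def cell_key(ix: int, iy: int, depth: int) -> tuple[int, int]:
    """Key of the quadtree cell at `depth` that contains grid point (ix, iy).

    Dropping the low 2 * (MAX_DEPTH - depth) bits of the Morton code leaves
    the path from the root to that cell, two bits per level.
    """
    return depth, morton(ix, iy) >> (2 * (MAX_DEPTH - depth))


def locate(directory: dict[tuple[int, int], str], ix: int, iy: int):
    """Return the server owning (ix, iy), or None if no leaf covers it.

    The directory is assumed static and to hold only leaf cells, so at most
    one depth can match; each probe is a single hash lookup.
    """
    for depth in range(MAX_DEPTH, -1, -1):
        server = directory.get(cell_key(ix, iy, depth))
        if server is not None:
            return server
    return None


# Hypothetical directory: the root is split into four quadrants, and the
# quadrant with code 0 is split once more into four depth-2 leaves.
directory = {
    (1, 1): "server-B", (1, 2): "server-C", (1, 3): "server-D",
    (2, 0): "server-A0", (2, 1): "server-A1",
    (2, 2): "server-A2", (2, 3): "server-A3",
}
```

With this layout, `locate(directory, 20000, 100)` resolves to the depth-2 leaf `"server-A1"` and `locate(directory, 40000, 100)` to the depth-1 leaf `"server-B"`, without walking child pointers from the root. Whether the paper probes depths linearly, as here, or uses another strategy is not stated in the abstract.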