4DHI:在键值数据库中对遥感图像进行近似kNN搜索的索引

C. Hewage, A. Vo, Nhien-An Le-Khac, D. Laefer, M. Bertolotto
{"title":"4DHI:在键值数据库中对遥感图像进行近似kNN搜索的索引","authors":"C. Hewage, A. Vo, Nhien-An Le-Khac, D. Laefer, M. Bertolotto","doi":"10.1109/IC2E55432.2022.00025","DOIUrl":null,"url":null,"abstract":"State-of-the-art, scalable, indexing techniques in location-based image data retrieval are primarily focused on supporting window and range queries. However, support of these indexes is not well explored when there are multiple spatially similar images to retrieve for a given geographic location. Adoption of existing spatial indexes such as the kD-tree pose major scalability impediments. In response, this work proposes a novel scalable, key-value, database oriented, secondary-memory based, spatial index to retrieve the top $k$ most spatially similar images to a given geographic location. The proposed index introduces a 4-dimensional Hilbert index (4DHI). This space filling curve is implemented atop HBase (a key-value database). Experiments performed on both synthetically generated and real world data demonstrate comparable accuracy with MD-HBase (a state of the art, scalable, multidimensional point data management system) and better performance. Specifically, 4DHI yielded 34% - 39% storage improvements compared to the disk consumption of the original index of MD-HBase. The compactness in 4DHI also yielded up to 3.4 and 4.7 fold gains when retrieving 6400 and 12800 neighbours, respectively; compared to the adoption of original index of MD-HBase for respective neighbour searches. An optimization technique termed “Bounding Box Displacement” (BBD) is introduced to improve the accuracy of the top $k$ approximations in relation to the results of in-memory kD-tree. Finally, a method of reducing row key length is also discussed for the proposed 4DHI to further improve the storage efficiency and scalability in managing large numbers of remotely sensed images.","PeriodicalId":415781,"journal":{"name":"2022 IEEE International Conference on Cloud Engineering (IC2E)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"4DHI: An index for approximate kNN search of remotely sensed images in Key-Value databases\",\"authors\":\"C. Hewage, A. Vo, Nhien-An Le-Khac, D. Laefer, M. Bertolotto\",\"doi\":\"10.1109/IC2E55432.2022.00025\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"State-of-the-art, scalable, indexing techniques in location-based image data retrieval are primarily focused on supporting window and range queries. However, support of these indexes is not well explored when there are multiple spatially similar images to retrieve for a given geographic location. Adoption of existing spatial indexes such as the kD-tree pose major scalability impediments. In response, this work proposes a novel scalable, key-value, database oriented, secondary-memory based, spatial index to retrieve the top $k$ most spatially similar images to a given geographic location. The proposed index introduces a 4-dimensional Hilbert index (4DHI). This space filling curve is implemented atop HBase (a key-value database). Experiments performed on both synthetically generated and real world data demonstrate comparable accuracy with MD-HBase (a state of the art, scalable, multidimensional point data management system) and better performance. Specifically, 4DHI yielded 34% - 39% storage improvements compared to the disk consumption of the original index of MD-HBase. The compactness in 4DHI also yielded up to 3.4 and 4.7 fold gains when retrieving 6400 and 12800 neighbours, respectively; compared to the adoption of original index of MD-HBase for respective neighbour searches. An optimization technique termed “Bounding Box Displacement” (BBD) is introduced to improve the accuracy of the top $k$ approximations in relation to the results of in-memory kD-tree. Finally, a method of reducing row key length is also discussed for the proposed 4DHI to further improve the storage efficiency and scalability in managing large numbers of remotely sensed images.\",\"PeriodicalId\":415781,\"journal\":{\"name\":\"2022 IEEE International Conference on Cloud Engineering (IC2E)\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE International Conference on Cloud Engineering (IC2E)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IC2E55432.2022.00025\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Cloud Engineering (IC2E)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IC2E55432.2022.00025","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

在基于位置的图像数据检索中,最先进的、可扩展的索引技术主要集中在支持窗口和范围查询。然而,当一个给定的地理位置有多个空间相似的图像需要检索时,这些索引的支持并没有得到很好的探索。采用现有的空间索引(如kD-tree)会对可伸缩性造成很大的障碍。作为回应,这项工作提出了一种新的可扩展的、键值的、面向数据库的、基于二级存储器的空间索引,用于检索给定地理位置的前k个空间上最相似的图像。该指数引入了一个四维希尔伯特指数(4DHI)。这个空间填充曲线是在HBase(一个键值数据库)之上实现的。在合成生成的数据和真实世界的数据上进行的实验表明,MD-HBase(最先进的、可扩展的多维点数据管理系统)具有相当的准确性和更好的性能。具体来说,与MD-HBase的原始索引的磁盘消耗相比,4DHI产生了34% - 39%的存储改进。当检索6400个和12800个邻居时,4DHI的紧凑性也分别产生了3.4倍和4.7倍的增益;相对于采用MD-HBase的原始索引进行各自的邻居搜索。引入了一种称为“边界盒位移”(Bounding Box Displacement, BBD)的优化技术,以提高top $k$近似与内存中kD-tree结果的准确性。最后,本文还讨论了减少行键长度的方法,以进一步提高存储效率和管理大量遥感图像的可扩展性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
4DHI: An index for approximate kNN search of remotely sensed images in Key-Value databases
State-of-the-art, scalable, indexing techniques in location-based image data retrieval are primarily focused on supporting window and range queries. However, support of these indexes is not well explored when there are multiple spatially similar images to retrieve for a given geographic location. Adoption of existing spatial indexes such as the kD-tree pose major scalability impediments. In response, this work proposes a novel scalable, key-value, database oriented, secondary-memory based, spatial index to retrieve the top $k$ most spatially similar images to a given geographic location. The proposed index introduces a 4-dimensional Hilbert index (4DHI). This space filling curve is implemented atop HBase (a key-value database). Experiments performed on both synthetically generated and real world data demonstrate comparable accuracy with MD-HBase (a state of the art, scalable, multidimensional point data management system) and better performance. Specifically, 4DHI yielded 34% - 39% storage improvements compared to the disk consumption of the original index of MD-HBase. The compactness in 4DHI also yielded up to 3.4 and 4.7 fold gains when retrieving 6400 and 12800 neighbours, respectively; compared to the adoption of original index of MD-HBase for respective neighbour searches. An optimization technique termed “Bounding Box Displacement” (BBD) is introduced to improve the accuracy of the top $k$ approximations in relation to the results of in-memory kD-tree. Finally, a method of reducing row key length is also discussed for the proposed 4DHI to further improve the storage efficiency and scalability in managing large numbers of remotely sensed images.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信