Maximizing Range Sum in External Memory

IF 2.2; Zone 2 (Computer Science); JCR Q3 (Computer Science, Information Systems)
Dong-Wan Choi, C. Chung, Yufei Tao
DOI: 10.1145/2629477 (https://doi.org/10.1145/2629477)
Journal: ACM Transactions on Database Systems, volume 68 1, pages 21:1-21:44
Published: 2014-10-07
Publication type: Journal Article
Citations: 25

Abstract

This article studies the MaxRS problem in spatial databases. Given a set O of weighted points and a rectangle r of a given size, the goal of the MaxRS problem is to find a location of r such that the sum of the weights of all the points covered by r is maximized. This problem is useful in many location-based services, such as finding the best place for a new franchise store with a limited delivery range, or finding the hotspot with the largest number of nearby attractions for a tourist with a limited reachable range. However, the problem has been studied mainly from a theoretical perspective, particularly in computational geometry. The existing algorithms from the computational geometry community are in-memory algorithms that do not guarantee scalability. In this article, we propose a scalable external-memory algorithm (ExactMaxRS) for the MaxRS problem that is optimal in terms of I/O complexity. In addition, we propose an approximation algorithm (ApproxMaxCRS) for the MaxCRS problem, the circular version of the MaxRS problem. We prove the correctness and optimality of the ExactMaxRS algorithm along with the approximation bound of the ApproxMaxCRS algorithm.

Furthermore, motivated by the fact that all the existing solutions simply assume that there is no tied area for the best location, we extend the MaxRS problem to a more fundamental problem, namely AllMaxRS, so that all the locations with the same best score can be retrieved. We first prove that the AllMaxRS problem cannot be trivially solved by applying the techniques for the MaxRS problem. Then we propose an output-sensitive external-memory algorithm (TwoPhaseMaxRS) that gives the exact solution for the AllMaxRS problem in two phases. We also prove both the soundness and completeness of the result returned by TwoPhaseMaxRS.

From extensive experimental results, we show that ExactMaxRS and ApproxMaxCRS are several orders of magnitude faster than methods adapted from existing algorithms, that the approximation bound in practice is much better than the theoretical bound of ApproxMaxCRS, and that TwoPhaseMaxRS is not only much faster but also more robust than the straightforward extension of ExactMaxRS.
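
To make the problem statement concrete, the following is a minimal in-memory sketch of the MaxRS definition only. It is not the ExactMaxRS algorithm from the article and makes no attempt at I/O efficiency. It assumes non-negative weights and a closed rectangle (points on the boundary count as covered); under those assumptions an optimal rectangle can be slid until its left and bottom edges pass through point coordinates, so only those positions are tried. The function name `max_rs_bruteforce` and the choice to report the lower-left corner are illustrative, not taken from the article.

```python
from itertools import product


def max_rs_bruteforce(points, width, height):
    """Naive MaxRS: try every (x, y) drawn from the points' coordinates
    as the lower-left corner of a width x height rectangle.

    points: list of (x, y, weight) tuples, weight >= 0 (assumption).
    Returns (best_score, lower_left_corner); (0, None) for empty input.
    Runs in O(n^3) time, so it is only suitable for tiny inputs.
    """
    if not points:
        return 0, None

    xs = sorted({x for x, _, _ in points})
    ys = sorted({y for _, y, _ in points})

    best_score, best_corner = float("-inf"), None
    for left, bottom in product(xs, ys):
        # Sum the weights of all points inside the closed rectangle.
        score = sum(
            wt
            for x, y, wt in points
            if left <= x <= left + width and bottom <= y <= bottom + height
        )
        if score > best_score:
            best_score, best_corner = score, (left, bottom)
    return best_score, best_corner


if __name__ == "__main__":
    demo = [(0, 0, 1), (1, 1, 2), (5, 5, 4), (6, 5, 1)]
    print(max_rs_bruteforce(demo, 2, 2))  # (5, (5, 5))
```

The sketch returns only the first best placement it finds. Reporting every tied placement, which is what the AllMaxRS problem asks for, is not covered here.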
Source journal
ACM Transactions on Database Systems (Engineering and Technology; Computer Science: Software Engineering)
CiteScore: 5.60
Self-citation rate: 0.00%
Articles published: 15
Review time: >12 weeks
Journal description: Heavily used in both academic and corporate R&D settings, ACM Transactions on Database Systems (TODS) is a key publication for computer scientists working in data abstraction, data modeling, and designing data management systems. Topics include storage and retrieval, transaction management, distributed and federated databases, semantics of data, intelligent databases, and operations and algorithms relating to these areas. In this rapidly changing field, TODS provides insights into the thoughts of the best minds in database R&D.