MR-Cubes: On-the-Fly Computation of Location Popularity from Check-in Data Streams

G. Constantinou, Chrysovalantis Anastasiou, Dimitris Stripelis, C. Shahabi
Published in: 2019 20th IEEE International Conference on Mobile Data Management (MDM), June 2019
DOI: 10.1109/MDM.2019.00-77 (https://doi.org/10.1109/MDM.2019.00-77)

Abstract

Several applications in urban planning, ride-sharing, and marketing require access to the location popularity of a geographical area (e.g., city block, city, county) in near real time and at different resolutions. To picture such access, imagine a visualization tool that renders a heatmap of a region's location popularity on the fly as a user zooms in and out. The access method behind such seamless visualization must support: 1) updating the heatmap cells frequently, as the raw data (e.g., check-ins) arrives at a high rate in a streaming fashion, and 2) splitting and merging adjacent cells quickly to support zooming in and out, respectively. This is challenging because the most useful metric for location popularity, location entropy, requires counting the number of unique visits per user, and hence: 1) a large data structure must be maintained and updated per cell, and 2) adjacent cells must be aggregated and disaggregated quickly even though unique visits are not additive. Because of these challenges, previous techniques for OLAP cubes, streaming sketches, and index structures are not effective. In this paper, we propose a new index structure called MR-Cube that approximates popularity by maintaining sketches of the streamed data per cell, supports time decay for older visits, and aggregates the non-additive location popularity quickly and accurately at different resolutions. We evaluate the accuracy and efficiency of MR-Cube using real-world and synthetic datasets and show its utility for our application.
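
To illustrate why location entropy is not additive across cells, the following minimal sketch computes it exactly from per-user visit counts. It assumes the common definition of location entropy as the Shannon entropy of the per-user visit distribution in a cell. The names `location_entropy` and `merge_cells` and the toy check-in counts are hypothetical, and this exact computation is not the sketch-based approximation that MR-Cube maintains.

```python
import math
from collections import Counter


def location_entropy(visits_per_user):
    """Shannon entropy (natural log) of the per-user visit distribution in one cell.

    Assumed definition: H = -sum_u p_u * ln(p_u), where p_u is the fraction of
    the cell's visits made by user u.
    """
    total = sum(visits_per_user.values())
    if total == 0:
        return 0.0
    return -sum(
        (count / total) * math.log(count / total)
        for count in visits_per_user.values()
        if count > 0
    )


def merge_cells(cell_a, cell_b):
    """Merge two adjacent cells by summing their per-user visit counts."""
    return cell_a + cell_b  # Counter addition sums counts key by key


# Hypothetical toy data: user id -> number of check-ins in the cell.
cell_a = Counter({"u1": 3, "u2": 1})
cell_b = Counter({"u1": 1, "u3": 1})

merged = merge_cells(cell_a, cell_b)
h_a = location_entropy(cell_a)
h_b = location_entropy(cell_b)
h_merged = location_entropy(merged)

# The entropy of the merged cell differs from the sum of the two cell
# entropies, so zooming out cannot simply add precomputed cell values.
print(h_a, h_b, h_merged)
```

In this exact version, each cell has to keep a count for every user who visited it, which is the per-cell storage cost the abstract refers to. Merging works only because the full per-user counts are still available.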