{"title":"Building a Large-Scale Microscopic Road Network Traffic Simulator in Apache Spark","authors":"Zishan Fu, Jia Yu, Mohamed Sarwat","doi":"10.1109/MDM.2019.00-42","DOIUrl":"https://doi.org/10.1109/MDM.2019.00-42","url":null,"abstract":"Road network traffic data has been widely studied by researchers and practitioners in different areas such as urban planning, traffic prediction, and spatial-temporal databases. For instance, researchers use such data to evaluate the impact of road network changes. Unfortunately, collecting large-scale high-quality urban traffic data requires tremendous effort because participating vehicles must install GPS receivers and administrators must continuously monitor these devices. A number of urban traffic simulators attempt to generate such data with different features. However, they suffer from two critical issues: (1) scalability: most of them offer only a single-machine solution, which is not adequate to produce large-scale data, and some simulators can generate traffic in parallel but do not balance the load well among machines in a cluster; (2) granularity: many simulators do not consider microscopic traffic situations, including traffic lights, lane changing, and car following. In this paper, we propose GeoSparkSim, a scalable traffic simulator which extends Apache Spark to generate large-scale road network traffic datasets with microscopic traffic simulation. The proposed system seamlessly integrates with a Spark-based spatial data management system, GeoSpark, to deliver a holistic approach that allows data scientists to simulate, analyze and visualize large-scale urban traffic data. To implement microscopic traffic models, GeoSparkSim employs a simulation-aware vehicle partitioning method to partition vehicles among different machines such that each machine has a balanced workload. The experimental analysis shows that GeoSparkSim can simulate the movements of 200 thousand vehicles over a very large road network (250 thousand road junctions and 300 thousand road segments).","PeriodicalId":241426,"journal":{"name":"2019 20th IEEE International Conference on Mobile Data Management (MDM)","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115799918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Message from the Transport & Urban Analytics Track Chairs","authors":"","doi":"10.1109/mdm.2019.00-95","DOIUrl":"https://doi.org/10.1109/mdm.2019.00-95","url":null,"abstract":"","PeriodicalId":241426,"journal":{"name":"2019 20th IEEE International Conference on Mobile Data Management (MDM)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124127680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining Prevalent Co-Location Patterns Based on Global Topological Relations","authors":"Jialong Wang, Lizhen Wang, Xiaoxuan Wang","doi":"10.1109/MDM.2019.00-55","DOIUrl":"https://doi.org/10.1109/MDM.2019.00-55","url":null,"abstract":"Spatial co-location pattern mining is an important branch in the spatial data mining area, which discovers subsets of spatial features whose instances are frequently located together in the geographic space. The proximity between instances is defined by a distance threshold given by the user in traditional spatial co-location pattern mining. However, in most cases users, even experts, do not know which distance threshold is appropriate. Besides, different densities of instance distribution are not considered in a dataset when using a unified distance threshold to measure the proximity. Also, global topological relations of instances are ignored in mining. In this paper, we consider the global topological relations by constructing Delaunay triangulation of spatial instances and calculate a distance constraint for each instance based on the constructed Delaunay triangulation. We redefine the proximity of instances according to the distance constraint so that users don't have to worry about giving an appropriate distance threshold when mining prevalent co-location patterns. We propose a new algorithm PTB based on a proximity relationship tree P-tree which stores the proximity relationships between instances. The experimental evaluation on several real-world datasets shows that our algorithm can get better results. We also use synthetic datasets to evaluate how each parameter and the number of features and instances affect the efficiency of the algorithm.","PeriodicalId":241426,"journal":{"name":"2019 20th IEEE International Conference on Mobile Data Management (MDM)","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127298132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Android App Update Timing: A Measurement Study","authors":"Ed Novak, Chris Marchini","doi":"10.1109/MDM.2019.00118","DOIUrl":"https://doi.org/10.1109/MDM.2019.00118","url":null,"abstract":"Over the past decade the pace of software publication has increased dramatically. Thanks to the advent of \"app markets\", software distribution on mobile devices is centralized and software updates are, by default, fully automated. In this work we study the pace of software updates on Android smart mobile devices. Specifically, we measure the rate at which users experience software updates, and the delay between the time an app update is made available and the time it is actually installed. Our data shows that, of the top 12 most popular apps in our dataset, over 10 of them are updated more frequently than once every two weeks. On average, users install an update for at least one app every 56 hours.","PeriodicalId":241426,"journal":{"name":"2019 20th IEEE International Conference on Mobile Data Management (MDM)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128907381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Popularity Prediction Caching Using Hidden Markov Model for Vehicular Content Centric Networks","authors":"Lin Yao, Yuqi Wang, Qiufen Xia, Rui Xu","doi":"10.1109/MDM.2019.00115","DOIUrl":"https://doi.org/10.1109/MDM.2019.00115","url":null,"abstract":"Vehicular Content Centric Network (VCCN) is proposed to cope with mobility and intermittent connectivity issues of vehicular ad hoc networks by enabling the Content Centric Network (CCN) model in vehicular networks. The ubiquitous in-network caching of VCCN allows nodes to cache frequently accessed data items, improving the hit ratio of content retrieval and reducing the data access delay. Furthermore, it can significantly mitigate bandwidth pressure. Therefore, it is crucial to cache more popular contents at various caching nodes. In this paper, we propose a novel cache replacement scheme named Popularity-based Content Caching (PopCC), which incorporates the future popularity of contents into our decision making. We adopt a Hidden Markov Model (HMM) to predict the content popularity based on the inherent characteristics of the received interests, request ratio, request frequency and content priority. To evaluate the performance of our proposed scheme PopCC, we compare it with some state-of-the-art schemes in terms of cache hit, average access delay, average hop count and average storage usage. Simulations demonstrate that the proposed scheme achieves better performance.","PeriodicalId":241426,"journal":{"name":"2019 20th IEEE International Conference on Mobile Data Management (MDM)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132383446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Clustering Noisy Trajectories via Robust Deep Attention Auto-Encoders","authors":"Rui Zhang, Peng Xie, Hongbo Jiang, Zhu Xiao, Chen Wang, Ling Liu","doi":"10.1109/MDM.2019.00-73","DOIUrl":"https://doi.org/10.1109/MDM.2019.00-73","url":null,"abstract":"Trajectory clustering aims at grouping similar trajectories into one cluster. It is an efficient way of finding the representative path or common trend shared by different moving objects, and also provides a foundation for movement pattern mining, anomaly detection and other applications. Existing trajectory clustering studies mainly rely on feature selection and similarity measurement based on their geographical and spatial properties. However, one obstacle hindering their wide usage is the problem of clustering accuracy in the presence of noisy or incomplete sensing data, due to limited sensory device quantity, communication errors, sensor failures, and sensor vacancy. This paper proposes an error-tolerant trajectory clustering approach by incorporating denoising methods. We propose the Robust Deep Attention Auto-encoders model (called Robust DAA) to learn the representations of low-dimensional denoising trajectories with three novel features. First, we present the deep attention auto-encoders by integrating the attention mechanism into the classical deep auto-encoder, which is capable of enhancing feature propagation and feature selection. Second, we train the deep attention auto-encoder by applying the proximal method, back propagation and the Alternating Direction Method of Multipliers (ADMM). As a result, our Robust DAA can reduce the negative influence of the noise on trajectory data. Finally, we perform clustering over the low-dimensional denoising representations using traditional clustering algorithms and demonstrate the quality of the clustering results by comparing our approach with existing representative methods. Extensive experiments are conducted on both synthetic datasets and real datasets. The results show that our approach outperforms the existing models in terms of accuracy, precision, recall and F1-score.","PeriodicalId":241426,"journal":{"name":"2019 20th IEEE International Conference on Mobile Data Management (MDM)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130235078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LaCAVR: Load and Constraints Aware Vehicle Rerouting","authors":"D. Bis, Noah Bix, Benjamin Gruman, S. Guenette, Adam Hauge, Hannah Moser, Jimmy Paul, Goce Trajcevski","doi":"10.1109/MDM.2019.00-32","DOIUrl":"https://doi.org/10.1109/MDM.2019.00-32","url":null,"abstract":"We present a prototype system for effective management of a delivery fleet in settings in which traffic abnormalities may necessitate rerouting of (some of) the trucks. Unforeseen congestion (e.g., due to accidents) may affect the average speed along road segments that were used to calculate the routes of a particular truck. Complementary to the traditional (re)routing approaches where the main objective is to find the new shortest route to the same destination but under the changed traffic circumstances, we incorporate two additional constraints. Namely, we aim at striking a balance between minimizing the additional expenses due to drivers' overtime pay and maximizing the delivery of the goods still available on the truck's load, possibly by changing the original destinations. The project is developed with an actual industry partner whose main business is managing supplies for office pantries, kitchens and cafés.","PeriodicalId":241426,"journal":{"name":"2019 20th IEEE International Conference on Mobile Data Management (MDM)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128024941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Offline Worker Selection for Real-Time Spatial Crowdsourcing Multi-Worker Tasks","authors":"Yongjian Zhao, Qi Han","doi":"10.1109/MDM.2019.00117","DOIUrl":"https://doi.org/10.1109/MDM.2019.00117","url":null,"abstract":"Spatial crowdsourcing consists of location-specific tasks that require people to be physically at specific locations to complete them. In this paper we focus on worker selection for spatial crowdsourcing where each task requires multiple workers to accomplish. We mathematically formulate the problem and prove its APX-hardness. We develop efficient greedy algorithms with a good approximation ratio. Our proposed algorithm outperforms the state-of-the-art approach by 35%.","PeriodicalId":241426,"journal":{"name":"2019 20th IEEE International Conference on Mobile Data Management (MDM)","volume":"205 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131250087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Time-Dependent Reachability Analysis: A Data-Driven Approach","authors":"Chrysovalantis Anastasiou, Chao Huang, S. H. Kim, C. Shahabi","doi":"10.1109/MDM.2019.00-64","DOIUrl":"https://doi.org/10.1109/MDM.2019.00-64","url":null,"abstract":"An isochrone is generally defined as a curve drawn on a map connecting points at which moving objects (e.g., cars) arrive at the same time. Their construction is an important task in many application domains. As an example, in urban planning, isochrones are essential when assessing the placement of public services like hospitals and fire departments. In this study, we formally define the isochrone and reverse isochrone problems, describe our approach to solving them and provide a fully functional system that is capable of visualizing the reachability in various ways. Unlike other studies, our approach is purely data-driven and does not depend on the underlying road network for computing the isochrone. Instead, we focus on directly processing trajectory data. Our system processes two real-world taxi datasets to visualize the reachability of the cities of Seoul and Xi'an. As our experiments show, our approach outperforms the traditional graph-theory techniques while eliminating the expensive need of preprocessing the data.","PeriodicalId":241426,"journal":{"name":"2019 20th IEEE International Conference on Mobile Data Management (MDM)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129193595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Trajectory Prediction from a Mass of Sparse and Missing External Sensor Data","authors":"L. A. Cruz, K. Zeitouni, J. Macêdo","doi":"10.1109/MDM.2019.00-43","DOIUrl":"https://doi.org/10.1109/MDM.2019.00-43","url":null,"abstract":"In this paper, we predict the movement of objects under the circumstance where external sensors placed on the road-sides (e.g., traffic surveillance cameras) capture their trajectories. This type of trajectory may have very different mobility patterns since it is not restricted to a fleet or a community of users. However, the reported positions are sparse due to the sparsity of the sensor distribution, and incomplete, since the sensors may fail to register the passage of objects. In this paper, we first analyze such external sensor trajectories based on a real dataset, which shows evidence of their sparsity and incompleteness, problems that hinder location prediction. In this context, we propose an approach for coping with the missing data problem. We discuss how to apply this approach in conjunction with predictors based on Recurrent Neural Networks. In particular, we adjust the accuracy metrics to account for missing values in the test set, by introducing the distance between the predicted location and the registered next location. We evaluate our approach against the baselines, showing an improvement of about 23% in the prediction accuracy while reducing the overall distances. In spite of the contribution of many works in location prediction, to the best of our knowledge, none of those works have studied location prediction for trajectories based on external (road-side) sensor data.","PeriodicalId":241426,"journal":{"name":"2019 20th IEEE International Conference on Mobile Data Management (MDM)","volume":"143 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116050313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}