{"title":"Linking Multiple User Identities of Multiple Services from Massive Mobility Traces","authors":"Huandong Wang, Yong Li, Gang Wang, Depeng Jin","doi":"10.1145/3439817","DOIUrl":"https://doi.org/10.1145/3439817","url":null,"abstract":"Understanding the linkability of online user identifiers (IDs) is critical to both service providers (for business intelligence) and individual users (for assessing privacy risks). Existing methods are designed to match IDs across two services but face key challenges of matching multiple services in practice, particularly when users have multiple IDs per service. In this article, we propose a novel system to link IDs across multiple services by exploring the spatial-temporal features of user activities, of which the core idea is that the same user's online IDs are more likely to repeatedly appear at the same location. Specifically, we first utilize a contact graph to capture the “co-location” of all IDs across multiple services. Based on this graph, we propose a set-wise matching algorithm to discover candidate ID sets and use Bayesian inference to generate confidence scores for candidate ranking, which is proved to be optimal. We evaluate our system using two real-world ground-truth datasets from an Internet service provider (4 services, 815K IDs) and Twitter-Foursquare (2 services, 770 IDs). Extensive results show that our system significantly outperforms the state-of-the-art algorithms in accuracy (AUC is higher by 0.1–0.2), and it is highly robust against data quality, matching order, and number of services.","PeriodicalId":123526,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology (TIST)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130431538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TLDS: A Transfer-Learning-Based Delivery Station Location Selection Pipeline","authors":"Chenyu Hou, Bin Cao, Sijie Ruan, Jing Fan","doi":"10.1145/3469084","DOIUrl":"https://doi.org/10.1145/3469084","url":null,"abstract":"Delivery stations play important roles in logistics systems. Well-designed delivery station planning can improve delivery efficiency significantly. However, existing delivery station locations are decided by experts, which requires much preliminary research and data collection work. It is not only time consuming but also expensive for logistics companies. Therefore, in this article, we propose a data-driven pipeline that can transfer expert knowledge among cities and automatically allocate delivery stations. Based on existing well-designed station location planning in the source city, we first train a model to learn the expert knowledge about delivery range selection for each station. Then we transfer the learned knowledge to a new city and design three strategies to select delivery stations for the new city. Due to the differences in characteristics among different cities, we adopt a transfer learning method to eliminate the domain difference so that the model can be adapted to a new city well. Finally, we conduct extensive experiments based on real-world datasets and find the proposed method can solve the problem well.","PeriodicalId":123526,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology (TIST)","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123044561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Amin Vahedian Khezerlou, Xun Zhou, Xinyi Li, W. Street, Yanhua Li
{"title":"DILSA+: Predicting Urban Dispersal Events through Deep Survival Analysis with Enhanced Urban Features","authors":"Amin Vahedian Khezerlou, Xun Zhou, Xinyi Li, W. Street, Yanhua Li","doi":"10.1145/3469085","DOIUrl":"https://doi.org/10.1145/3469085","url":null,"abstract":"Urban dispersal events occur when an unexpectedly large number of people leave an area in a relatively short period of time. It is beneficial for the city authorities, such as law enforcement and city management, to have an advance knowledge of such events, as it can help them mitigate the safety risks and handle important challenges such as managing traffic, and so forth. Predicting dispersal events is also beneficial to Taxi drivers and/or ride-sharing services, as it will help them respond to an unexpected demand and gain competitive advantage. Large urban datasets such as detailed trip records and point of interest (POI) data make such predictions achievable. The related literature mainly focused on taxi demand prediction. The pattern of the demand was assumed to be repetitive and proposed methods aimed at capturing those patterns. However, dispersal events are, by definition, violations of those patterns and are, understandably, missed by the methods in the literature. We proposed a different approach in our prior work [32]. We showed that dispersal events can be predicted by learning the complex patterns of arrival and other features that precede them in time. We proposed a survival analysis formulation of this problem and proposed a two-stage framework (DILSA), where a deep learning model predicted the survival function at each point in time in the future. We used that prediction to determine the time of the dispersal event in the future, or its non-occurrence. However, DILSA is subject to a few limitations. First, based on evidence from the data, mobility patterns can vary through time at a given location. DILSA does not distinguish between different mobility patterns through time. Second, mobility patterns are also different for different locations. DILSA does not have the capability to directly distinguish between different locations based on their mobility patterns. In this article, we address these limitations by proposing a method to capture the interaction between POIs and mobility patterns and we create vector representations of locations based on their mobility patterns. We call our new method DILSA+. We conduct extensive case studies and experiments on the NYC Yellow taxi dataset from 2014 to 2016. Results show that DILSA+ can predict events in the next 5 hours with an F1-score of 0.66. It is significantly better than DILSA and the state-of-the-art deep learning approaches for taxi demand prediction.","PeriodicalId":123526,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology (TIST)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121868798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reyhan Aydoğan, Özgür Kafali, Furkan Arslan, C. Jonker, Munindar P. Singh
{"title":"Nova: Value-based Negotiation of Norms","authors":"Reyhan Aydoğan, Özgür Kafali, Furkan Arslan, C. Jonker, Munindar P. Singh","doi":"10.1145/3465054","DOIUrl":"https://doi.org/10.1145/3465054","url":null,"abstract":"Specifying a normative multiagent system (nMAS) is challenging, because different agents often have conflicting requirements. Whereas existing approaches can resolve clear-cut conflicts, tradeoffs might occur in practice among alternative nMAS specifications with no apparent resolution. To produce an nMAS specification that is acceptable to each agent, we model the specification process as a negotiation over a set of norms. We propose an agent-based negotiation framework, where agents’ requirements are represented as values (e.g., patient safety, privacy, and national security), and an agent revises the nMAS specification to promote its values by executing a set of norm revision rules that incorporate ontology-based reasoning. To demonstrate that our framework supports creating a transparent and accountable nMAS specification, we conduct an experiment with human participants who negotiate against our agent. Our findings show that our negotiation agent reaches better agreements (with small p-value and large effect size) faster than a baseline strategy. Moreover, participants perceive that our agent enables more collaborative and transparent negotiations than the baseline (with small p-value and large effect size in particular settings) toward reaching an agreement.","PeriodicalId":123526,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology (TIST)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122616067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anbu Huang, Yang Liu, Tianjian Chen, Yongkai Zhou, Quan Sun, Hongfeng Chai, Qiang Yang
{"title":"StarFL: Hybrid Federated Learning Architecture for Smart Urban Computing","authors":"Anbu Huang, Yang Liu, Tianjian Chen, Yongkai Zhou, Quan Sun, Hongfeng Chai, Qiang Yang","doi":"10.1145/3467956","DOIUrl":"https://doi.org/10.1145/3467956","url":null,"abstract":"From facial recognition to autonomous driving, Artificial Intelligence (AI) will transform the way we live and work over the next couple of decades. Existing AI approaches for urban computing suffer from various challenges, including dealing with synchronization and processing of vast amount of data generated from the edge devices, as well as the privacy and security of individual users, including their bio-metrics, locations, and itineraries. Traditional centralized-based approaches require data in each organization be uploaded to the central database, which may be prohibited by data protection acts, such as GDPR and CCPA. To decouple model training from the need to store the data in the cloud, a new training paradigm called Federated Learning (FL) is proposed. FL enables multiple devices to collaboratively learn a shared model while keeping the training data on devices locally, which can significantly mitigate privacy leakage risk. However, under urban computing scenarios, data are often communication-heavy, high-frequent, and asynchronized, posing new challenges to FL implementation. To handle these challenges, we propose a new hybrid federated learning architecture called StarFL. By combining with Trusted Execution Environment (TEE), Secure Multi-Party Computation (MPC), and (Beidou) satellites, StarFL enables safe key distribution, encryption, and decryption, and provides a verification mechanism for each participant to ensure the security of the local data. In addition, StarFL can provide accurate timestamp matching to facilitate synchronization of multiple clients. All these improvements make StarFL more applicable to the security-sensitive scenarios for the next generation of urban computing.","PeriodicalId":123526,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology (TIST)","volume":"65 12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126014230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stanley Ebhohimhen Abhadiomhen, Zhiyang Wang, Xiangjun Shen, Jianping Fan
{"title":"Multiview Common Subspace Clustering via Coupled Low Rank Representation","authors":"Stanley Ebhohimhen Abhadiomhen, Zhiyang Wang, Xiangjun Shen, Jianping Fan","doi":"10.1145/3465056","DOIUrl":"https://doi.org/10.1145/3465056","url":null,"abstract":"Multi-view subspace clustering (MVSC) finds a shared structure in latent low-dimensional subspaces of multi-view data to enhance clustering performance. Nonetheless, we observe that most existing MVSC methods neglect the diversity in multi-view data by considering only the common knowledge to find a shared structure either directly or by merging different similarity matrices learned for each view. In the presence of noise, this predefined shared structure becomes a biased representation of the different views. Thus, in this article, we propose a MVSC method based on coupled low-rank representation to address the above limitation. Our method first obtains a low-rank representation for each view, constrained to be a linear combination of the view-specific representation and the shared representation by simultaneously encouraging the sparsity of view-specific one. Then, it uses the k-block diagonal regularizer to learn a manifold recovery matrix for each view through respective low-rank matrices to recover more manifold structures from them. In this way, the proposed method can find an ideal similarity matrix by approximating clustering projection matrices obtained from the recovery structures. Hence, this similarity matrix denotes our clustering structure with exactly k connected components by applying a rank constraint on the similarity matrix’s relaxed Laplacian matrix to avoid spectral post-processing of the low-dimensional embedding matrix. The core of our idea is such that we introduce dynamic approximation into the low-rank representation to allow the clustering structure and the shared representation to guide each other to learn cleaner low-rank matrices that would lead to a better clustering structure. Therefore, our approach is notably different from existing methods in which the local manifold structure of data is captured in advance. Extensive experiments on six benchmark datasets show that our method outperforms 10 similar state-of-the-art compared methods in six evaluation metrics.","PeriodicalId":123526,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology (TIST)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129302778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining Customers’ Changeable Electricity Consumption for Effective Load Forecasting","authors":"Etienne Gael Tajeuna, M. Bouguessa, Shengrui Wang","doi":"10.1145/3466684","DOIUrl":"https://doi.org/10.1145/3466684","url":null,"abstract":"Most existing approaches for electricity load forecasting perform the task based on overall electricity consumption. However, using such a global methodology can affect load forecasting accuracy, as it does not consider the possibility that customers’ consumption behavior may change at any time. Predicting customers’ electricity consumption in the presence of unstable behaviors poses challenges to existing models. In this article, we propose a principled approach capable of handling customers’ changeable electricity consumption. We devise a network-based method that first builds and tracks clusters of customer consumption patterns over time. Then, on the evolving clusters, we develop a framework that exploits long short-term memory recurrent neural network and survival analysis techniques to forecast electricity consumption. Our experiments on real electricity consumption datasets illustrate the suitability of the proposed approach.","PeriodicalId":123526,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology (TIST)","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115091818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Let Trajectories Speak Out the Traffic Bottlenecks","authors":"Hui Luo, Z. Bao, G. Cong, J. Culpepper, N. Khoa","doi":"10.1145/3465058","DOIUrl":"https://doi.org/10.1145/3465058","url":null,"abstract":"Traffic bottlenecks are a set of road segments that have an unacceptable level of traffic caused by a poor balance between road capacity and traffic volume. A huge volume of trajectory data which captures realtime traffic conditions in road networks provides promising new opportunities to identify the traffic bottlenecks. In this paper, we define this problem as trajectory-driven traffic bottleneck identification: Given a road network R, a trajectory database T, find a representative set of seed edges of size K of traffic bottlenecks that influence the highest number of road segments not in the seed set. We show that this problem is NP-hard and propose a framework to find the traffic bottlenecks as follows. First, a traffic spread model is defined which represents changes in traffic volume for each road segment over time. Then, the traffic diffusion probability between two connected segments and the residual ratio of traffic volume for each segment can be computed using historical trajectory data. We then propose two different algorithmic approaches to solve the problem. The first one is a best-first algorithm BF, with an approximation ratio of 1-1/e. To further accelerate the identification process in larger datasets, we also propose a sampling-based greedy algorithm SG. Finally, comprehensive experiments using three different datasets compare and contrast various solutions, and provide insights into important efficiency and effectiveness trade-offs among the respective methods.","PeriodicalId":123526,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology (TIST)","volume":"43 8","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120907289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling Complementarity in Behavior Data with Multi-Type Itemset Embedding","authors":"Daheng Wang, Qingkai Zeng, N. Chawla, Meng Jiang","doi":"10.1145/3458724","DOIUrl":"https://doi.org/10.1145/3458724","url":null,"abstract":"People are looking for complementary contexts, such as team members of complementary skills for project team building and/or reading materials of complementary knowledge for effective student learning, to make their behaviors more likely to be successful. Complementarity has been revealed by behavioral sciences as one of the most important factors in decision making. Existing computational models that learn low-dimensional context representations from behavior data have poor scalability and recent network embedding methods only focus on preserving the similarity between the contexts. In this work, we formulate a behavior entry as a set of context items and propose a novel representation learning method, Multi-type Itemset Embedding, to learn the context representations preserving the itemset structures. We propose a measurement of complementarity between context items in the embedding space. Experiments demonstrate both effectiveness and efficiency of the proposed method over the state-of-the-art methods on behavior prediction and context recommendation. We discover that the complementary contexts and similar contexts are significantly different in human behaviors.","PeriodicalId":123526,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology (TIST)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121891580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yasan Ding, Bin Guo, Y. Liu, Yunji Liang, Haocheng Shen, Zhiwen Yu
{"title":"MetaDetector: Meta Event Knowledge Transfer for Fake News Detection","authors":"Yasan Ding, Bin Guo, Y. Liu, Yunji Liang, Haocheng Shen, Zhiwen Yu","doi":"10.1145/3532851","DOIUrl":"https://doi.org/10.1145/3532851","url":null,"abstract":"The blooming of fake news on social networks has devastating impacts on society, the economy, and public security. Although numerous studies are conducted for the automatic detection of fake news, the majority tend to utilize deep neural networks to learn event-specific features for superior detection performance on specific datasets. However, the trained models heavily rely on the training datasets and are infeasible to apply to upcoming events due to the discrepancy between event distributions. Inspired by domain adaptation theories, we propose an end-to-end adversarial adaptation network, dubbed as MetaDetector, to transfer meta knowledge (event-shared features) between different events. Specifically, MetaDetector pushes the feature extractor and event discriminator to eliminate event-specific features and preserve required meta knowledge by adversarial training. Furthermore, the pseudo-event discriminator is utilized to evaluate the importance of news records in historical events to obtain partial knowledge that are discriminative for detecting fake news. Under the coordinated optimization among all the submodules, MetaDetector accurately transfers the meta knowledge of historical events to the upcoming event for fact checking. We conduct extensive experiments on two real-world datasets collected from Sina Weibo and Twitter. The experimental results demonstrate that MetaDetector outperforms the state-of-the-art methods, especially when the distribution discrepancy between events is significant.","PeriodicalId":123526,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology (TIST)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123623502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}