{"title":"Improving the Information Disclosure in Mobility-on-Demand Systems","authors":"Yue Yang, Yuan Shi, Dejian Wang, Qisheng Chen, Lei Xu, Hanqian Li, Zhouyu Fu, Xin Li, Hao Zhang","doi":"10.1145/3447548.3467062","DOIUrl":"https://doi.org/10.1145/3447548.3467062","url":null,"abstract":"Nowadays, the ubiquity of sharing economy and the booming of ride-sharing services prompt Mobility-on-Demand (MoD) platforms to explore and develop new business modes. Different from forcing full-time drivers to serve the dispatched orders, these modes usually aim to attract part-time drivers to share their vehicles and employ a 'driver-choose-order' pattern by displaying a sequence of orders to drivers as a candidate set. A key issue here is to determine which orders should be displayed to each driver. In this work, we propose a novel framework to tackle this issue, known as the Information Disclosure problem in MoD systems. The problem is solved in two steps combining estimation with optimization: 1) in the estimation step, we investigate the drivers' choice behavior and estimate the probability of choosing an order or ignoring the displayed candidate set. 2) in the optimization step, we transform the problem into determining the optimal edge configuration in a bipartite graph, then we develop a Minimal-Loss Edge Cutting (MLEC) algorithm to solve it. Through extensive experiments on both the simulation and the real-world data from Huolala business, the proposed method remarkably improves users experience and platform efficiency. 
Based on these promising results, the proposed framework has been successfully deployed in the real-world MoD system in Huolala.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130713000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-graph Multi-label Learning with Dual-granularity Labeling","authors":"Yuhai Zhao, Yejiang Wang, Zhengkui Wang, Chengqi Zhang","doi":"10.1145/3447548.3467339","DOIUrl":"https://doi.org/10.1145/3447548.3467339","url":null,"abstract":"Graphs are a powerful and versatile data structure that easily captures real life relationship. Multi-graph Multi-label learning (MGML) is a supervised learning task, which aims to learn a Multi-label classifier to label a set of objects of interest (e.g. image or text) with a bag-of-graphs representation. However, prior techniques on the MGML are developed based on transferring graphs into instances that does not fully utilize the structure information in the learning, and focus on learning the unseen labels only at the bag level. There is no existing work studying how to label the graphs within a bag that is of importance in many applications like image or text annotation. To bridge this gap, in this paper, we present a novel coarse and fine-grained Multi-graph Multi-label (cfMGML) learning framework which directly builds the learning model over the graphs and empowers the label prediction at both the coarse (aka. bag) level and fine-grained (aka. graph in each bag) level. In particular, given a set of labeled multi-graph bags, we design the scoring functions at both graph and bag levels to model the relevance between the label and data using specific graph kernels. Meanwhile, we propose a thresholding rank-loss objective function to rank the labels for the graphs and bags and minimize the hamming-loss simultaneously at one-step, which aims to address the error accumulation issue in traditional rank-loss algorithms. To tackle the non-convex optimization problem, we further develop an effective sub-gradient descent algorithm to handle high-dimensional space computation required in cfMGML. 
Experiments over various real-world datasets demonstrate that cfMGML achieves performance superior to state-of-the-art algorithms.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132575671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Graph-based Approach for Trajectory Similarity Computation in Spatial Networks","authors":"Peng Han, Jin Wang, Di Yao, Shuo Shang, Xiangliang Zhang","doi":"10.1145/3447548.3467337","DOIUrl":"https://doi.org/10.1145/3447548.3467337","url":null,"abstract":"Trajectory similarity computation is an essential operation in many applications of spatial data analysis. In this paper, we study the problem of trajectory similarity computation over spatial network, where the real distances between objects are reflected by the network distance. Unlike previous studies which learn the representation of trajectories in Euclidean space, it requires to capture not only the sequence information of the trajectory but also the structure of spatial network. To this end, we propose GTS, a brand new framework that can jointly learn both factors so as to accurately compute the similarity. It first learns the representation of each point-of-interest (POI) in the road network along with the trajectory information. This is realized by incorporating the distances between POIs and trajectory in the random walk over the spatial network as well as the loss function. Then the trajectory representation is learned by a Graph Neural Network model to identify neighboring POIs within the same trajectory, together with an LSTM model to capture the sequence information in the trajectory. We conduct comprehensive evaluation on several real world datasets. 
The experimental results demonstrate that our model substantially outperforms all existing approaches.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132746202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"New Frontiers of Multi-Network Mining: Recent Developments and Future Trend","authors":"Boxin Du, Si Zhang, Yuchen Yan, Hanghang Tong","doi":"10.1145/3447548.3470801","DOIUrl":"https://doi.org/10.1145/3447548.3470801","url":null,"abstract":"Networks (i.e., graphs) are often collected from multiple sources and platforms, such as social networks extracted from multiple online platforms, team-specific collaboration networks within an organization, and inter-dependent infrastructure networks, etc. Such networks from different sources form the multi-networks, which can exhibit the unique patterns that are invisible if we mine the individual network separately. However, compared with single-network mining, multi-network mining is still under-explored due to its unique challenges. First ( multi-network models ), networks under different circumstances can be modeled into a variety of models. How to properly build multi-network models from the complex data? Second ( multi-network mining algorithms ), it is often nontrivial to either extend single-network mining algorithms to multi-networks or design new algorithms. How to develop effective and efficient mining algorithms on multi-networks? The objectives of this tutorial are to: (1) comprehensively review the existing multi-network models, (2) elaborate the techniques in multi-network mining with a special focus on recent advances, and (3) elucidate open challenges and future research directions. 
We believe this tutorial could be beneficial to various application domains, and attract researchers and practitioners from data mining as well as other interdisciplinary fields.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133117789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Language Scaling: Applications, Challenges and Approaches","authors":"Linjun Shou, Ming Gong, J. Pei, Xiubo Geng, Xingjie Zhou, Daxin Jiang","doi":"10.1145/3447548.3470791","DOIUrl":"https://doi.org/10.1145/3447548.3470791","url":null,"abstract":"Language scaling aims to deploy Natural Language Processing (NLP) applications economically across many countries/regions with different languages. Language scaling has been heavily invested by industry since many parties want to deploy their applications/services to global markets. At the same time, scaling out NLP applications to various languages, essentially a data science problem, remains a grand challenge due to the huge differences in the morphology, syntaxes, and pragmatics among different languages. We present a comprehensive survey and tutorial on language scaling. We start with a clear problem description for language scaling and an intuitive discussion on the overall challenges. Then, we outline two major categories of approaches to language scaling, namely, model transfer and data transfer. We present a taxonomy to summarize various methods in literature. A large part of the tutorial is organized to address various types of NLP applications. Finally, we discuss several important challenges in this area and future directions.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131322174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"VisRel: Media Search at Scale","authors":"Fedor Borisyuk, Siddarth Malreddy, Jun Mei, Yiqun Liu, Xiaoyi Liu, P. Maheshwari, A. Bell, Kaushik Rangadurai","doi":"10.1145/3447548.3467081","DOIUrl":"https://doi.org/10.1145/3447548.3467081","url":null,"abstract":"In this paper, we present VisRel, a deployed large-scale media search system that leverages text understanding, media understanding, and multimodal technologies to deliver a modern multimedia search experience. We share our insight on developing image and video understanding models for content retrieval, training efficient and effective media-to-query relevance models, and refining online and offline metrics to measure the success of one of the largest media search databases in the industry. We summarize our learnings gathered from hundreds of A/B test experiments and describe the most effective technical approaches. The techniques presented in this work have contributed 34% (abs.) improvement to media-to-query relevance and 10% improvement to user engagement. We believe that this work can provide practical solutions and insights for engineers who are interested in applying media understanding technologies to empower multimedia search systems that operate at Facebook scale.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115374038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Forecasting Interaction Order on Temporal Graphs","authors":"Wenwen Xia, Yuchen Li, Jianwei Tian, Shenghong Li","doi":"10.1145/3447548.3467341","DOIUrl":"https://doi.org/10.1145/3447548.3467341","url":null,"abstract":"Link prediction is a fundamental task for graph analysis and the topic has been studied extensively for static or dynamic graphs. Essentially, the link prediction is formulated as a binary classification problem about two nodes. However, for temporal graphs, links (or interactions) among node sets appear in sequential orders. And the orders may lead to interesting applications. While a binary link prediction formulation fails to handle such an order-sensitive case. In this paper, we focus on such an interaction order prediction problem among a given node set on temporal graphs. For the technical aspect, we develop a graph neural network model named Temporal ATtention network (TAT), which utilizes the fine-grained time information on temporal graphs by encoding continuous real-valued timestamps as vectors. For each transformation layer of the model, we devise an attention mechanism to aggregate neighborhoods' information based on their representations and time encodings attached to their specific edges. We also propose a novel training scheme to address the permutation-sensitive property of the problem. 
Experiments on several real-world temporal graphs reveal that TAT outperforms some state-of-the-art graph neural networks by 55% on average under the AUC metric.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114618510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Workshop on Online and Adaptative Recommender Systems (OARS)","authors":"Xiquan Cui, Estelle Afshar, Khalifeh Al-Jadda, Srijan Kumar, Julian McAuley, Tao Ye, Kamelia Aryafar, Vachik S. Dave, Mohammad Korayem","doi":"10.1145/3447548.3469472","DOIUrl":"https://doi.org/10.1145/3447548.3469472","url":null,"abstract":"Many recommender systems deployed in the real world rely on categorical user-profiles and/or pre-calculated recommendation actions that stay static during a user session. Recent trends suggest that recommender systems should model user intent in real time and constantly adapt to meet user needs at the moment or change user behavior in situ. In addition, there have been many advances that make online and adaptive recommender systems (OARS) feasible, scalable, and more sophisticated. This workshop aims to bring together practitioners and researchers from academia and industry to discuss the challenges and approaches to implement OARS algorithms and systems and improve user experiences by better modeling and responding to user intent.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117289120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph Adversarial Attack via Rewiring","authors":"Yao Ma, Suhang Wang, Tyler Derr, Lingfei Wu, Jiliang Tang","doi":"10.1145/3447548.3467416","DOIUrl":"https://doi.org/10.1145/3447548.3467416","url":null,"abstract":"Graph Neural Networks (GNNs) have demonstrated their powerful capability in learning representations for graph-structured data. Consequently, they have enhanced the performance of many graph-related tasks such as node classification and graph classification. However, it is evident from recent studies that GNNs are vulnerable to adversarial attacks. Their performance can be largely impaired by deliberately adding carefully created unnoticeable perturbations to the graph. Existing attacking methods often produce perturbation by adding/deleting a few edges, which might be noticeable even when the number of modified edges is small. In this paper, we propose a graph rewiring operation to perform the attack. It can affect the graph in a less noticeable way compared to existing operations such as adding/deleting edges. We then utilize deep reinforcement learning to learn the strategy to effectively perform the rewiring operations. Experiments on real-world graphs demonstrate the effectiveness of the proposed framework. To understand the proposed framework, we further analyze how its generated perturbation impacts the target model and the advantages of the rewiring operations. 
The implementation of the proposed framework is available at https://github.com/alge24/ReWatt.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123485840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Causal Inference and Machine Learning in Practice with EconML and CausalML: Industrial Use Cases at Microsoft, TripAdvisor, Uber","authors":"Vasilis Syrgkanis, Greg Lewis, M. Oprescu, Maggie Hei, Keith Battocchi, Eleanor Dillon, Jing Pan, Yifeng Wu, Paul Lo, Huigang Chen, Totte Harinen, Jeong-Yoon Lee","doi":"10.1145/3447548.3470792","DOIUrl":"https://doi.org/10.1145/3447548.3470792","url":null,"abstract":"In recent years, both academic research and industry applications see an increased effort in using machine learning methods to measure granular causal effects and design optimal policies based on these causal estimates. Open source packages such as CausalML and EconML provide a unified interface for applied researchers and industry practitioners with a variety of machine learning methods for causal inference. The tutorial will cover topics including conditional treatment effect estimators by meta-learners and tree-based algorithms, model validations and sensitivity analysis, optimization algorithms including policy learner and cost optimization. In addition, the tutorial will demonstrate the production of these algorithms in industry use cases.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"571 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123925708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}