A. Chenreddy, Parshan Pakiman, Selvaprabu Nadarajah, Ranganathan Chandrasekaran, Rick Abens
{"title":"SMOILE","authors":"A. Chenreddy, Parshan Pakiman, Selvaprabu Nadarajah, Ranganathan Chandrasekaran, Rick Abens","doi":"10.1145/3292500.3330788","DOIUrl":"https://doi.org/10.1145/3292500.3330788","url":null,"abstract":"Product brands employ shopper marketing (SM) strategies to convert shoppers along the path to purchase. Traditional marketing mix models (MMMs), which leverage regression techniques and historical data, can be used to predict the component of sales lift due to SM tactics. The resulting predictive model is a critical input to plan future SM strategies. The implementation of traditional MMMs, however, requires significant ad-hoc manual intervention due to their limited flexibility in (i) explicitly capturing the temporal link between decisions; (ii) accounting for the interaction between business rules and past (sales and decision) data during the attribution of lift to SM; and (iii) ensuring that future decisions adhere to business rules. These issues necessitate MMMs with tailored structures for specific products and retailers, each requiring significant hand-engineering to achieve satisfactory performance -- a major implementation challenge. We propose an SM Optimization and Inverse Learning Engine (SMOILE) that combines optimization and inverse reinforcement learning to streamline implementation. SMOILE learns a model of lift by viewing SM tactic choice as a sequential process, leverages inverse reinforcement learning to explicitly couple sales and decision data, and employs an optimization approach to handle a wide-array of business rules. Using a unique dataset containing sales and SM spend information across retailers and products, we illustrate how SMOILE standardizes the use of data to prescribe future SM decisions. We also track an industry benchmark to showcase the importance of encoding SM lift and decision structures to mitigate spurious results when uncovering the impact of SM decisions.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"183 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127034336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient and Effective Express via Contextual Cooperative Reinforcement Learning","authors":"Yexin Li, Yu Zheng, Qiang Yang","doi":"10.1145/3292500.3330968","DOIUrl":"https://doi.org/10.1145/3292500.3330968","url":null,"abstract":"Express systems are widely deployed in many major cities. Couriers in an express system load parcels at transit station and deliver them to customers. Meanwhile, they also try to serve the pick-up requests which come stochastically in real time during the delivery process. Having brought much convenience and promoted the development of e-commerce, express systems face challenges on courier management to complete the massive number of tasks per day. Considering this problem, we propose a reinforcement learning based framework to learn a courier management policy. Firstly, we divide the city into independent regions, in each of which a constant number of couriers deliver parcels and serve requests cooperatively. Secondly, we propose a soft-label clustering algorithm named Balanced Delivery-Service Burden (BDSB) to dispatch parcels to couriers in each region. BDSB guarantees that each courier has almost even delivery and expected request-service burden when departing from transit station, giving a reasonable initialization for online management later. As pick-up requests come in real time, a Contextual Cooperative Reinforcement Learning (CCRL) model is proposed to guide where should each courier deliver and serve in each short period. Being formulated in a multi-agent way, CCRL focuses on the cooperation among couriers while also considering the system context. Experiments on real-world data from Beijing are conducted to confirm the outperformance of our model.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127092311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Seong Jae Hwang, Joonseok Lee, Balakrishnan Varadarajan, A. Gordon, Zheng Xu, A. Natsev
{"title":"Large-Scale Training Framework for Video Annotation","authors":"Seong Jae Hwang, Joonseok Lee, Balakrishnan Varadarajan, A. Gordon, Zheng Xu, A. Natsev","doi":"10.1145/3292500.3330653","DOIUrl":"https://doi.org/10.1145/3292500.3330653","url":null,"abstract":"Video is one of the richest sources of information available online but extracting deep insights from video content at internet scale is still an open problem, both in terms of depth and breadth of understanding, as well as scale. Over the last few years, the field of video understanding has made great strides due to the availability of large-scale video datasets and core advances in image, audio, and video modeling architectures. However, the state-of-the-art architectures on small scale datasets are frequently impractical to deploy at internet scale, both in terms of the ability to train such deep networks on hundreds of millions of videos, and to deploy them for inference on billions of videos. In this paper, we present a MapReduce-based training framework, which exploits both data parallelism and model parallelism to scale training of complex video models. The proposed framework uses alternating optimization and full-batch fine-tuning, and supports large Mixture-of-Experts classifiers with hundreds of thousands of mixtures, which enables a trade-off between model depth and breadth, and the ability to shift model capacity between shared (generalization) layers and per-class (specialization) layers. We demonstrate that the proposed framework is able to reach state-of-the-art performance on the largest public video datasets, YouTube-8M and Sports-1M, and can scale to 100 times larger datasets.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"2175 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130079575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yuandong Wang, Hongzhi Yin, Hongxu Chen, Tianyu Wo, Jie Xu, Kai Zheng
{"title":"Origin-Destination Matrix Prediction via Graph Convolution: a New Perspective of Passenger Demand Modeling","authors":"Yuandong Wang, Hongzhi Yin, Hongxu Chen, Tianyu Wo, Jie Xu, Kai Zheng","doi":"10.1145/3292500.3330877","DOIUrl":"https://doi.org/10.1145/3292500.3330877","url":null,"abstract":"Ride-hailing applications are becoming more and more popular for providing drivers and passengers with convenient ride services, especially in metropolises like Beijing or New York. To obtain the passengers' mobility patterns, the online platforms of ride services need to predict the number of passenger demands from one region to another in advance. We formulate this problem as an Origin-Destination Matrix Prediction (ODMP) problem. Though this problem is essential to large-scale providers of ride services for helping them make decisions and some providers have already put it forward in public, existing studies have not solved this problem well. One of the main reasons is that the ODMP problem is more challenging than the common demand prediction. Besides the number of demands in a region, it also requires the model to predict the destinations of them. In addition, data sparsity is a severe issue. To solve the problem effectively, we propose a unified model, Grid-Embedding based Multi-task Learning (GEML) which consists of two components focusing on spatial and temporal information respectively. The Grid-Embedding part is designed to model the spatial mobility patterns of passengers and neighboring relationships of different areas, the pre-weighted aggregator of which aims to sense the sparsity and range of data. The Multi-task Learning framework focuses on modeling temporal attributes and capturing several objectives of the ODMP problem. The evaluation of our model is conducted on real operational datasets from UCAR and Didi. The experimental results demonstrate the superiority of our GEML against the state-of-the-art approaches.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130177727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hao Zou, Kun Kuang, Boqi Chen, Peixuan Chen, Peng Cui
{"title":"Focused Context Balancing for Robust Offline Policy Evaluation","authors":"Hao Zou, Kun Kuang, Boqi Chen, Peixuan Chen, Peng Cui","doi":"10.1145/3292500.3330852","DOIUrl":"https://doi.org/10.1145/3292500.3330852","url":null,"abstract":"Precisely evaluating the effect of new policies (e.g. ad-placement models, recommendation functions, ranking functions) is one of the most important problems for improving interactive systems. The conventional policy evaluation methods rely on online A/B tests, but they are usually extremely expensive and may have undesirable impacts. Recently, Inverse Propensity Score (IPS) estimators are proposed as alternatives to evaluate the effect of new policy with offline logged data that was collected from a different policy in the past. They tend to remove the distribution shift induced by past policy. However, they ignore the distribution shift that would be induced by the new policy, which results in imprecise evaluation. Moreover, their performances rely on accurate estimation of propensity score, which can not be guaranteed or validated in practice. In this paper, we propose a non-parametric method, named Focused Context Balancing (FCB) algorithm, to learn sample weights for context balancing, so that the distribution shift induced by the past policy and new policy can be eliminated respectively. To validate the effectiveness of our FCB algorithm, we conduct extensive experiments on both synthetic and real world datasets. The experimental results clearly demonstrate that our FCB algorithm outperforms existing estimators by achieving more precise and robust results for offline policy evaluation.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132495902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reserve Price Failure Rate Prediction with Header Bidding in Display Advertising","authors":"Achir Kalra, Chong Wang, C. Borcea, Yi Chen","doi":"10.1145/3292500.3330729","DOIUrl":"https://doi.org/10.1145/3292500.3330729","url":null,"abstract":"The revenue of online display advertising in the U.S. is projected to be 7.9 billion U.S. dollars by 2022. One main way of display advertising is through real-time bidding (RTB). In RTB, an ad exchange runs a second price auction among multiple advertisers to sell each ad impression. Publishers usually set up a reserve price, the lowest price acceptable for an ad impression. If there are bids higher than the reserve price, then the revenue is the higher price between the reserve price and the second highest bid; otherwise, the revenue is zero. Thus, a higher reserve price can potentially increase the revenue, but with higher risks associated. In this paper, we study the problem of estimating the failure rate of a reserve price, i.e., the probability that a reserve price fails to be outbid. The solution to this problem have managerial implications to publishers to set appropriate reserve prices in order to minimizes the risks and optimize the expected revenue. This problem is highly challenging since most publishers do not know the historical highest bidding prices offered by RTB advertisers. To address this problem, we develop a parametric survival model for reserve price failure rate prediction. The model is further improved by considering user and page interactions, and header bidding information. The experimental results demonstrate the effectiveness of the proposed approach.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132981788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Disambiguation Enabled Linear Discriminant Analysis for Partial Label Dimensionality Reduction","authors":"Jing-Han Wu, Min-Ling Zhang","doi":"10.1145/3292500.3330901","DOIUrl":"https://doi.org/10.1145/3292500.3330901","url":null,"abstract":"Partial label learning is an emerging weakly-supervised learning framework where each training example is associated with multiple candidate labels among which only one is valid. Dimensionality reduction serves as an effective way to help improve the generalization ability of learning system, while the task of partial label dimensionality reduction is challenging due to the unknown ground-truth labeling information. In this paper, the first attempt towards partial label dimensionality reduction is investigated by endowing the popular linear discriminant analysis (LDA) techniques with the ability of dealing with partial label training examples. Specifically, a novel learning procedure named DELIN is proposed which alternates between LDA dimensionality reduction and candidate label disambiguation based on estimated labeling confidences over candidate labels. On one hand, the projection matrix of LDA is optimized by utilizing disambiguation-guided labeling confidences. On the other hand, the labeling confidences are disambiguated by resorting to kNN aggregation in the LDA-induced feature space. Extensive experiments on synthetic as well as real-world partial label data sets clearly validate the effectiveness of DELIN in improving the generalization ability of state-of-the-art partial label learning algorithms.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129249624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aleksander Fabijan, J. Gupchup, Somit Gupta, Jeff Omhover, Wen Qin, Lukas Vermeer, Pavel A. Dmitriev
{"title":"Diagnosing Sample Ratio Mismatch in Online Controlled Experiments: A Taxonomy and Rules of Thumb for Practitioners","authors":"Aleksander Fabijan, J. Gupchup, Somit Gupta, Jeff Omhover, Wen Qin, Lukas Vermeer, Pavel A. Dmitriev","doi":"10.1145/3292500.3330722","DOIUrl":"https://doi.org/10.1145/3292500.3330722","url":null,"abstract":"Accurately learning what delivers value to customers is difficult. Online Controlled Experiments (OCEs), aka A/B tests, are becoming a standard operating procedure in software companies to address this challenge as they can detect small causal changes in user behavior due to product modifications (e.g. new features). However, like any data analysis method, OCEs are sensitive to trustworthiness and data quality issues which, if go unaddressed or unnoticed, may result in making wrong decisions. One of the most useful indicators of a variety of data quality issues is a Sample Ratio Mismatch (SRM) ? the situation when the observed sample ratio in the experiment is different from the expected. Just like fever is a symptom for multiple types of illness, an SRM is a symptom for a variety of data quality issues. While a simple statistical check is used to detect an SRM, correctly identifying the root cause and preventing it from happening in the future is often extremely challenging and time consuming. Ignoring the SRM without knowing the root cause may result in a bad product modification appearing to be good and getting shipped to users, or vice versa. The goal of this paper is to make diagnosing, fixing, and preventing SRMs easier. Based on our experience of running OCEs in four different software companies in over 25 different products used by hundreds of millions of users worldwide, we have derived a taxonomy for different types of SRMs. We share examples, detection guidelines, and best practices for preventing SRMs of each type. We hope that the lessons and practical tips we describe in this paper will speed up SRM investigations and prevent some of them. Ultimately, this should lead to improved decision making based on trustworthy experiment analysis.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128791290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transportation","authors":"Jieping Ye","doi":"10.1145/3292500.3340406","DOIUrl":"https://doi.org/10.1145/3292500.3340406","url":null,"abstract":"As decarbonisation is becoming increasingly important, many countries have placed an emphasis on decarbonising their transportation sector through electrification to support the transition to net zero. As such, research regarding the adoption of electric vehicles has drastically increased in recent years. Mathematical modelling plays an important role in optimising a transition to electric vehicles. This article describes a systematic literature review of existing works which perform mathematical modelling of the adoption of electric motor vehicles. In this study, 53 articles containing mathematical models of electric vehicle adoption are reviewed systematically to answer 6 research questions regarding the process of modelling transitions to electric vehicles. The mathematical modelling techniques observed in existing literature are discussed, along with the main barriers to electric vehicle adoption, and future research directions are suggested.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"280 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123307424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical Multi-Task Word Embedding Learning for Synonym Prediction","authors":"Hongliang Fei, Shulong Tan, Ping Li","doi":"10.1145/3292500.3330914","DOIUrl":"https://doi.org/10.1145/3292500.3330914","url":null,"abstract":"Automatic synonym recognition is of great importance for entity-centric text mining and interpretation. Due to the high language use variability in real-life, manual construction of semantic resources to cover all synonyms is prohibitively expensive and may also result in limited coverage. Although there are public knowledge bases, they only have limited coverage for languages other than English. In this paper, we focus on medical domain and propose an automatic way to accelerate the process of medical synonymy resource development for Chinese, including both formal entities from healthcare professionals and noisy descriptions from end-users. Motivated by the success of distributed word representations, we design a multi-task model with hierarchical task relationship to learn more representative entity/term embeddings and apply them to synonym prediction. In our model, we extend the classical skip-gram word embedding model by introducing an auxiliary task \"neighboring word semantic type prediction'' and hierarchically organize them based on the task complexity. Meanwhile, we incorporate existing medical term-term synonymous knowledge into our word embedding learning framework. We demonstrate that the embeddings trained from our proposed multi-task model yield significant improvement for entity semantic relatedness evaluation, neighboring word semantic type prediction and synonym prediction compared with baselines. Furthermore, we create a large medical text corpus in Chinese that includes annotations for entities, descriptions and synonymous pairs for future research in this direction.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126419338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}