Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining: Latest Papers, Page 3

Deep Natural Language Processing for Search and Recommender Systems

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3332290

Weiwei Guo, Huiji Gao, Jun Shi, Bo Long, Liang Zhang, Bee-Chung Chen, D. Agarwal

Citations: 17

Testing Dynamic Incentive Compatibility in Display Ad Auctions

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3330943

Yuan Deng, Sébastien Lahaie

Abstract: The question of transparency has become a key point of contention between buyers and sellers of display advertising space: ads are allocated via complex, black-box auction systems whose mechanics can be difficult to model let alone optimize against. Motivated by this concern, this paper takes the perspective of a single advertiser and develops statistical tests to confirm whether an underlying auction mechanism is dynamically incentive compatible (IC), so that truthful bidding in each individual auction and across time is an optimal strategy. The most general notion of dynamic-IC presumes that the seller knows how buyers discount future surplus, which is questionable in practice. We characterize dynamic mechanisms that are dynamic-IC for all possible discounting factors according to two intuitive conditions: the mechanism should be IC at each stage in the usual sense, and expected present utility (under truthful bidding) should be independent of past bids. The conditions motivate two separate experiments based on bid perturbations that can be run simultaneously on the same impression traffic. We provide a novel statistical test of stage-IC along with a test for utility-independence that can detect lags in how the seller uses past bid information. We evaluate our tests on display ad data from a major ad exchange and show how they can accurately uncover evidence of first- or second-price auctions coupled with dynamic reserve prices, among other types of dynamic mechanisms.

Citations: 9

Internal Promotion Optimization

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3330715

Rupesh Gupta, Guangde Chen, Shipeng Yu

Abstract: Most large Internet companies run internal promotions to cross-promote their different products and/or to educate members on how to obtain additional value from the products that they already use. This in turn drives engagement and/or revenue for the company. However, since these internal promotions can distract a member away from the product or page where these are shown, there is a non-zero cannibalization loss incurred for showing these internal promotions. This loss has to be carefully weighed against the gain from showing internal promotions. This can be a complex problem if different internal promotions optimize for different objectives. In that case, it is difficult to compare not just the gain from a conversion through an internal promotion against the loss incurred for showing that internal promotion, but also the gains from conversions through different internal promotions. Hence, we need a principled approach for deciding which internal promotion (if any) to serve to a member in each opportunity to serve an internal promotion. This approach should optimize not just for the net gain to the company, but also for the member's experience. In this paper, we discuss our approach for optimization of internal promotions at LinkedIn. In particular, we present a cost-benefit analysis of showing internal promotions, our formulation of internal promotion optimization as a constrained optimization problem, the architecture of the system for solving the optimization problem and serving internal promotions in real-time, and experimental results from online A/B tests.

Citations: 2

TrajGuard

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3330685

Zheyi Pan, J. Bao, Weinan Zhang, Yŏng-ik Yu, Yu Zheng

Abstract: Trajectory data has been widely used in many urban applications. Sharing trajectory data with effective supervision is a vital task, as it contains private information of moving objects. However, malicious data users can modify trajectories in various ways to avoid data distribution tracking by the hashing-based data signatures, e.g., MD5. Moreover, the existing trajectory data protection scheme can only protect trajectories from either spatial or temporal modifications. Finally, so far there is no authoritative third party for trajectory data sharing process, as trajectory data is too sensitive. To this end, we propose a novel trajectory copyright protection scheme, which can protect trajectory data from comprehensive types of data modifications/attacks. Three main techniques are employed to effectively guarantee the robustness and comprehensiveness of the proposed data sharing scheme: 1) the identity information is embedded distributively across a set of sub-trajectories partitioned based on the spatio-temporal regions; 2) the centroid distance of the sub-trajectories is served as a stable trajectory attribute to embed the information; and 3) the blockchain technique is used as a trusted third party to log all data transaction history for data distribution tracking in a decentralized manner. Extensive experiments were conducted based on two real-world trajectory datasets to demonstrate the effectiveness of our proposed scheme.

Citations: 1

Smart Roles: Inferring Professional Roles in Email Networks

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3330735

Di Jin, Mark Heimann, Tara Safavi, Mengdi Wang, Wei Lee, Lindsay Snider, Danai Koutra

Abstract: Email is ubiquitous in the workplace. Naturally, machine learning models that make third-party email clients "smarter" can dramatically impact employees' productivity and efficiency. Motivated by this potential, we study the task of professional role inference from email data, which is crucial for email prioritization and contact recommendation systems. The central question we address is: Given limited data about employees, as is common in third-party email applications, can we infer where in the organizational hierarchy these employees belong based on their email behavior? Toward our goal, in this paper we study professional role inference on a unique new email dataset comprising billions of email exchanges across thousands of organizations. Taking a network approach in which nodes are employees and edges represent email communication, we propose EMBER, or EMBedding Email-based Roles, which finds email-centric embeddings of network nodes to be used in professional role inference tasks. EMBER automatically captures behavioral similarity between employees in the email network, leading to embeddings that naturally distinguish employees of different hierarchical roles. EMBER often outperforms the state-of-the-art by 2-20% in role inference accuracy and 2.5-344x in speed. We also use EMBER with our unique dataset to study how inferred professional roles compare between organizations of different sizes and sectors, gaining new insights into organizational hierarchy.

Citations: 16

Pythia: AI-assisted Code Completion System

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3330699

Alexey Svyatkovskiy, Ying Zhao, Shengyu Fu, Neel Sundaresan

Citations: 109

E.T.-RNN: Applying Deep Learning to Credit Loan Applications

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3330693

Dmitrii Babaev, M. Savchenko, A. Tuzhilin, Dmitrii Umerenkov

Citations: 69

AI for Small Businesses and Consumers: Applications and Innovations

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3340398

Ashok Srivastava

Citations: 0

Co-Prediction of Multiple Transportation Demands Based on Deep Spatio-Temporal Neural Network

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3330887

Junchen Ye, Leilei Sun, Bowen Du, Yanjie Fu, Xinran Tong, Hui Xiong

Abstract: Taxi and sharing bike bring great convenience to urban transportation. A lot of efforts have been made to improve the efficiency of taxi service or bike sharing system by predicting the next-period pick-up or drop-off demand. Different from the existing research, this paper is motivated by the following two facts: 1) From a micro view, an observed spatial demand at any time slot could be decomposed as a combination of many hidden spatial demand bases; 2) From a macro view, the multiple transportation demands are strongly correlated with each other, both spatially and temporally. Definitely, the above two views have great potential to revolutionize the existing taxi or bike demand prediction methods. Along this line, this paper provides a novel Co-prediction method based on Spatio-Temporal neural Network, namely, CoST-Net. In particular, a deep convolutional neural network is constructed to decompose a spatial demand into a combination of hidden spatial demand bases. The combination weight vector is used as a representation of the decomposed spatial demand. Then, a heterogeneous Long Short-Term Memory (LSTM) is proposed to integrate the states of multiple transportation demands, and also model the dynamics of them mixedly. Last, the environmental features such as humidity and temperature are incorporated with the achieved overall hidden states to predict the multiple demands simultaneously. Experiments have been conducted on real-world taxi and sharing bike demand data, results demonstrate the superiority of the proposed method over both classical and the state-of-the-art transportation demand prediction methods.

Citations: 91

Do Simpler Models Exist and How Can We Find Them?

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3330823

C. Rudin

Abstract: While the trend in machine learning has tended towards more complex hypothesis spaces, it is not clear that this extra complexity is always necessary or helpful for many domains. In particular, models and their predictions are often made easier to understand by adding interpretability constraints. These constraints shrink the hypothesis space; that is, they make the model simpler. Statistical learning theory suggests that generalization may be improved as a result as well. However, adding extra constraints can make optimization (exponentially) harder. For instance it is much easier in practice to create an accurate neural network than an accurate and sparse decision tree. We address the following question: Can we show that a simple-but-accurate machine learning model might exist for our problem, before actually finding it? If the answer is promising, it would then be worthwhile to solve the harder constrained optimization problem to find such a model. In this talk, I present an easy calculation to check for the possibility of a simpler model. This calculation indicates that simpler-but-accurate models do exist in practice more often than you might think. I then briefly overview several new methods for interpretable machine learning. These methods are for (i) sparse optimal decision trees, (ii) sparse linear integer models (also called medical scoring systems), and (iii) interpretable case-based reasoning in deep neural networks for computer vision.

Citations: 5