Tara Safavi, Adam Fourney, Robert B Sim, Marcin Juraszek, Shane Williams, Ned Friend, Danai Koutra, Paul N. Bennett
{"title":"Toward Activity Discovery in the Personal Web","authors":"Tara Safavi, Adam Fourney, Robert B Sim, Marcin Juraszek, Shane Williams, Ned Friend, Danai Koutra, Paul N. Bennett","doi":"10.1145/3336191.3371828","DOIUrl":"https://doi.org/10.1145/3336191.3371828","url":null,"abstract":"Individuals' personal information collections (their emails, files, appointments, web searches, contacts, etc) offer a wealth of insights into the organization and structure of their everyday lives. In this paper we address the task of learning representations of personal information items to capture individuals' ongoing activities, such as projects and tasks: Such representations can be used in activity-centric applications like personal assistants, email clients, and productivity tools to help people better manage their data and time. We propose a graph-based approach that leverages the inherent interconnected structure of personal information collections, and derive efficient, exact techniques to incrementally update representations as new data arrive. We demonstrate the strengths of our graph-based representations against competitive baselines in a novel intrinsic rating task and an extrinsic recommendation task.","PeriodicalId":319008,"journal":{"name":"Proceedings of the 13th International Conference on Web Search and Data Mining","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123121542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zeyu Li, Wei Cheng, Yang Chen, Haifeng Chen, Wei Wang
{"title":"Interpretable Click-Through Rate Prediction through Hierarchical Attention","authors":"Zeyu Li, Wei Cheng, Yang Chen, Haifeng Chen, Wei Wang","doi":"10.1145/3336191.3371785","DOIUrl":"https://doi.org/10.1145/3336191.3371785","url":null,"abstract":"Click-through rate (CTR) prediction is a critical task in online advertising and marketing. For this problem, existing approaches, with shallow or deep architectures, have three major drawbacks. First, they typically lack persuasive rationales to explain the outcomes of the models. Unexplainable predictions and recommendations may be difficult to validate and thus unreliable and untrustworthy. In many applications, inappropriate suggestions may even bring severe consequences. Second, existing approaches have poor efficiency in analyzing high-order feature interactions. Third, the polysemy of feature interactions in different semantic subspaces is largely ignored. In this paper, we propose InterHAt that employs a Transformer with multi-head self-attention for feature learning. On top of that, hierarchical attention layers are utilized for predicting CTR while simultaneously providing interpretable insights of the prediction results. InterHAt captures high-order feature interactions by an efficient attentional aggregation strategy with low computational complexity. 
Extensive experiments on four public real datasets and one synthetic dataset demonstrate the effectiveness and efficiency of InterHAt.","PeriodicalId":319008,"journal":{"name":"Proceedings of the 13th International Conference on Web Search and Data Mining","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126044474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Debiasing Word Embeddings from Sentiment Associations in Names","authors":"C. Hube, Maximilian Idahl, B. Fetahu","doi":"10.1145/3336191.3371779","DOIUrl":"https://doi.org/10.1145/3336191.3371779","url":null,"abstract":"Word embeddings, trained through models like skip-gram, have been shown to be prone to capturing biases from the training corpus, e.g. gender bias. Such biases are unwanted as they spill into downstream tasks, thus leading to discriminatory behavior. In this work, we address the problem of prior sentiment associated with names in word embeddings where for a given name representation (e.g. \"Smith\"), a sentiment classifier will categorize it as either positive or negative. We propose DebiasEmb, a skip-gram based word embedding approach that, for a given oracle sentiment classification model, will debias the name representations, such that they cannot be associated with either positive or negative sentiment. Evaluation on standard word embedding benchmarks and a downstream analysis show that our approach is able to maintain a high quality of embeddings and at the same time mitigate sentiment bias in name embeddings.","PeriodicalId":319008,"journal":{"name":"Proceedings of the 13th International Conference on Web Search and Data Mining","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124726550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parameter Tuning in Personal Search Systems","authors":"S. Chen, Xuanhui Wang, Zhen Qin, Donald Metzler","doi":"10.1145/3336191.3371820","DOIUrl":"https://doi.org/10.1145/3336191.3371820","url":null,"abstract":"Retrieval effectiveness in information retrieval systems is heavily dependent on how various parameters are tuned. One option to find these parameters is to run multiple online experiments and use a parameter sweep approach to optimize the search system. There are multiple downsides to this approach, mainly that it may lead to a poor experience for users. Another option is to do offline evaluation, which can act as a safeguard against potential quality issues. Offline evaluation requires a validation set of data that can be benchmarked against different parameter settings. However, for search over personal corpora, e.g. email and file search, it is impractical and often impossible to get a complete representative validation set, due to the inability to save raw queries and document information. In this work, we show how to do offline parameter tuning with only a partial validation set. In addition, we demonstrate how to do parameter tuning in the cases when we have complete knowledge of the internal implementation of the search system (white-box tuning), as well as the case where we have only partial knowledge (grey-box tuning). 
This has allowed us to do offline parameter tuning in a privacy-sensitive manner.","PeriodicalId":319008,"journal":{"name":"Proceedings of the 13th International Conference on Web Search and Data Mining","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128874069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adversarial Machine Learning in Recommender Systems (AML-RecSys)","authors":"Yashar Deldjoo, T. D. Noia, Felice Antonio Merra","doi":"10.1145/3336191.3371877","DOIUrl":"https://doi.org/10.1145/3336191.3371877","url":null,"abstract":"Recommender systems (RS) are an integral part of many online services aiming to provide an enhanced user-oriented experience. Machine learning (ML) models are nowadays broadly adopted in modern state-of-the-art approaches to recommendation, which are typically trained to maximize a user-centred utility (e.g., user satisfaction) or a business-oriented one (e.g., profitability or sales increase). They work under the main assumption that users' historical feedback can serve as proper ground-truth for model training and evaluation. However, driven by the success in the ML community, recent advances show that state-of-the-art recommendation approaches such as matrix factorization (MF) models or the ones based on deep neural networks can be vulnerable to adversarial perturbations applied on the input data. These adversarial samples can impede the ability for training high-quality MF models and can put the driven success of these approaches at high risk. As a result, there is a new paradigm of secure training for RS that takes into account the presence of adversarial samples into the recommendation process. We present adversarial machine learning in Recommender Systems (AML-RecSys), which concerns the study of effective ML techniques in RS to fight against an adversarial component. AML-RecSys has been proposed in two main fashions within the RS literature: (i) adversarial regularization, which attempts to combat against adversarial perturbation added to input data or model parameters of a RS and, (ii) generative adversarial network (GAN)-based models, which adopt a generative process to train powerful ML models. 
We discuss a theoretical framework to unify the two above models, which is performed via a minimax game between an adversarial component and a discriminator. Furthermore, we explore various examples illustrating the successful application of AML to solve various RS tasks. Finally, we present a global taxonomy/overview of the academic literature based on several identified dimensions, namely (i) research goals and challenges, (ii) application domains and (iii) technical overview.","PeriodicalId":319008,"journal":{"name":"Proceedings of the 13th International Conference on Web Search and Data Mining","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128934142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Feng Liu, Huifeng Guo, Xutao Li, Ruiming Tang, Yunming Ye, Xiuqiang He
{"title":"End-to-End Deep Reinforcement Learning based Recommendation with Supervised Embedding","authors":"Feng Liu, Huifeng Guo, Xutao Li, Ruiming Tang, Yunming Ye, Xiuqiang He","doi":"10.1145/3336191.3371858","DOIUrl":"https://doi.org/10.1145/3336191.3371858","url":null,"abstract":"The research of reinforcement learning (RL) based recommendation method has become a hot topic in recommendation community, due to the recent advance in interactive recommender systems. The existing RL recommendation approaches can be summarized into a unified framework with three components, namely embedding component (EC), state representation component (SRC) and policy component (PC). We find that EC cannot be nicely trained with the other two components simultaneously. Previous studies bypass the obstacle through a pre-training and fixing strategy, which makes their approaches unlike a real end-to-end fashion. More importantly, such pre-trained and fixed EC suffers from two inherent drawbacks: (1) Pre-trained and fixed embeddings are unable to model evolving preference of users and item correlations in the dynamic environment; (2) Pre-training is inconvenient in the industrial applications. To address the problem, in this paper, we propose an End-to-end Deep Reinforcement learning based Recommendation framework (EDRR). In this framework, a supervised learning signal is carefully designed for smoothing the update gradients to EC, and three incorporating ways are introduced and compared. To the best of our knowledge, we are the first to address the training compatibility between the three components in RL based recommendations. 
Extensive experiments are conducted on three real-world datasets, and the results demonstrate the proposed EDRR effectively achieves the end-to-end training purpose for both policy-based and value-based RL models, and delivers better performance than state-of-the-art methods.","PeriodicalId":319008,"journal":{"name":"Proceedings of the 13th International Conference on Web Search and Data Mining","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124216403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Natalia Silberstein, O. Somekh, Yair Koren, M. Aharon, Dror Porat, Avi Shahar, Tingyi Wu
{"title":"Ad Close Mitigation for Improved User Experience in Native Advertisements","authors":"Natalia Silberstein, O. Somekh, Yair Koren, M. Aharon, Dror Porat, Avi Shahar, Tingyi Wu","doi":"10.1145/3336191.3371798","DOIUrl":"https://doi.org/10.1145/3336191.3371798","url":null,"abstract":"Verizon Media native advertising (also known as Yahoo Gemini native) serves billions of ad impressions daily, reaching several hundreds of millions USD in revenue yearly. Although we strive to provide the best experience for our users, there will always be some users that dislike our ads in certain cases. To address these situations Gemini native platform provides an ad close mechanism that enables users to close ads that they dislike and also to provide a reasoning for their action. Surprisingly, users do care about their ad experience and their engagement with the ad close mechanism is quite significant. While the ad close rate (ACR) is lower than the click through rate (CTR), they are of the same order of magnitude, especially on Yahoo mail properties. Since ad close events indicate bad user experience caused mostly by poor ad quality, we would like to exploit the ad close signals to improve user experience and reduce the number of ad close events while maintaining a predefined total revenue loss. In this work we present our ad close mitigation (ACM) solution that penalizes ads with high closing likelihood, in our auctions. In particular, we use the ad close signal and other available features to predict the probability of an ad close event, and calculate the expected loss due to such event for using the true expected revenue in the auction. We show that this approach fundamentally changes the generalized second price (GSP) auction and provides incentive for advertisers to improve their ads' quality. Our solution was tested in both offline and large scale online settings, serving real Gemini native traffic. 
Results of the online experiment show that we are able to reduce the number of ad close events by more than 20%, while decreasing revenue by less than 0.4%. In addition, we present a large scale analysis of the ad close signal that supports various design decisions and sheds light on ways the ad close mechanism affects different crowds.","PeriodicalId":319008,"journal":{"name":"Proceedings of the 13th International Conference on Web Search and Data Mining","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131009432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
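The ad close mitigation idea in the record above (penalizing ads with a high closing likelihood by folding an expected loss into the auction's expected revenue) can be illustrated with a small sketch. This is not the authors' implementation: the `Ad` fields, the `close_cost` parameter, and the linear form of the penalty are all assumptions made for illustration, since the abstract does not give the exact formula.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Ad:
    name: str
    bid: float      # advertiser bid per click (hypothetical units)
    p_click: float  # predicted click probability
    p_close: float  # predicted ad close probability


def adjusted_score(ad: Ad, close_cost: float) -> float:
    """Expected revenue minus an assumed expected loss from an ad close event."""
    return ad.bid * ad.p_click - close_cost * ad.p_close


def rank_ads(ads: List[Ad], close_cost: float) -> List[Ad]:
    """Rank ads by penalized expected revenue, highest first."""
    return sorted(ads, key=lambda ad: adjusted_score(ad, close_cost), reverse=True)


ads = [
    Ad("often_closed", bid=1.0, p_click=0.010, p_close=0.008),
    Ad("rarely_closed", bid=0.9, p_click=0.010, p_close=0.001),
]
plain = rank_ads(ads, close_cost=0.0)      # no penalty: pure expected revenue
penalized = rank_ads(ads, close_cost=0.5)  # penalty applied
```

With no penalty the higher bid wins; with the assumed penalty the ad that is closed less often moves to the top, which is the kind of reordering the abstract describes.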
{"title":"Impact of Online Job Search and Job Reviews on Job Decision","authors":"Faiz Ahamad","doi":"10.1145/3336191.3372184","DOIUrl":"https://doi.org/10.1145/3336191.3372184","url":null,"abstract":"Online platforms such as LinkedIn or specialized platforms such as Glassdoor are widely used by job seekers before applying for a job. These web platforms have ratings and reviews about employers and jobs. Hence, job seekers search online for the employer before applying for the job. They try to find out whether the employer and job are good for them and what the pros and cons of working there are. Therefore, these reviews and ratings have an impact on job seekers' decisions, as they portray the pros and cons of working in a particular firm. Hence, the main objective of this study is to find how job seekers search for online employer reviews and the impact of these reviews on employer attractiveness and job pursuit intention. The other objective is to find the most crucial job factors that are given priority by the employee. For this, the study is proposed to be conducted in two stages: first, collecting data from the website Glassdoor, which has reviews of 600000 companies; second, conducting an experimental study to examine the influence of job attributes (high vs. low) and employer rating (high vs. 
low) on job choice and employer attractiveness.","PeriodicalId":319008,"journal":{"name":"Proceedings of the 13th International Conference on Web Search and Data Mining","volume":"143 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128397712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast Item Ranking under Neural Network based Measures","authors":"Shulong Tan, Zhixin Zhou, Zhao-Ying Xu, Ping Li","doi":"10.1145/3336191.3371830","DOIUrl":"https://doi.org/10.1145/3336191.3371830","url":null,"abstract":"Recently, plenty of neural network based recommendation models have demonstrated their strength in modeling complicated relationships between heterogeneous objects (i.e., users and items). However, the applications of these fine trained recommendation models are limited to the off-line manner or the re-ranking procedure (on a pre-filtered small subset of items), due to their time-consuming computations. Fast item ranking under learned neural network based ranking measures is largely still an open question. In this paper, we formulate ranking under neural network based measures as a generic ranking task, Optimal Binary Function Search (OBFS), which does not make strong assumptions for the ranking measures. We first analyze limitations of existing fast ranking methods (e.g., ANN search) and explain why they are not applicable for OBFS. Further, we propose a flexible graph-based solution for it, Binary Function Search on Graph (BFSG). It can achieve approximate optimal efficiently, with accessible conditions. 
Experiments demonstrate effectiveness and efficiency of the proposed method, in fast item ranking under typical neural network based measures.","PeriodicalId":319008,"journal":{"name":"Proceedings of the 13th International Conference on Web Search and Data Mining","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127942032","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Time Interval Aware Self-Attention for Sequential Recommendation","authors":"Jiacheng Li, Yujie Wang, Julian McAuley","doi":"10.1145/3336191.3371786","DOIUrl":"https://doi.org/10.1145/3336191.3371786","url":null,"abstract":"Sequential recommender systems seek to exploit the order of users' interactions, in order to predict their next action based on the context of what they have done recently. Traditionally, Markov Chains (MCs), and more recently Recurrent Neural Networks (RNNs) and Self Attention (SA) have proliferated due to their ability to capture the dynamics of sequential patterns. However, a simplifying assumption made by most of these models is to regard interaction histories as ordered sequences, without regard for the time intervals between interactions (i.e., they model the time-order but not the actual timestamp). In this paper, we seek to explicitly model the timestamps of interactions within a sequential modeling framework to explore the influence of different time intervals on next item prediction. We propose TiSASRec (Time Interval aware Self-attention based sequential recommendation), which models both the absolute positions of items as well as the time intervals between them in a sequence. Extensive empirical studies show the features of TiSASRec under different settings and compare the performance of self-attention with different positional encodings. 
Furthermore, experimental results show that our method outperforms various state-of-the-art sequential models on both sparse and dense datasets and different evaluation metrics.","PeriodicalId":319008,"journal":{"name":"Proceedings of the 13th International Conference on Web Search and Data Mining","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128274823","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
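The record above distinguishes the order of interactions from the time intervals between them. As a rough illustration of the second kind of input, the sketch below builds a pairwise interval matrix from raw timestamps. The clipping threshold `max_interval` and the function name are assumptions for this example; the abstract does not specify how TiSASRec encodes intervals.

```python
from typing import List


def interval_matrix(timestamps: List[int], max_interval: int) -> List[List[int]]:
    """Pairwise absolute time gaps between interactions, clipped at max_interval.

    Entry [i][j] is how far apart interactions i and j are in time, which is
    information that position indices (0, 1, 2, ...) alone do not carry.
    """
    return [
        [min(abs(a - b), max_interval) for b in timestamps]
        for a in timestamps
    ]


# Four interactions: three close together, one much later (hypothetical data).
timestamps = [0, 10, 15, 100]
matrix = interval_matrix(timestamps, max_interval=50)
```

Two sequences with the same item order but different timestamps get identical position indices yet different interval matrices, which is the distinction the abstract draws.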