{"title":"How is Attention Allocated?: Data-Driven Studies of Popularity and Engagement in Online Videos","authors":"Siqi Wu","doi":"10.1145/3289600.3291599","DOIUrl":"https://doi.org/10.1145/3289600.3291599","url":null,"abstract":"The share of videos in Internet traffic has been growing, e.g., people are now spending a billion hours watching YouTube videos every day. Therefore, understanding how videos capture attention on a global scale is also of growing importance for both research and practice. In online platforms, people can interact with videos in different ways -- there are behaviors of active participation (watching, commenting, and sharing) and those of passive consumption (viewing). In this paper, we take a data-driven approach to studying how human attention is allocated in online videos with respect to both active and passive behaviors. We first investigate the active interaction behaviors by proposing a novel metric to represent the aggregate user engagement on YouTube videos. We show this metric is correlated with video quality, stable over lifetime, and predictable before a video's upload. Next, we extend the line of work on modelling video view counts by disentangling the effects of two dominant traffic sources -- related videos and YouTube search. 
Findings from this work can help content producers to create engaging videos and hosting platforms to optimize advertising strategies, recommender systems, and many more applications.","PeriodicalId":143253,"journal":{"name":"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining","volume":"2014 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130721533","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ATAR: Aspect-Based Temporal Analog Retrieval System for Document Archives","authors":"Yating Zhang, A. Jatowt, S. Bhowmick, Yuji Matsumoto","doi":"10.1145/3289600.3290613","DOIUrl":"https://doi.org/10.1145/3289600.3290613","url":null,"abstract":"In recent years, we have witnessed a rapid increase in text content stored in digital archives such as newspaper archives or web archives. With the passage of time, however, it becomes difficult to search effectively within such collections due to vocabulary and context change. In this paper, we present a system that helps to find analogical terms across temporal text collections by applying a non-linear transformation. We implement two approaches for analog retrieval, one of which also allows users to input an aspect term specifying a particular perspective on a query. The current prototype system permits temporal analog search across two different time periods based on the New York Times Annotated Corpus.","PeriodicalId":143253,"journal":{"name":"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131316862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Session 8: Counterfactual and Causal Learning","authors":"Emre Kıcıman","doi":"10.1145/3310348","DOIUrl":"https://doi.org/10.1145/3310348","url":null,"abstract":"","PeriodicalId":143253,"journal":{"name":"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121970136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Causal Inference and Counterfactual Reasoning (3hr Tutorial)","authors":"Emre Kıcıman, Amit Sharma","doi":"10.1145/3289600.3291381","DOIUrl":"https://doi.org/10.1145/3289600.3291381","url":null,"abstract":"As computing systems are more frequently and more actively intervening to improve people's work and daily lives, it is critical to correctly predict and understand the causal effects of these interventions. Conventional machine learning methods, built on pattern recognition and correlational analyses, are insufficient for causal analysis. This tutorial will introduce participants to concepts in causal inference and counterfactual reasoning, drawing from a broad literature from statistics, social sciences and machine learning. We will first motivate the use of causal inference through examples in domains such as recommender systems, social media datasets, health, education and governance. To tackle such questions, we will introduce the key ingredient that causal analysis depends on---counterfactual reasoning---and describe the two most popular frameworks based on Bayesian graphical models and potential outcomes. Based on this, we will cover a range of methods suitable for doing causal inference with large-scale online data, including randomized experiments, observational methods like matching and stratification, and natural experiment-based methods such as instrumental variables and regression discontinuity. We will also focus on best practices for evaluation and validation of causal inference techniques, drawing from our own experiences. 
After attending this tutorial, participants will understand the basics of causal inference, be able to appropriately apply the most common causal inference methods, and be able to recognize situations where more complex methods are required.","PeriodicalId":143253,"journal":{"name":"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129748205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Listwise vs Pagewise: Towards Better Ranking Strategies for Heterogeneous Search Results","authors":"Junqi Zhang","doi":"10.1145/3289600.3291596","DOIUrl":"https://doi.org/10.1145/3289600.3291596","url":null,"abstract":"As heterogeneous verticals account for a growing share of search engine results, users' preferences among search results are largely affected by how the results are presented. Apart from text, multimedia information such as images and videos has been widely adopted, as it makes search engine result pages (SERPs) more informative and attractive. It is more appropriate to regard the SERP as an information union rather than as separate search results, because the results interact with each other. Considering these changes in search engines, we plan to better exploit the contents of search results displayed on SERPs through deep neural networks and formulate the pagewise optimization of SERPs as a reinforcement learning problem.","PeriodicalId":143253,"journal":{"name":"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining","volume":"2004 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128290587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neural Tensor Factorization for Temporal Interaction Learning","authors":"X. Wu, Baoxu Shi, Yuxiao Dong, Chao Huang, N. Chawla","doi":"10.1145/3289600.3290998","DOIUrl":"https://doi.org/10.1145/3289600.3290998","url":null,"abstract":"Neural collaborative filtering (NCF) and recurrent recommender systems (RRN) have been successful in modeling relational data (user-item interactions). However, they are also limited in their assumption of static or sequential modeling of relational data as they do not account for evolving users' preference over time as well as changes in the underlying factors that drive the change in user-item relationship over time. We address these limitations by proposing a Neural network based Tensor Factorization (NTF) model for predictive tasks on dynamic relational data. The NTF model generalizes conventional tensor factorization from two perspectives: First, it leverages the long short-term memory architecture to characterize the multi-dimensional temporal interactions on relational data. Second, it incorporates the multi-layer perceptron structure for learning the non-linearities between different latent factors. 
Our extensive experiments demonstrate the significant improvement in both the rating prediction and link prediction tasks on various dynamic relational data by our NTF model over both neural network based factorization models and other traditional methods.","PeriodicalId":143253,"journal":{"name":"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128690894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discovering Correlations between Sparse Features in Distant Supervision for Relation Extraction","authors":"Jianfeng Qu, D. Ouyang, Wen Hua, Yuxin Ye, Xiaofang Zhou","doi":"10.1145/3289600.3291004","DOIUrl":"https://doi.org/10.1145/3289600.3291004","url":null,"abstract":"The recent state of the art in relation extraction is distant supervision, which generates training data by heuristically aligning a knowledge base with free text and thus avoids human labelling. However, the relation mentions concerned often use the bag-of-words representation, which ignores inner correlations between features located in different dimensions and makes relation extraction less effective. To capture the complex characteristics of relation expression and tighten the correlated features, we attempt to discover and utilise informative correlations between features in the following four phases: 1) formulating semantic similarities between lexical features using the embedding method; 2) constructing generative relation for lexical features with different sizes of side windows; 3) computing correlation scores between syntactic features through a kernel-based method; and 4) conducting a distillation process for the obtained correlated feature pairs and integrating informative pairs with existing relation extraction models. 
The extensive experiments demonstrate that our method can effectively discover correlation information and improve the performance of state-of-the-art relation extraction methods.","PeriodicalId":143253,"journal":{"name":"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130118145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Preference Elicitation Strategy for Conversational Recommender System","authors":"B. Priyogi","doi":"10.1145/3289600.3291604","DOIUrl":"https://doi.org/10.1145/3289600.3291604","url":null,"abstract":"Traditionally, recommenders have used a single-shot model based on past user actions. Conversational recommenders allow incremental elicitation of user preference through user-system dialogue. For example, the system can ask about user preference toward a feature associated with the items. In such systems, it is important to design an efficient conversation, which minimizes the number of questions asked while maximizing the preference information obtained. Therefore, this research is intended to explore possible ways to design a conversational recommender with efficient preference elicitation. Specifically, it focuses on the order of questions. An idea is also proposed to suggest answers for each question asked, which can assist users in giving their feedback.","PeriodicalId":143253,"journal":{"name":"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining","volume":"169 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132464318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Federated Online Learning to Rank with Evolution Strategies","authors":"E. Kharitonov","doi":"10.1145/3289600.3290968","DOIUrl":"https://doi.org/10.1145/3289600.3290968","url":null,"abstract":"Online Learning to Rank is a powerful paradigm that allows ranking models to be trained using only online feedback from their users. In this work, we consider the Federated Online Learning to Rank (FOLtR) setup, where on-mobile ranking models are trained in a way that respects the users' privacy. We require that the user data, such as queries, results, and their feature representations, are never communicated for the purpose of the ranker's training. We believe this setup is interesting, as it combines unique requirements for the learning algorithm: (a) preserving user privacy, (b) low communication and computation costs, (c) learning from noisy bandit feedback, and (d) learning with non-continuous ranking quality measures. We propose a learning algorithm, FOLtR-ES, that satisfies these requirements. A part of FOLtR-ES is a privatization procedure that allows it to provide ε-local differential privacy guarantees, i.e. protecting the clients from an adversary who has access to the communicated messages. This procedure can be applied to any absolute online metric that takes finitely many values or can be discretized to a finite domain. Our experimental study is based on a widely used click simulation approach and publicly available learning to rank datasets MQ2007 and MQ2008. We evaluate FOLtR-ES against offline baselines that are trained using relevance labels, linear regression model and RankingSVM. 
From our experiments, we observe that FOLtR-ES can optimize a ranking model to perform similarly to the baselines in terms of the optimized online metric, Max Reciprocal Rank.","PeriodicalId":143253,"journal":{"name":"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129180469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shaping Feedback Data in Recommender Systems with Interventions Based on Information Foraging Theory","authors":"Tobias Schnabel, Paul N. Bennett, T. Joachims","doi":"10.1145/3289600.3290974","DOIUrl":"https://doi.org/10.1145/3289600.3290974","url":null,"abstract":"Recommender systems rely heavily on the predictive accuracy of the learning algorithm. Most work on improving accuracy has focused on the learning algorithm itself. We argue that this algorithmic focus is myopic. In particular, since learning algorithms generally improve with more and better data, we propose shaping the feedback generation process as an alternate and complementary route to improving accuracy. To this effect, we explore how changes to the user interface can impact the quality and quantity of feedback data -- and therefore the learning accuracy. Motivated by information foraging theory, we study how feedback quality and quantity are influenced by interface design choices along two axes: information scent and information access cost. We present a user study of these interface factors for the common task of picking a movie to watch, showing that these factors can effectively shape and improve the implicit feedback data that is generated while maintaining the user experience.","PeriodicalId":143253,"journal":{"name":"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122622627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}