{"title":"LearnIR: WSDM 2018 Workshop on Learning from User Interactions","authors":"Rishabh Mehrotra, Ahmed Hassan Awadallah, Emine Yilmaz","doi":"10.1145/3159652.3160598","DOIUrl":"https://doi.org/10.1145/3159652.3160598","url":null,"abstract":"While users interact with online services (e.g. search engines, recommender systems, conversational agents), they leave behind fine-grained traces of interaction patterns. The ability to understand user behavior, record and interpret user interaction signals, gauge user satisfaction and incorporate user feedback gives online systems a vast treasure trove of insights for improvement and experimentation. More generally, the ability to learn from user interactions promises pathways for solving a number of problems and improving user engagement and satisfaction. Understanding and learning from user interactions involves a number of different aspects, from understanding user intent and tasks to developing user models and personalization services. A user's understanding of their need and the overall task develops as they interact with the system. Supporting the various stages of the task involves many aspects of the system, e.g. interface features, presentation of information, retrieving and ranking. Often, online systems are not specifically designed to support users in successfully accomplishing the tasks which motivated them to interact with the system in the first place. Beyond understanding user needs, learning from user interactions involves developing the right metrics and experimentation systems, understanding user interaction processes and their usage context, and designing interfaces capable of helping users. Learning from user interactions becomes more important as new ways of user interaction surface. There is a gradual shift towards searching and presenting information in a conversational form. Chatbots, personal assistants in our phones and eyes-free devices are increasingly being used for different purposes, including information retrieval and exploration. With improved speech recognition and information retrieval systems, more and more users are relying on such digital assistants to fulfill their information needs and complete their tasks. Such systems rely heavily on quickly learning from past interactions and incorporating implicit feedback signals into their models for rapid development.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123942092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Percolator: Scalable Pattern Discovery in Dynamic Graphs","authors":"Sutanay Choudhury, Sumit Purohit, Peng Lin, Yinghui Wu, L. Holder, Khushbu Agarwal","doi":"10.1145/3159652.3160589","DOIUrl":"https://doi.org/10.1145/3159652.3160589","url":null,"abstract":"We demonstrate Percolator, a distributed system for graph pattern discovery in dynamic graphs. In contrast to conventional mining systems, Percolator advocates efficient pattern mining schemes that (1) support pattern detection with keywords; (2) integrate incremental and parallel pattern mining; and (3) support analytical queries such as trend analysis. The core idea of Percolator is to dynamically decide and verify a small fraction of patterns and their instances that must be inspected in response to buffered updates in dynamic graphs, with a total mining cost independent of graph size. We demonstrate (a) the feasibility of incremental pattern mining by walking through each component of Percolator, (b) the efficiency and scalability of Percolator over the sheer size of real-world dynamic graphs, and (c) how the user-friendly GUI of Percolator interacts with users to support keyword-based queries that detect, browse and inspect trending patterns. We demonstrate how Percolator effectively supports event and trend analysis in social media streams and research publications, respectively.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122172264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IFUP: Workshop on Multi-dimensional Information Fusion for User Modeling and Personalization","authors":"Feida Zhu, Yongfeng Zhang, N. Yorke-Smith, G. Guo, Xu Chen","doi":"10.1145/3159652.3160592","DOIUrl":"https://doi.org/10.1145/3159652.3160592","url":null,"abstract":"Recommendation systems have become an important component in many real applications, ranging from e-commerce and music apps to video-sharing sites and online book stores. The key to a successful recommendation system lies in accurate user/item profiling. With the advent of Web 2.0, a large amount of multimodal information has accumulated, which provides the opportunity to profile users in a more comprehensive manner. However, directly integrating multimodal information into a recommendation system is not a trivial task, because the information may be either homogeneous or heterogeneous, which requires more advanced methods for both fusion and alignment. This workshop aims to provide a platform for discussing the challenges and corresponding innovative approaches in fusing multi-dimensional information for user modeling and recommender systems. We hope that more advanced technologies can be proposed or inspired, and that the direction of integrating different types of information can attract more attention in both academia and industry.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122656768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Power of Massive Text Data","authors":"Jiawei Han","doi":"10.1145/3159652.3160604","DOIUrl":"https://doi.org/10.1145/3159652.3160604","url":null,"abstract":"The real-world big data is largely unstructured, dynamic, and interconnected, in the form of natural language text. It is highly desirable to transform such massive unstructured data into structured knowledge. Many researchers and practitioners rely on labor-intensive labeling and curation to extract knowledge from unstructured text data. However, such approaches may not be scalable to web-scale or adaptable to new domains, especially considering that a lot of text corpora are highly dynamic and domain-specific. We argue that massive text data itself contains a large body of hidden patterns, structures, and knowledge. Equipped with domain-independent and domain-specific knowledge-bases, a promising direction is to develop more systematic data mining methods to turn massive unstructured text data into structured knowledge. We introduce a set of methods developed recently in our own group on exploration of the power of big text data, including mining quality phrases using unsupervised, weakly supervised and distantly supervised approaches, recognition and typing of entities and relations by distant supervision, meta-pattern-based entity-attribute-value extraction, set expansion and local embedding-based multi-faceted taxonomy discovery, allocation of text documents into multi-dimensional text cubes, construction of heterogeneous information networks from text cube, and eventually mining multi-dimensional structured knowledge from massive text data. We show that massive text data itself can be powerful at disclosing patterns, structures and hidden knowledge, and it is promising to explore the power of massive, interrelated text data for transforming such unstructured data into structured knowledge.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124982223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Position Bias Estimation for Unbiased Learning to Rank in Personal Search","authors":"Xuanhui Wang, Nadav Golbandi, Michael Bendersky, Donald Metzler, Marc Najork","doi":"10.1145/3159652.3159732","DOIUrl":"https://doi.org/10.1145/3159652.3159732","url":null,"abstract":"A well-known challenge in learning from click data is its inherent bias and most notably position bias. Traditional click models aim to extract the ‹query, document› relevance and the estimated bias is usually discarded after relevance is extracted. In contrast, the most recent work on unbiased learning-to-rank can effectively leverage the bias and thus focuses on estimating bias rather than relevance [20, 31]. Existing approaches use search result randomization over a small percentage of production traffic to estimate the position bias. This is not desired because result randomization can negatively impact users' search experience. In this paper, we compare different schemes for result randomization (i.e., RandTopN and RandPair) and show their negative effect in personal search. Then we study how to infer such bias from regular click data without relying on randomization. We propose a regression-based Expectation-Maximization (EM) algorithm that is based on a position bias click model and that can handle highly sparse clicks in personal search. We evaluate our EM algorithm and the extracted bias in the learning-to-rank setting. Our results show that it is promising to extract position bias from regular clicks without result randomization. The extracted bias can improve the learning-to-rank algorithms significantly. In addition, we compare the pointwise and pairwise learning-to-rank models. Our results show that pairwise models are more effective in leveraging the estimated bias.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125040301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Streaming Link Prediction on Dynamic Attributed Networks","authors":"Jundong Li, Kewei Cheng, Liang Wu, Huan Liu","doi":"10.1145/3159652.3159674","DOIUrl":"https://doi.org/10.1145/3159652.3159674","url":null,"abstract":"Link prediction aims to predict future node interactions mainly based on the current network snapshot. It is a key step in understanding the formation and evolution of the underlying networks, and has practical implications in many real-world applications, ranging from friendship recommendation and click-through prediction to targeted advertising. Most existing efforts are devoted to plain networks and assume the availability of network structure in memory before link prediction takes place. However, this assumption is untenable as many real-world networks are affiliated with rich node attributes, and often, the network structure and node attributes are both dynamically evolving at an unprecedented rate. Even though recent studies show that node attributes have an added value to network structure for accurate link prediction, it still remains a daunting task to support link prediction in an online fashion on such dynamic attributed networks. As changes in the dynamic attributed networks are often transient and can be endless, link prediction algorithms need to be efficient by making only one pass of the data with limited memory overhead. To tackle these challenges, we study a novel problem of streaming link prediction on dynamic attributed networks and present a novel framework - SLIDE. Methodologically, SLIDE maintains and updates a low-rank sketching matrix to summarize all observed data, and we further leverage the sketching matrix to infer missing links on the fly. The whole procedure is theoretically guaranteed, and empirical experiments on real-world dynamic attributed networks validate the effectiveness and efficiency of the proposed framework.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"199 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129463740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Web Search of Fashion Items with Multimodal Querying","authors":"Katrien Laenen, Susana Zoghbi, Marie-Francine Moens","doi":"10.1145/3159652.3159716","DOIUrl":"https://doi.org/10.1145/3159652.3159716","url":null,"abstract":"In this paper, we introduce a novel multimodal fashion search paradigm where e-commerce data is searched with a multimodal query composed of both an image and text. In this setting, the query image shows a fashion product that the user likes and the query text allows the user to change certain product attributes to fit the product to the user's desire. Multimodal search gives users the means to clearly express what they are looking for. This is in contrast to current e-commerce search mechanisms, which are cumbersome and often fail to grasp the customer's needs. Multimodal search requires intermodal representations of visual and textual fashion attributes which can be mixed and matched to form the user's desired product, and which have a mechanism to indicate when a visual and textual fashion attribute represent the same concept. With a neural network, we induce a common, multimodal space for visual and textual fashion attributes where their inner product measures their semantic similarity. We build a multimodal retrieval model which operates on the obtained intermodal representations and which ranks images based on their relevance to a multimodal query. We demonstrate that our model is able to retrieve images that both exhibit the necessary query image attributes and satisfy the query texts. Moreover, we show that our model substantially outperforms two state-of-the-art retrieval models adapted to multimodal fashion search.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133789217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Collaborative Filtering via Additive Ordinal Regression","authors":"Jun Hu, Ping Li","doi":"10.1145/3159652.3159723","DOIUrl":"https://doi.org/10.1145/3159652.3159723","url":null,"abstract":"Accurately predicting user preferences/ratings over items is crucial for many Internet applications, e.g., recommender systems, online advertising. In current mainstream algorithms regarding the rating prediction problem, discrete rating scores are often viewed as either numerical values or (nominal) categorical labels. Practically, viewing user rating scores as numerical values or categorical labels cannot precisely reflect the exact degree of user preferences. It is expected that for each user, the quantitative distance/scale between any pair of adjacent rating scores could be different. In this paper, we propose a new ordinal regression approach. We view ordered preference scores in an additive way, where we are able to model users' internal rating patterns. Specifically, we model and learn the quantitative distances/scales between any pair of adjacent rating scores. In this way, we can generate a mapping from users' assigned discrete rating scores to the exact magnitude/degree of user preferences for items. In the application of rating prediction, we combine our newly proposed ordinal regression method with matrix factorization, forming a new ordinal matrix factorization method. Through extensive experiments on benchmark datasets, we show that our method significantly outperforms existing ordinal methods, as well as other popular collaborative filtering methods in terms of the rating prediction accuracy.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133080150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Consistent Transformation of Ratio Metrics for Efficient Online Controlled Experiments","authors":"R. Budylin, Alexey Drutsa, I. Katsev, V. Tsoy","doi":"10.1145/3159652.3159699","DOIUrl":"https://doi.org/10.1145/3159652.3159699","url":null,"abstract":"We study ratio overall evaluation criteria (user behavior quality metrics) and, in particular, average values of non-user level metrics, that are widely used in A/B testing as an important part of modern Internet companies' evaluation instruments (e.g., abandonment rate, a user's absence time after a session). We focus on the problem of sensitivity improvement of these criteria, since there is a large gap between the variety of sensitivity improvement techniques designed for user level metrics and the variety of such techniques for ratio criteria. We propose a novel transformation of a ratio criterion to the average value of a user level (randomization-unit level, in general) metric that creates an opportunity to directly use a wide range of sensitivity improvement techniques designed for the user level that make A/B tests more efficient. We provide theoretical guarantees on the novel metric's consistency in terms of preservation of two crucial properties (directionality and significance level) w.r.t. the source ratio criteria. The experimental evaluation of the approach, done on hundreds of large-scale real A/B tests run at one of the most popular global search engines, reinforces the theoretical results and demonstrates up to +34% of sensitivity rate improvement achieved by the transformation combined with the best known regression adjustment.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133089646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Micro Behaviors: A New Perspective in E-commerce Recommender Systems","authors":"Meizi Zhou, Zhuoye Ding, Jiliang Tang, Dawei Yin","doi":"10.1145/3159652.3159671","DOIUrl":"https://doi.org/10.1145/3159652.3159671","url":null,"abstract":"The explosive popularity of e-commerce sites has reshaped users' shopping habits and an increasing number of users prefer to spend more time shopping online. This evolution allows e-commerce sites to observe rich data about users. The majority of traditional recommender systems have focused on the macro interactions between users and items, i.e., the purchase history of a customer. However, within each macro interaction between a user and an item, the user actually performs a sequence of micro behaviors, which indicate how the user locates the item, what activities the user conducts on the item (e.g., reading the comments, carting, and ordering) and how long the user stays with the item. Such micro behaviors offer a fine-grained and deep understanding of users and provide tremendous opportunities to advance recommender systems in e-commerce. However, the exploitation of micro behaviors for recommendations remains rather limited, which motivates us to investigate e-commerce recommendations from a micro-behavior perspective in this paper. Particularly, we uncover the effects of micro behaviors on recommendations and propose an interpretable Recommendation framework RIB, which inherently models the sequence of mIcro Behaviors and their effects. Experimental results on datasets from a real e-commerce site demonstrate the effectiveness of the proposed framework and the importance of micro behaviors for recommendations.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115149329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}