Proceedings of the Eighth ACM International Conference on Web Search and Data Mining最新文献_第7页

Learning to Recommend Related Entities to Search Users 学习向搜索用户推荐相关实体

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2685304

Bin Bi, Hao Ma, B. Hsu, Wei Chu, Kuansan Wang, Junghoo Cho

{"title":"Learning to Recommend Related Entities to Search Users","authors":"Bin Bi, Hao Ma, B. Hsu, Wei Chu, Kuansan Wang, Junghoo Cho","doi":"10.1145/2684822.2685304","DOIUrl":"https://doi.org/10.1145/2684822.2685304","url":null,"abstract":"Over the past few years, major web search engines have introduced knowledge bases to offer popular facts about people, places, and things on the entity pane next to regular search results. In addition to information about the entity searched by the user, the entity pane often provides a ranked list of related entities. To keep users engaged, it is important to develop a recommendation model that tailors the related entities to individual user interests. We propose a probabilistic Three-way Entity Model (TEM) that provides personalized recommendation of related entities using three data sources: knowledge base, search click log, and entity pane log. Specifically, TEM is capable of extracting hidden structures and capturing underlying correlations among users, main entities, and related entities. Moreover, the TEM model can also exploit the click signals derived from the entity pane log. We further provide an inference technique to learn the parameters in TEM, and propose a principled preference learning method specifically designed for ranking related entities. Extensive experiments with two real-world datasets show that TEM with our probabilistic framework significantly outperforms a state of the art baseline, confirming the effectiveness of TEM and our probabilistic framework in related entity recommendation.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121201294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 38

MergeRUCB: A Method for Large-Scale Online Ranker Evaluation MergeRUCB:一种大规模在线排名评估方法

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2685290

M. Zoghi, Shimon Whiteson, M. de Rijke

{"title":"MergeRUCB: A Method for Large-Scale Online Ranker Evaluation","authors":"M. Zoghi, Shimon Whiteson, M. de Rijke","doi":"10.1145/2684822.2685290","DOIUrl":"https://doi.org/10.1145/2684822.2685290","url":null,"abstract":"A key challenge in information retrieval is that of on-line ranker evaluation: determining which one of a finite set of rankers performs the best in expectation on the basis of user clicks on presented document lists. When the presented lists are constructed using interleaved comparison methods, which interleave lists proposed by two different candidate rankers, then the problem of minimizing the total regret accumulated while evaluating the rankers can be formalized as a K-armed dueling bandit problem. In the setting of web search, the number of rankers under consideration may be large. Scaling effectively in the presence of so many rankers is a key challenge not adequately addressed by existing algorithms. We propose a new method, which we call mergeRUCB, that uses \"localized\" comparisons to provide the first provably scalable K-armed dueling bandit algorithm. Empirical comparisons on several large learning to rank datasets show that mergeRUCB can substantially outperform the state of the art K-armed dueling bandit algorithms when many rankers must be compared. Moreover, we provide theoretical guarantees demonstrating the soundness of our algorithm.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121262301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 49

Session details: Keynote Address 3 会议详情:主题演讲

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/3251098

Jie Tang

引用次数: 0

The Information Life of Social Networks 社交网络的信息生活

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2685325

Lada A. Adamic

引用次数: 2

Automatic Gloss Finding for a Knowledge Base using Ontological Constraints 基于本体约束的知识库自动光泽查找

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2685288

Bhavana Dalvi, Einat Minkov, P. Talukdar, William W. Cohen

{"title":"Automatic Gloss Finding for a Knowledge Base using Ontological Constraints","authors":"Bhavana Dalvi, Einat Minkov, P. Talukdar, William W. Cohen","doi":"10.1145/2684822.2685288","DOIUrl":"https://doi.org/10.1145/2684822.2685288","url":null,"abstract":"While there has been much research on automatically constructing structured Knowledge Bases (KBs), most of it has focused on generating facts to populate a KB. However, a useful KB must go beyond facts. For example, glosses (short natural language definitions) have been found to be very useful in tasks such as Word Sense Disambiguation. However, the important problem of Automatic Gloss Finding, i.e., assigning glosses to entities in an initially gloss-free KB, is relatively unexplored. We address that gap in this paper. In particular, we propose GLOFIN, a hierarchical semi-supervised learning algorithm for this problem which makes effective use of limited amounts of supervision and available ontological constraints. To the best of our knowledge, GLOFIN is the first system for this task. Through extensive experiments on real-world datasets, we demonstrate GLOFIN's effectiveness. It is encouraging to see that GLOFIN outperforms other state-of-the-art SSL algorithms, especially in low supervision settings. We also demonstrate GLOFIN's robustness to noise through experiments on a wide variety of KBs, ranging from user contributed (e.g., Freebase) to automatically constructed (e.g., NELL). To facilitate further research in this area, we have made the datasets and code used in this paper publicly available.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"106 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131515910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 35

Offline Evaluation and Optimization for Interactive Systems 交互式系统的离线评估与优化

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2697040

Lihong Li

{"title":"Offline Evaluation and Optimization for Interactive Systems","authors":"Lihong Li","doi":"10.1145/2684822.2697040","DOIUrl":"https://doi.org/10.1145/2684822.2697040","url":null,"abstract":"Evaluating and optimizing an interactive system (like search engines, recommender and advertising systems) from historical data against a predefined online metric is challenging, especially when that metric is computed from user feedback such as clicks and payments. The key challenge is counterfactual in nature: we only observe a user's feedback for actions taken by the system, but we do not know what that user would have reacted to a different action. The golden standard to evaluate such metrics of a user-interacting system is online A/B experiments (a.k.a. randomized controlled experiments), which can be expensive in terms of both time and engineering resources. Offline evaluation/optimization (sometimes referred to as off-policy learning in the literature) thus becomes critical, aiming to evaluate the same metrics without running (many) expensive A/B experiments on live users. One approach to offline evaluation is to build a user model that simulates user behavior (clicks, purchases, etc.) under various contexts, and then evaluate metrics of a system with this simulator. While being straightforward and common in practice, the reliability of such model-based approaches relies heavily on how well the user model is built. Furthermore, it is often difficult to know a priori whether a user model is good enough to be trustable. Recent years have seen a growing interest in another solution to the offline evaluation problem. Using statistical techniques like importance sampling and doubly robust estimation, the approach can give unbiased estimates of metrics for a wide range of problems. It enjoys other benefits as well. For example, it often allows data scientists to obtain a confidence interval for the estimate to quantify the amount of uncertainty; it does not require building user models, so is more robust and easier to apply. All these benefits make the approach particularly attractive to a wide range of problems. Successful applications have been reported in the last few years by some of the industrial leaders. This tutorial gives a review of the basic theory and representative techniques. Applications of these techniques are illustrated through several case studies done at Microsoft and Yahoo!.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128941602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 10

Sentiment-Specific Representation Learning for Document-Level Sentiment Analysis 面向文档级情感分析的情感特定表示学习

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2697035

Duyu Tang

{"title":"Sentiment-Specific Representation Learning for Document-Level Sentiment Analysis","authors":"Duyu Tang","doi":"10.1145/2684822.2697035","DOIUrl":"https://doi.org/10.1145/2684822.2697035","url":null,"abstract":"In this paper, we propose a representation learning research framework for document-level sentiment analysis. Given a document as the input, document-level sentiment analysis aims to automatically classify its sentiment/opinion (such as thumbs up or thumbs down) based on the textural information. Despite the success of feature engineering in many previous studies, the hand-coded features do not well capture the semantics of texts. In this research, we argue that learning sentiment-specific semantic representations of documents is crucial for document-level sentiment analysis. We decompose the document semantics into four cascaded constitutes: (1) word representation, (2) sentence structure, (3) sentence composition and (4) document composition. Specifically, we learn sentiment-specific word representations, which simultaneously encode the contexts of words and the sentiment supervisions of texts into the continuous representation space. According to the principle of compositionality, we learn sentiment-specific sentence structures and sentence-level composition functions to produce the representation of each sentence based on the representations of the words it contains. The semantic representations of documents are obtained through document composition, which leverages the sentiment-sensitive discourse relations and sentence representations.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"47 43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132307422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 54

HIA'15: Heterogeneous Information Access Workshop at WSDM 2015 HIA'15:异构信息访问研讨会，在WSDM 2015

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2697029

K. Zhou, Roger Jie Luo, D. Hiemstra, J. Jose

引用次数: 0

On Tag Recommendation for Expertise Profiling: A Case Study in the Scientific Domain 专家分析的标签推荐:科学领域的案例研究

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2685320

Isac S. Ribeiro, Rodrygo L. T. Santos, Marcos André Gonçalves, Alberto H. F. Laender

{"title":"On Tag Recommendation for Expertise Profiling: A Case Study in the Scientific Domain","authors":"Isac S. Ribeiro, Rodrygo L. T. Santos, Marcos André Gonçalves, Alberto H. F. Laender","doi":"10.1145/2684822.2685320","DOIUrl":"https://doi.org/10.1145/2684822.2685320","url":null,"abstract":"Building expertise profiles is a crucial step towards identifying experts in different knowledge areas. However, summarizing the topics of expertise of a given individual is a challenging task, primarily due to the semi-structured and heterogeneous nature of the documentary evidence available for this task. In this paper, we investigate the suitability of tag recommendation as a mechanism to produce effective expertise profiles. In particular, we perform a large-scale user study with academic experts from different knowledge areas to assess the effectiveness of multiple supervised and unsupervised tag recommendation approaches as well as multiple sources of textual evidence. Our analysis reveals that traditional content-based tag recommenders perform well at identifying expertise-oriented tags, with article keywords being a particularly effective source of evidence across profiles in different knowledge areas and with various levels of sparsity. Moreover, by combining multiple recommenders and sources of evidence as learning signals, we further demonstrate the effectiveness of tag recommendation for expertise profiling.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125356490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 24

Session details: Session 1: Practice & Experience Talk + Panel on Large Scale Data Understanding 会议详情:第一部分:实践与经验讲座+大规模数据理解小组讨论

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/3251091

Ying Li, A. Broder

引用次数: 0