Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining: Latest Articles

Improving Session Search by Modeling Multi-Granularity Historical Query Change

Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining Pub Date: 2022-02-11 DOI: 10.1145/3488560.3498415

Xiaochen Zuo, Zhicheng Dou, Ji-rong Wen

Abstract: In session search, it's important to utilize historical interactions between users and the search engines to improve document retrieval. However, not all historical information contributes to document ranking. Users often express their preferences in the process of modifying the previous query, which can help us catch useful information in the historical interactions. Inspired by it, we propose to model historical query change to improve document ranking performance. Especially, we characterize multi-granularity query change between each pair of adjacent queries at both term level and semantic level. For term level query change, we calculate three types of term weights, including the retained term weights, added term weights and removed term weights. Then we perform term-based interaction between the candidate document and historical queries based on the term weights. For semantic level query change, we calculate an overall representation of user intent by integrating the representations of each historical query obtained by different types of term weights. Then we adopt representation-based matching between this representation and the candidate document. To improve the effect of query change modeling, we introduce query change classification as an auxiliary task. Experimental results on AOL and TianGong-ST search logs show that our model outperforms most existing models for session search.

Citations: 7
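
The retained, added, and removed term categories named in the session-search abstract above can be illustrated with a small sketch. This is not the authors' model: the paper computes learned term weights, whereas the hypothetical helper `term_level_change` below only partitions terms by simple set membership, assuming whitespace tokenization and lowercasing. The example queries are invented for illustration.

```python
def term_level_change(prev_query: str, curr_query: str) -> dict[str, list[str]]:
    """Partition terms of two adjacent queries into retained/added/removed.

    Assumption: whitespace tokenization and lowercasing stand in for a real
    tokenizer; the paper's learned term weights are not modeled here.
    """
    prev_terms = prev_query.lower().split()
    curr_terms = curr_query.lower().split()
    prev_set, curr_set = set(prev_terms), set(curr_terms)
    return {
        # Terms the user kept from the previous query.
        "retained": [t for t in curr_terms if t in prev_set],
        # Terms that appear only in the reformulated query.
        "added": [t for t in curr_terms if t not in prev_set],
        # Terms the user dropped when reformulating.
        "removed": [t for t in prev_terms if t not in curr_set],
    }


change = term_level_change("cheap flights paris", "cheap hotels paris")
print(change)
# {'retained': ['cheap', 'paris'], 'added': ['hotels'], 'removed': ['flights']}
```

In a ranking model, each of the three lists would presumably receive its own weighting before interacting with the candidate document; the set partition here only shows which terms fall into which category.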

Exploration in Recommender Systems

Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining Pub Date: 2022-02-11 DOI: 10.1145/3488560.3510009

Minmin Chen

Abstract: In the era of increasing choices, recommender systems are becoming indispensable in helping users navigate the million or billion pieces of content on recommendation platforms. Most of the recommender systems are powered by ML models trained on a large amount of user-item interaction data. Such a setup however induces a strong feedback loop that creates the rich gets richer phenomenon where head contents are getting more and more exposure while tail and fresh contents are not discovered. At the same time, it pigeonholes users to contents they are already familiar with. We believe exploration is key to break away from the feedback loop and to optimize long term user experience on recommendation platforms. The exploration-exploitation tradeoff, being the foundation of bandits and RL research, has been extensively studied in RL. While effective exploration is believed to positively influence the user experience on the platform, the exact value of exploration in recommender systems has not been well established. In this talk, we examine the roles of exploration in recommender systems in three facets: 1) system exploration to surface fresh/tail recommendations based on users' known interests; 2) user exploration to identify unknown user interests or introduce users to new interests; and 3) online exploration to utilize real-time user feedback to reduce extrapolation errors in performing system and user exploration. We discuss the challenges in measurements and optimization in different types of exploration, and propose initial solutions. We showcase how each aspect of exploration contributes to the long term user experience through offline and live experiments on industrial recommendation platforms. We hope this talk can inspire more follow up work in understanding and improving exploration in recommender systems.

Citations: 1

Quality Assurance of a German COVID-19 Question Answering Systems using Component-based Microbenchmarking

Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining Pub Date: 2022-02-11 DOI: 10.1145/3488560.3502196

A. Both, Paul Heinze, A. Perevalov, Johannes Richard Bartsch, Rostislav Iudin, Johannes Rudolf Herkner, Tim Schrader, Jonas Wunsch, René Gürth, Ann Kristin Falkenhain

Citations: 1

Modern Theoretical Tools for Understanding and Designing Next-generation Information Retrieval System

Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining Pub Date: 2022-02-11 DOI: 10.1145/3488560.3501394

Da Xu, Chuanwei Ruan

Abstract: In the relatively short history of machine learning, the subtle balance between engineering and theoretical progress has been proved critical at various stages. The most recent wave of AI has brought to the IR community powerful techniques, particularly for pattern recognition. While many benefits from the burst of ideas as numerous tasks become algorithmically feasible, the balance is tilting toward the application side. The existing theoretical tools in IR can no longer explain, guide, and justify the newly-established methodologies. With no choices, we have to bet our design on black-box mechanisms that we only empirically understand. The consequences can be suffering: in stark contrast to how the IR industry has envisioned modern AI making life easier, many are experiencing increased confusion and costs in data manipulation, model selection, monitoring, censoring, and decision making. This reality is not surprising: without handy theoretical tools, we often lack principled knowledge of the pattern recognition model's expressivity, optimization property, generalization guarantee, and our decision-making process has to rely on over-simplified assumptions and human judgments from time to time. Facing all the challenges, we started researching advanced theoretical tools emerging from various domains that can potentially resolve modern IR problems. We encountered many impactful ideas and made several independent publications emphasizing different pieces. Time is now to bring the community a systematic tutorial on how we successfully adapt those tools and make significant progress in understanding, designing, and eventually productionize impactful IR systems. We emphasize systematicity because IR is a comprehensive discipline that touches upon particular aspects of learning, causal inference analysis, interactive (online) decision-making, etc. It thus requires systematic calibrations to render the actual usefulness of the imported theoretical tools to serve IR problems, as they usually exhibit unique structures and definitions. Therefore, we plan this tutorial to systematically demonstrate our learning and successful experience of using advanced theoretical tools for understanding and designing IR systems.

Citations: 1

Learning Transferable Node Representations for Attribute Extraction from Web Documents

Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining Pub Date: 2022-02-11 DOI: 10.1145/3488560.3498424

Yichao Zhou, Ying Sheng, N. Vo, Nick Edmonds, Sandeep Tata

Abstract: Given a web page, extracting an object along with various attributes of interest (e.g. price, publisher, author, and genre for a book) can facilitate a variety of downstream applications such as large-scale knowledge base construction, e-commerce product search, and personalized recommendation. Prior approaches have either relied on computationally expensive visual feature engineering or required large amounts of training data to get to an acceptable precision. In this paper, we propose a novel method, LeArNing TransfErable node RepresentatioNs for Attribute Extraction (LANTERN), to tackle the problem. We model the problem as a tree node tagging task. The key insight is to learn a contextual representation for each node in the DOM tree where the context explicitly takes into account the tree structure of the neighborhood around the node. Experiments on the SWDE public dataset show that LANTERN outperforms the previous state-of-the-art (SOTA) by 1.44% (F1 score) with a dramatically simpler model architecture. Furthermore, we report that utilizing data from a different domain (for instance, using training data about web pages with cars to extract book objects) is surprisingly useful and helps beat the SOTA by a further 1.37%.

Citations: 7

The Pit Stop Problem: How to Plan Your Next Road Trip

Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining Pub Date: 2022-02-11 DOI: 10.1145/3488560.3508495

Kostas Kollias

Citations: 0

Scalable Graph Topology Learning via Spectral Densification

Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining Pub Date: 2022-02-11 DOI: 10.1145/3488560.3498480

Yongyu Wang, Zhiqiang Zhao, Zhuo Feng

Abstract: Graph learning plays an important role in many data mining and machine learning tasks, such as manifold learning, data representation and analysis, dimensionality reduction, data clustering, and visualization, etc. In this work, we introduce a highly-scalable spectral graph densification approach (GRASPEL) for graph topology learning from data. By limiting the precision matrix to be a graph-Laplacian-like matrix, our approach aims to learn sparse undirected graphs from potentially high-dimensional input data. A very unique property of the graphs learned by GRASPEL is that the spectral embedding (or approximate effective-resistance) distances on the graph will encode the similarities between the original input data points. By leveraging high-performance spectral methods, sparse yet spectrally-robust graphs can be learned by identifying and including the most spectrally-critical edges into the graph. Compared with prior state-of-the-art graph learning approaches, GRASPEL is more scalable and allows substantially improving computing efficiency and solution quality of a variety of data mining and machine learning applications, such as manifold learning, spectral clustering (SC), and dimensionality reduction (DR).

Citations: 10

Diversified Query Generation Guided by Knowledge Graph

Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining Pub Date: 2022-02-11 DOI: 10.1145/3488560.3498431

Xi Shen, Jiangjie Chen, Jiaze Chen, Chun Zeng, Yanghua Xiao

Abstract: Relevant articles recommendation plays an important role in online news platforms. Directly displaying recalled articles by a search engine lacks a deep understanding of the article contents. Generating clickable queries, on the other hand, summarizes an article in various aspects, which can be henceforth utilized to better connect relevant articles. Most existing approaches for generating article queries, however, do not consider the diversity of queries or whether they are appealing enough, which are essential for boosting user experience and platform drainage. To this end, we propose a Knowledge-Enhanced Diversified QuerY Generator (KEDY), which leverages an external knowledge graph (KG) as guidance. We diversify the query generation with the information of semantic neighbors of the entities in articles. We further constrain the diversification process with entity popularity knowledge to build appealing queries that users may be more interested in. The information within KG is propagated towards more popular entities with popularity-guided graph attention. We collect a news-query dataset from the search logs of a real-world search engine. Extensive experiments demonstrate our proposed KEDY can generate more diversified and insightful related queries than several strong baselines.

Citations: 7

Unsupervised Cross-Domain Adaptation for Response Selection Using Self-Supervised and Adversarial Training

Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining Pub Date: 2022-02-11 DOI: 10.1145/3488560.3498404

Jia Li, Chongyang Tao, Huang Hu, Can Xu, Yining Chen, Daxin Jiang

Abstract: Recently, many neural context-response matching models have been developed for retrieval-based dialogue systems. Although existing models achieve impressive performance through learning on a large amount of in-domain parallel dialogue data, they usually perform worse in another new domain. How to transfer a response retrieval model trained in high-resource domains to other low-resource domains is a crucial problem for scalable dialogue systems. To this end, we investigate the unsupervised cross-domain adaptation for response selection when the target domain has no parallel dialogue data. Specifically, we propose a two-stage method to adapt a response selection model to a new domain using self-supervised and adversarial training based on pre-trained language models (PLMs). To efficiently incorporate domain awareness and target-domain knowledge to PLMs, we first design a self-supervised post-training procedure, including domain discrimination (DD) task, target-domain masked language model (MLM) task and target-domain next sentence prediction (NSP) task. Based on this, we further conduct the adversarial fine-tuning to empower the model to match the proper response with extracted domain-shared features as much as possible. Experimental results show that our proposed method achieves consistent and significant improvements on several cross-domain response selection datasets.

Citations: 5

Personalized Information Retrieval for Touristic Attractions in Augmented Reality

Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining Pub Date: 2022-02-11 DOI: 10.1145/3488560.3502194

Felix Yang, Saikishore Kalloori, Ribin Chalumattu, Markus Gross

Citations: 0