{"title":"Improving Session Search by Modeling Multi-Granularity Historical Query Change","authors":"Xiaochen Zuo, Zhicheng Dou, Ji-rong Wen","doi":"10.1145/3488560.3498415","DOIUrl":"https://doi.org/10.1145/3488560.3498415","url":null,"abstract":"In session search, it's important to utilize historical interactions between users and the search engines to improve document retrieval. However, not all historical information contributes to document ranking. Users often express their preferences in the process of modifying the previous query, which can help us catch useful information in the historical interactions. Inspired by it, we propose to model historical query change to improve document ranking performance. Especially, we characterize multi-granularity query change between each pair of adjacent queries at both term level and semantic level. For term level query change, we calculate three types of term weights, including the retained term weights, added term weights and removed term weights. Then we perform term-based interaction between the candidate document and historical queries based on the term weights. For semantic level query change, we calculate an overall representation of user intent by integrating the representations of each historical query obtained by different types of term weights. Then we adopt representation-based matching between this representation and the candidate document. To improve the effect of query change modeling, we introduce query change classification as an auxiliary task. 
Experimental results on AOL and TianGong-ST search logs show that our model outperforms most existing models for session search.","PeriodicalId":348686,"journal":{"name":"Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining","volume":"105 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114986432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploration in Recommender Systems","authors":"Minmin Chen","doi":"10.1145/3488560.3510009","DOIUrl":"https://doi.org/10.1145/3488560.3510009","url":null,"abstract":"In the era of increasing choices, recommender systems are becoming indispensable in helping users navigate the million or billion pieces of content on recommendation platforms. Most of the recommender systems are powered by ML models trained on a large amount of user-item interaction data. Such a setup however induces a strong feedback loop that creates the rich gets richer phenomenon where head contents are getting more and more exposure while tail and fresh contents are not discovered. At the same time, it pigeonholes users to contents they are already familiar with. We believe exploration is key to break away from the feedback loop and to optimize long term user experience on recommendation platforms. The exploration-exploitation tradeoff, being the foundation of bandits and RL research, has been extensively studied in RL. While effective exploration is believed to positively influence the user experience on the platform, the exact value of exploration in recommender systems has not been well established. In this talk, we examine the roles of exploration in recommender systems in three facets: 1) system exploration to surface fresh/tail recommendations based on users' known interests; 2) user exploration to identify unknown user interests or introduce users to new interests; and 3) online exploration to utilize real-time user feedback to reduce extrapolation errors in performing system and user exploration. We discuss the challenges in measurements and optimization in different types of exploration, and propose initial solutions. We showcase how each aspect of exploration contributes to the long term user experience through offline and live experiments on industrial recommendation platforms. 
We hope this talk can inspire more follow up work in understanding and improving exploration in recommender systems.","PeriodicalId":348686,"journal":{"name":"Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125940935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quality Assurance of a German COVID-19 Question Answering Systems using Component-based Microbenchmarking","authors":"A. Both, Paul Heinze, A. Perevalov, Johannes Richard Bartsch, Rostislav Iudin, Johannes Rudolf Herkner, Tim Schrader, Jonas Wunsch, René Gürth, Ann Kristin Falkenhain","doi":"10.1145/3488560.3502196","DOIUrl":"https://doi.org/10.1145/3488560.3502196","url":null,"abstract":"Question Answering (QA) has become an often used method to retrieve data as part of chatbots and other natural-language user interfaces. In particular, QA systems of official institutions have high expectations regarding the answers computed by the system, as the provided information might be critical. In this demonstration, we use the official COVID-19 QA system that was developed together with the German Federal government to provide German citizens access to data regarding incident values, number of deaths, etc. To ensure high quality, a component-based approach was used that enables exchanging data between QA components using RDF and validating the functionality of the QA system using SPARQL. Here, we will demonstrate how our solution enables developers of QA systems to use a descriptive approach to validate the quality of their implementation before the system's deployment and also within a live environment.","PeriodicalId":348686,"journal":{"name":"Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125283016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modern Theoretical Tools for Understanding and Designing Next-generation Information Retrieval System","authors":"Da Xu, Chuanwei Ruan","doi":"10.1145/3488560.3501394","DOIUrl":"https://doi.org/10.1145/3488560.3501394","url":null,"abstract":"In the relatively short history of machine learning, the subtle balance between engineering and theoretical progress has been proved critical at various stages. The most recent wave of AI has brought to the IR community powerful techniques, particularly for pattern recognition. While many benefits from the burst of ideas as numerous tasks become algorithmically feasible, the balance is tilting toward the application side. The existing theoretical tools in IR can no longer explain, guide, and justify the newly-established methodologies. With no choices, we have to bet our design on black-box mechanisms that we only empirically understand. The consequences can be suffering: in stark contrast to how the IR industry has envisioned modern AI making life easier, many are experiencing increased confusion and costs in data manipulation, model selection, monitoring, censoring, and decision making. This reality is not surprising: without handy theoretical tools, we often lack principled knowledge of the pattern recognition model's expressivity, optimization property, generalization guarantee, and our decision-making process has to rely on over-simplified assumptions and human judgments from time to time. Facing all the challenges, we started researching advanced theoretical tools emerging from various domains that can potentially resolve modern IR problems. We encountered many impactful ideas and made several independent publications emphasizing different pieces. Time is now to bring the community a systematic tutorial on how we successfully adapt those tools and make significant progress in understanding, designing, and eventually productionize impactful IR systems. 
We emphasize systematicity because IR is a comprehensive discipline that touches upon particular aspects of learning, causal inference analysis, interactive (online) decision-making, etc. It thus requires systematic calibrations to render the actual usefulness of the imported theoretical tools to serve IR problems, as they usually exhibit unique structures and definitions. Therefore, we plan this tutorial to systematically demonstrate our learning and successful experience of using advanced theoretical tools for understanding and designing IR systems.","PeriodicalId":348686,"journal":{"name":"Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining","volume":"32 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122796365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Transferable Node Representations for Attribute Extraction from Web Documents","authors":"Yichao Zhou, Ying Sheng, N. Vo, Nick Edmonds, Sandeep Tata","doi":"10.1145/3488560.3498424","DOIUrl":"https://doi.org/10.1145/3488560.3498424","url":null,"abstract":"Given a web page, extracting an object along with various attributes of interest (e.g. price, publisher, author, and genre for a book) can facilitate a variety of downstream applications such as large-scale knowledge base construction, e-commerce product search, and personalized recommendation. Prior approaches have either relied on computationally expensive visual feature engineering or required large amounts of training data to get to an acceptable precision. In this paper, we propose a novel method, LeArNing TransfErable node RepresentatioNs for Attribute Extraction (LANTERN), to tackle the problem. We model the problem as a tree node tagging task. The key insight is to learn a contextual representation for each node in the DOM tree where the context explicitly takes into account the tree structure of the neighborhood around the node. Experiments on the SWDE public dataset show that LANTERN outperforms the previous state-of-the-art (SOTA) by 1.44% (F1 score) with a dramatically simpler model architecture. 
Furthermore, we report that utilizing data from a different domain (for instance, using training data about web pages with cars to extract book objects) is surprisingly useful and helps beat the SOTA by a further 1.37%.","PeriodicalId":348686,"journal":{"name":"Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128285920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Pit Stop Problem: How to Plan Your Next Road Trip","authors":"Kostas Kollias","doi":"10.1145/3488560.3508495","DOIUrl":"https://doi.org/10.1145/3488560.3508495","url":null,"abstract":"Many online trip planning and navigation software need to routinely solve the problem of deciding where to take stops during a journey for various services such as refueling (or EV charging), rest stops, food, etc. The goal is to minimize the overhead of these stops while ensuring that the traveler is not starved of any essential resource (such as fuel, rest, or food) during the journey. In this paper, we formally model this problem and call it the pit stop problem. We design algorithms for this problem under various settings: single vs multiple types of stops, and offline vs online optimization (i.e., in advance of or during the trip). Our algorithms achieve provable guarantees in terms of approximating the optimal solution. We then extensively evaluate our algorithms on real world data and demonstrate that they significantly outperform baseline solutions.","PeriodicalId":348686,"journal":{"name":"Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124692379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalable Graph Topology Learning via Spectral Densification","authors":"Yongyu Wang, Zhiqiang Zhao, Zhuo Feng","doi":"10.1145/3488560.3498480","DOIUrl":"https://doi.org/10.1145/3488560.3498480","url":null,"abstract":"Graph learning plays an important role in many data mining and machine learning tasks, such as manifold learning, data representation and analysis, dimensionality reduction, data clustering, and visualization, etc. In this work, we introduce a highly-scalable spectral graph densification approach (GRASPEL) for graph topology learning from data. By limiting the precision matrix to be a graph-Laplacian-like matrix, our approach aims to learn sparse undirected graphs from potentially high-dimensional input data. A very unique property of the graphs learned by GRASPEL is that the spectral embedding (or approximate effective-resistance) distances on the graph will encode the similarities between the original input data points. By leveraging high-performance spectral methods, sparse yet spectrally-robust graphs can be learned by identifying and including the most spectrally-critical edges into the graph. 
Compared with prior state-of-the-art graph learning approaches, GRASPEL is more scalable and allows substantially improving computing efficiency and solution quality of a variety of data mining and machine learning applications, such as manifold learning, spectral clustering (SC), and dimensionality reduction (DR).","PeriodicalId":348686,"journal":{"name":"Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126669449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diversified Query Generation Guided by Knowledge Graph","authors":"Xi Shen, Jiangjie Chen, Jiaze Chen, Chun Zeng, Yanghua Xiao","doi":"10.1145/3488560.3498431","DOIUrl":"https://doi.org/10.1145/3488560.3498431","url":null,"abstract":"Relevant articles recommendation plays an important role in online news platforms. Directly displaying recalled articles by a search engine lacks a deep understanding of the article contents. Generating clickable queries, on the other hand, summarizes an article in various aspects, which can be henceforth utilized to better connect relevant articles. Most existing approaches for generating article queries, however, do not consider the diversity of queries or whether they are appealing enough, which are essential for boosting user experience and platform drainage. To this end, we propose a Knowledge-Enhanced Diversified QuerY Generator (KEDY), which leverages an external knowledge graph (KG) as guidance. We diversify the query generation with the information of semantic neighbors of the entities in articles. We further constrain the diversification process with entity popularity knowledge to build appealing queries that users may be more interested in. The information within KG is propagated towards more popular entities with popularity-guided graph attention. We collect a news-query dataset from the search logs of a real-world search engine. 
Extensive experiments demonstrate our proposed KEDY can generate more diversified and insightful related queries than several strong baselines.","PeriodicalId":348686,"journal":{"name":"Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123321151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised Cross-Domain Adaptation for Response Selection Using Self-Supervised and Adversarial Training","authors":"Jia Li, Chongyang Tao, Huang Hu, Can Xu, Yining Chen, Daxin Jiang","doi":"10.1145/3488560.3498404","DOIUrl":"https://doi.org/10.1145/3488560.3498404","url":null,"abstract":"Recently, many neural context-response matching models have been developed for retrieval-based dialogue systems. Although existing models achieve impressive performance through learning on a large amount of in-domain parallel dialogue data, they usually perform worse in another new domain. How to transfer a response retrieval model trained in high-resource domains to other low-resource domains is a crucial problem for scalable dialogue systems. To this end, we investigate the unsupervised cross-domain adaptation for response selection when the target domain has no parallel dialogue data. Specifically, we propose a two-stage method to adapt a response selection model to a new domain using self-supervised and adversarial training based on pre-trained language models (PLMs). To efficiently incorporate domain awareness and target-domain knowledge to PLMs, we first design a self-supervised post-training procedure, including domain discrimination (DD) task, target-domain masked language model (MLM) task and target-domain next sentence prediction (NSP) task. Based on this, we further conduct the adversarial fine-tuning to empower the model to match the proper response with extracted domain-shared features as much as possible. 
Experimental results show that our proposed method achieves consistent and significant improvements on several cross-domain response selection datasets.","PeriodicalId":348686,"journal":{"name":"Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123338957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Personalized Information Retrieval for Touristic Attractions in Augmented Reality","authors":"Felix Yang, Saikishore Kalloori, Ribin Chalumattu, Markus Gross","doi":"10.1145/3488560.3502194","DOIUrl":"https://doi.org/10.1145/3488560.3502194","url":null,"abstract":"The rapid advances and increasing accessibility of augmented reality (AR) in recent years opened up many new possibilities to incorporate AR into our daily lives. A very interesting area for AR is tourism where one can enhance attractions with virtual elements and provide tourists with additional information about the places they are visiting. In this paper, we present our prototype, an AR application that augments various points of interest (POIs) by showing images and facts about each POI. We also developed a simple recommender system that ensures the facts are selected based on user preferences, thus creating a unique and personalized experience for each user. Furthermore, we also conducted a live user study to assess the usability of our prototype and the usefulness of our personalization system.","PeriodicalId":348686,"journal":{"name":"Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116184347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}