{"title":"Controversy Detection and Stance Analysis","authors":"Shiri Dori-Hacohen","doi":"10.1145/2766462.2767844","DOIUrl":"https://doi.org/10.1145/2766462.2767844","url":null,"abstract":"Alerting users about controversial search results can encourage critical literacy, promote healthy civic discourse and counteract the \"filter bubble\" effect. Additionally, presenting information to the user about the different stances or sides of the debate can help her navigate the landscape of search results. Our existing work made strides in the emerging niche of controversy detection and analysis; we propose further work on automatic stance detection.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127097568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Contextual Information to Understand Searching and Browsing Behavior","authors":"Julia Kiseleva","doi":"10.1145/2766462.2767852","DOIUrl":"https://doi.org/10.1145/2766462.2767852","url":null,"abstract":"There is great imbalance in the richness of information on the web and the succinctness and poverty of search requests of web users, making their queries only a partial description of the underlying complex information needs. Finding ways to better leverage contextual information and make search context-aware holds the promise to dramatically improve the search experience of users. We conducted a series of studies to discover, model and utilize contextual information in order to understand and improve users' searching and browsing behavior on the web. Our results capture important aspects of context under the realistic conditions of different online search services, aiming to ensure that our scientific insights and solutions transfer to the operational settings of real world applications.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129097449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reducing Hubness: A Cause of Vulnerability in Recommender Systems","authors":"Kazuo Hara, Ikumi Suzuki, Kei Kobayashi, K. Fukumizu","doi":"10.1145/2766462.2767823","DOIUrl":"https://doi.org/10.1145/2766462.2767823","url":null,"abstract":"It is known that memory-based collaborative filtering systems are vulnerable to shilling attacks. In this paper, we demonstrate that hubness, which occurs in high dimensional data, is exploited by the attacks. Hence we explore methods for reducing hubness in user-response data to make these systems robust against attacks. Using the MovieLens dataset, we empirically show that the two methods for reducing hubness by transforming a similarity matrix, (i) centering and (ii) conversion to a commute time kernel, can thwart attacks without degrading the recommendation performance.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133009891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spoken Conversational Search: Information Retrieval over a Speech-only Communication Channel","authors":"Johanne R. Trippas","doi":"10.1145/2766462.2767850","DOIUrl":"https://doi.org/10.1145/2766462.2767850","url":null,"abstract":"This research is investigating a new interaction paradigm for Interactive Information Retrieval (IIR), where all input and output is mediated via speech. While such information systems have been important for the visually impaired for many years, a renewed focus on speech is driven by the growing sales of internet enabled mobile devices. Presenting search results over a speech-only communication channel involves a number of challenges for users due to cognitive limitations and the serial nature of the audio channel [2]. Other research has shown that one cannot just ‘bolt on’ speech recognizers and screen readers to an existing system [5]. Therefore the aim of this research is to develop a new framework for effective and efficient IIR over a speech-only channel: a Spoken Conversational Search System (SCSS) which provides a conversational approach to determining user information needs, presenting results and enabling search reformulations. This research will go beyond current Voice Search approaches by aiming for a greater integration between document search and conversational dialogue processes in order to provide a more efficient and effective search experience when using a SCSS. We will also investigate an information seeking model for audio and language models. Presenting a Search Engine Result Page (SERP) over a speech-only communication channel presents a number of challenges, e.g., the textual component of a standard search results list has been shown to be ineffectual [4]. The transient nature of speech poses problems due to memory constraints, and makes the possibility of “skimming” back and forth over a list of results (a standard process in browsing a visual list) difficult. These issues are greatly exacerbated when the result being sought is further down the list. This research will advance the knowledge base by: Providing an understanding of which strategies and IIR techniques for SCSS are best for users. Defining novel technologies for contextual conversational interaction with a large collection of unstructured documents that supports effective search over a speech-only communication channel (audio). Determining new methods for providing summary-based result presentation for unstructured documents.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130796587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modularity-Based Query Clustering for Identifying Users Sharing a Common Condition","authors":"Maayan Harel, E. Yom-Tov","doi":"10.1145/2766462.2767798","DOIUrl":"https://doi.org/10.1145/2766462.2767798","url":null,"abstract":"We present an algorithm for identifying users who share a common condition from anonymized search engine logs. Input to the algorithm is a set of seed phrases that identify users with the condition of interest with high precision albeit at a very low recall. We expand the set of seed phrases by clustering queries according to the pages users clicked following these queries and the temporal ordering of queries within sessions, emphasizing the subgraph containing seed phrases. To this end, we extend modularity-based clustering such that it uses the information in the initial seed phrases as well as other queries of users in the population of interest. We evaluate the performance of the proposed method on two datasets, one of mood disorders and the other of anorexia, by classifying users according to the clusters in which they appeared and the phrases contained therein, and show that the area under the receiver operating characteristic curve (AUC) obtained by these methods exceeds 0.87. These results demonstrate the value of our algorithm both for identifying users for future research and for gaining a better understanding of the language associated with the condition.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127845348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tailoring Music Recommendations to Users by Considering Diversity, Mainstreaminess, and Novelty","authors":"M. Schedl, D. Hauger","doi":"10.1145/2766462.2767763","DOIUrl":"https://doi.org/10.1145/2766462.2767763","url":null,"abstract":"A shortcoming of current approaches for music recommendation is that they consider user-specific characteristics only on a very simple level, typically as some kind of interaction between users and items when employing collaborative filtering. To alleviate this issue, we propose several user features that model aspects of the user's music listening behavior: diversity, mainstreaminess, and novelty of the user's music taste. To validate the proposed features, we conduct a comprehensive evaluation of a variety of music recommendation approaches (stand-alone and hybrids) on a collection of almost 200 million listening events gathered from Last.fm. We report first results and highlight cases where our diversity, mainstreaminess, and novelty features can be beneficially integrated into music recommender systems.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131673558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint Matrix Factorization and Manifold-Ranking for Topic-Focused Multi-Document Summarization","authors":"Jiwei Tan, Xiaojun Wan, Jianguo Xiao","doi":"10.1145/2766462.2767765","DOIUrl":"https://doi.org/10.1145/2766462.2767765","url":null,"abstract":"Manifold-ranking has proved to be an effective method for topic-focused multi-document summarization. As the basic manifold-ranking based summarization method constructs the relationships between sentences simply by the bag-of-words cosine similarity, we believe a better similarity metric will further improve the effectiveness of manifold-ranking. In this paper, we propose a joint optimization framework, which integrates the manifold-ranking process with a similarity metric learning process. The joint framework aims at learning better sentence similarity scores and better sentence ranking scores simultaneously. Experiments on DUC datasets show the proposed joint method achieves better performance than the manifold-ranking baselines and several popular methods.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128799225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Search using Proximity-Based Statistics","authors":"Xiaolu Lu","doi":"10.1145/2766462.2767847","DOIUrl":"https://doi.org/10.1145/2766462.2767847","url":null,"abstract":"Modern retrieval systems often use more sophisticated ranking models. Although more new features are added in, term proximity has been studied for a long time, and still plays an important role. A recent study by Huston and Croft [2] shows that many-term dependency is a better choice for a large corpus and long queries. However, utilizing proximity-based features often leads to computational overhead, and most of the existing solutions are tailored to term pairs. Fewer studies have focused on many-term proximity computation, and the plane-sweep approach proposed by Sadakane and Imai [6] is still state-of-the-art. Consider a multi-pass retrieval process where the proximity features could be an effective first pass ranker if we can reduce the cost of the proximity calculation. In this PhD project, we consider the following questions: (i) How important are the proximity statistics in the term dependency models and what is the cost of extracting the proximity features? (ii) Although all term dependencies are considered in ranking models, can we design an early termination strategy considering only partial proximity? Moreover, instead of viewing the term from the same level, can we utilize its locality for obtaining more efficiency? (iii) How do we best organize the term proximity statistics to be more indexable, facilitating the extraction process? (iv) How do we best define the approximation form of term proximity in order to find the best trade-off between effectiveness and efficiency? In a preliminary experimental study, Lu et al. [3] compare how different term dependency components affect the entire ranking model and show that although the phrase component helps to improve the effectiveness in an overall sense, it degrades dramatically on some queries. Although the proximity part doesn’t always improve the effectiveness, it is more stable. From the computational perspective, we have found that extracting single term dependency proximity using the plane-sweep algorithm is not a bottleneck. But it is a computationally intensive job when processing each dependency feature separately. However, the extra cost of considering proximity independently can be reduced by extracting all dependencies together [4]. Further, since most retrieval systems keep both a direct file and an inverted file, it is possible to exploit both representations to maximize the efficiency. Although the cost of extracting proxim","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128811840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Session 2C: Graphs","authors":"J. Kamps","doi":"10.1145/3255920","DOIUrl":"https://doi.org/10.1145/3255920","url":null,"abstract":"","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"10 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120912915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Opportunities to Facilitate Serendipity in Search","authors":"Ataur Rahman, Max L. Wilson","doi":"10.1145/2766462.2767783","DOIUrl":"https://doi.org/10.1145/2766462.2767783","url":null,"abstract":"Serendipitously discovering new information can bring many benefits. Although we can design systems to highlight serendipitous information, serendipity cannot be easily orchestrated and is thus hard to study. In this paper, we deployed a working search engine that matched search results with Facebook 'Like' data, as a technology probe to examine naturally occurring serendipitous discoveries. Search logs and diary entries revealed the nature of these occasions in both leisure and work contexts. The findings support the use of the micro-serendipity model in search system design.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116229181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}