{"title":"Ranking under temporal constraints","authors":"Lidan Wang, Donald Metzler, Jimmy J. Lin","doi":"10.1145/1871437.1871452","DOIUrl":"https://doi.org/10.1145/1871437.1871452","url":null,"abstract":"This paper introduces the notion of temporally constrained ranked retrieval, which, given a query and a time constraint, produces the best possible ranked list within the specified time limit. Naturally, more time should translate into better results, but the ranking algorithm should always produce some results. This property is desirable from a number of perspectives: to cope with diverse users and information needs, as well as to better manage system load and variance in query execution times. We propose two temporally constrained ranking algorithms based on a class of probabilistic prediction models that can naturally incorporate efficiency constraints: one that makes independent feature selection decisions, and the other that makes joint feature selection decisions. Experiments on three different test collections show that both ranking algorithms are able to satisfy imposed time constraints, although the joint model outperforms the independent model in being able to deliver more effective results, especially under tight time constraints, due to its ability to capture feature dependencies.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115607519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantifying uncertainty in multi-dimensional cardinality estimations","authors":"Andranik Khachatryan, Klemens Böhm","doi":"10.1145/1871437.1871610","DOIUrl":"https://doi.org/10.1145/1871437.1871610","url":null,"abstract":"We propose a method for predicting the cardinality distribution of a multi-dimensional query. Compared to conventional 'point-based' estimates, distribution-based estimates enable the query optimizer to predict the cost of a query plan more accurately, as we show experimentally. Our method is computationally efficient and works on top of a histogram already in place. It does not store any information additional to the histogram. Our experiments show that the quality of the predictions with the new method is high.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123092939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"WikiPop: personalized event detection system based on Wikipedia page view statistics","authors":"M. Ciglan, K. Nørvåg","doi":"10.1145/1871437.1871769","DOIUrl":"https://doi.org/10.1145/1871437.1871769","url":null,"abstract":"In this paper, we describe the WikiPop service, a system designed to detect significant increases in the popularity of topics related to users' interests. We exploit Wikipedia page view statistics to identify concepts with a significant increase in public interest. Every day, thousands of articles show increased popularity; thus, personalization is needed to provide the user only with results related to his/her interests. The WikiPop system allows a user to define a context by stating a set of Wikipedia articles describing topics of interest. The system is then able to search, for a given date, for popular topics related to the user-defined context.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116682779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Concurrent atomic protocols for making and changing decisions in social networks","authors":"Royi Ronen, O. Shmueli","doi":"10.1145/1871437.1871615","DOIUrl":"https://doi.org/10.1145/1871437.1871615","url":null,"abstract":"We study a novel data management scenario in which social network participants use protocols to manage their activities and the ever-growing data available to them in the network. In particular, we study protocols that operate on a consistent network (a notion that we define) and transform it into another consistent state by atomically performing a set of changes. Multiple protocol instances that work on intersecting parts of the network graph are able to operate concurrently.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"277 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121257999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TAER: time-aware entity retrieval-exploiting the past to find relevant entities in news articles","authors":"Gianluca Demartini, M. S. Missen, Roi Blanco, H. Zaragoza","doi":"10.1145/1871437.1871661","DOIUrl":"https://doi.org/10.1145/1871437.1871661","url":null,"abstract":"Retrieving entities instead of just documents has become an important task for search engines. In this paper we study entity retrieval for news applications, and in particular the importance of the news trail history (i.e., past related articles) in determining the relevant entities in current articles. This is an important problem in applications that display retrieved entities to the user, together with the news article. We analyze and discuss some statistics about entities in news trails, unveiling previously unknown findings such as the persistence of relevance over time. We focus on the task of query-dependent entity retrieval over time. For this task we evaluate several features, and show that their combination significantly improves performance.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125896365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning to rank relevant and novel documents through user feedback","authors":"A. Lad, Yiming Yang","doi":"10.1145/1871437.1871499","DOIUrl":"https://doi.org/10.1145/1871437.1871499","url":null,"abstract":"We consider the problem of learning to rank relevant and novel documents so as to directly maximize a performance metric called Expected Global Utility (EGU), which has several desirable properties: (i) It measures retrieval performance in terms of relevant as well as novel information, (ii) gives more importance to top ranks to reflect common browsing behavior of users, as opposed to existing objective functions based on set-coverage, (iii) accommodates different levels of tolerance towards redundancy, which is not taken into account by existing evaluation measures, and (iv) extends naturally to the evaluation of session-based retrieval comprising multiple ranked lists. Our ground truth is defined in terms of \"information nuggets\", which are obviously not known to the retrieval system when processing a new user query. Therefore, our approach uses observable query and document features (words and named entities) as surrogates for nuggets, whose weights are learned based on user feedback in an iterative search session. The ranked list is produced to maximize the weighted coverage of these surrogate nuggets. The optimization of such coverage-based metrics is known to be NP-hard. Therefore, we use a greedy algorithm and show that it guarantees good performance due to the submodularity of the objective function. Our experiments on Topic Detection and Tracking data show that the proposed approach represents an efficient and effective retrieval strategy for maximizing EGU, as compared to a purely relevance-based ranking approach that uses Indri, as well as an MMR-based approach for non-redundant ranking.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126620433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Injecting domain knowledge into a granular database engine: a position paper","authors":"D. Ślęzak, Graham Toppin","doi":"10.1145/1871437.1871762","DOIUrl":"https://doi.org/10.1145/1871437.1871762","url":null,"abstract":"We discuss how to use techniques from fields such as text processing and knowledge management to better handle text attributes in Infobright's RDBMS engine. Our approach leads to a rich interface for domain experts who wish to share their knowledge about data content, while remaining unnoticeable to data users. It makes it possible to improve data storage, data access, and data compression, with no changes required at the database schema level.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126622529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-information fusion for uncertain semantic representations of videos","authors":"Bo Lu, Guoren Wang, Xiao-Yu Gong","doi":"10.1145/1871437.1871684","DOIUrl":"https://doi.org/10.1145/1871437.1871684","url":null,"abstract":"Concept-Based Semantic Video Retrieval (CBSVR) usually uses semantic representations of videos to handle users' retrieval requests. The accuracy of semantic video retrieval depends on the results of concept detectors, but the detection results are usually imprecise and uncertain. In this paper, we propose a multi-information fusion approach (MIF) which is dedicated to solving the problem of uncertain semantic representations of videos for improving retrieval accuracy. This approach is based on a novel two-phase framework that involves the inferring phase and the fusing phase. In the inferring phase, the concepts most relevant to the user's query are chosen by exploring both contextual correlation among concepts and temporal correlation among shots. In the fusing phase, the inferred probabilities of the related concepts are fused together with the detection results via minimization of a potential function to refine the detector predictions. Experiments on the widely used TRECVID datasets demonstrate that our approach can effectively improve the accuracy of semantic concept detection.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122046738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Search as if you were in your home town: geographic search by regional context and dynamic feature-space selection","authors":"Makoto P. Kato, H. Ohshima, S. Oyama, Katsumi Tanaka","doi":"10.1145/1871437.1871667","DOIUrl":"https://doi.org/10.1145/1871437.1871667","url":null,"abstract":"We propose a query-by-example geographic object search method for users who are unfamiliar with the place they are in. Geographic objects, such as restaurants, are often retrieved using an attribute-based or keyword query. These queries, however, are difficult to use for users who have little knowledge of the place where they want to search. The proposed query-by-example method allows users to query by selecting examples in familiar places for retrieving objects in unfamiliar places. One of the challenges is to predict an effective distance metric, which varies for individuals. Another challenge is to calculate the distance between objects in heterogeneous domains considering the feature gap between them, for example, restaurants in Japan and China. Our proposed method is used to robustly estimate the distance metric by amplifying the difference between selected and non-selected examples. By using the distance metric, each object in a familiar domain is evenly assigned to one in an unfamiliar domain to eliminate the difference between those domains. We developed a restaurant search using data obtained from a Japanese restaurant Web guide to evaluate our method.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122102585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Use of semantics in real life applications","authors":"G. Grefenstette","doi":"10.1145/1871437.1871441","DOIUrl":"https://doi.org/10.1145/1871437.1871441","url":null,"abstract":"Semantics has many different definitions in science. In natural language processing, there has been much research over the past three decades involving extracting the semantics, the meaning, of natural texts. This has led to entity recognition (people, places, companies, prices, dates, and events), and more recently into sentiment analysis, exploring another level of meaning in a text. These techniques are now well understood and robust. Results of this research are beginning to appear in products and online sites, finding their way into practical applications. The stage is set for an explosion of semantically savvy applications, from 3D design, to enhanced web browsing, to social network aware yellowpages. This talk will explore these paths from research to industry, illustrated by current products on the market.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122145430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}