{"title":"Anytime Ranking for Impact-Ordered Indexes","authors":"Jimmy J. Lin, A. Trotman","doi":"10.1145/2808194.2809477","DOIUrl":"https://doi.org/10.1145/2808194.2809477","url":null,"abstract":"The ability for a ranking function to control its own execution time is useful for managing load, reigning in outliers, and adapting to different types of queries. We propose a simple yet effective anytime algorithm for impact-ordered indexes that builds on a score-at-a-time query evaluation strategy. In our approach, postings segments are processed in decreasing order of their impact scores, and the algorithm early terminates when a specified number of postings have been processed. With a simple linear model and a few training topics, we can determine this threshold given a time budget in milliseconds. Experiments on two web test collections show that our approach can accurately control query evaluation latency and that aggressive limits on execution time lead to minimal decreases in effectiveness.","PeriodicalId":440325,"journal":{"name":"Proceedings of the 2015 International Conference on The Theory of Information Retrieval","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132158573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Language-independent Query Representation for IR Model Parameter Estimation on Unlabeled Collections","authors":"Parantapa Goswami, Massih-Reza Amini, Éric Gaussier","doi":"10.1145/2808194.2809451","DOIUrl":"https://doi.org/10.1145/2808194.2809451","url":null,"abstract":"We study here the problem of estimating the parameters of standard IR models (as BM25 or language models) on new collections without any relevance judgments, by using collections with already available relevance judgements. We propose different query representations that allow mapping queries (with and without relevance judgments, from different collections, potentially in different languages) into a common space. We then introduce a kernel regression approach to learn the parameters of standard IR models individually for each query in the new, unlabeled collection. Our experiments, conducted on standard English and Indian IR collections, show that our approach can be used to efficiently tune, query by query, standard IR models to new collections, potentially written in different languages. In particular, the versions of the standard IR models we obtain not only outperform the versions with default parameters, but can also outperform the versions in which the parameter values have been optimized globally over a set of queries with target relevance judgements.","PeriodicalId":440325,"journal":{"name":"Proceedings of the 2015 International Conference on The Theory of Information Retrieval","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121734739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Entropy and Graph Based Modelling of Document Coherence using Discourse Entities: An Application to IR","authors":"Casper Petersen, C. Lioma, J. Simonsen, Birger Larsen","doi":"10.1145/2808194.2809458","DOIUrl":"https://doi.org/10.1145/2808194.2809458","url":null,"abstract":"We present two novel models of document coherence and their application to information retrieval (IR). Both models approximate document coherence using discourse entities, e.g. the subject or object of a sentence. Our first model views text as a Markov process generating sequences of discourse entities (entity n-grams); we use the entropy of these entity n-grams to approximate the rate at which new information appears in text, reasoning that as more new words appear, the topic increasingly drifts and text coherence decreases. Our second model extends the work of Guinaudeau & Strube [28] that represents text as a graph of discourse entities, linked by different relations, such as their distance or adjacency in text. We use several graph topology metrics to approximate different aspects of the discourse flow that can indicate coherence, such as the average clustering or betweenness of discourse entities in text. Experiments with several instantiations of these models show that: (i) our models perform on a par with two other well-known models of text coherence even without any parameter tuning, and (ii) reranking retrieval results according to their coherence scores gives notable performance gains, confirming a relation between document coherence and relevance. This work contributes two novel models of document coherence, the application of which to IR complements recent work in the integration of document cohesiveness or comprehensibility to ranking [5, 56].","PeriodicalId":440325,"journal":{"name":"Proceedings of the 2015 International Conference on The Theory of Information Retrieval","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133098464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statistical Significance Testing in Information Retrieval: Theory and Practice","authors":"Ben Carterette","doi":"10.1145/2808194.2809445","DOIUrl":"https://doi.org/10.1145/2808194.2809445","url":null,"abstract":"The past 20 years have seen a great improvement in the rigor of information retrieval experimentation, due primarily to two factors: high-quality, public, portable test collections such as those produced by TREC (the Text REtrieval Conference [28]), and the increased practice of sta- tistical hypothesis testing to determine whether measured improvements can be ascribed to something other than random chance. Together these create a very useful standard for reviewers, program committees, and journal editors; work in information retrieval (IR) increasingly cannot be published unless it has been evaluated using a well-constructed test collection and shown to produce a statistically significant improvement over a good baseline. But, as the saying goes, any tool sharp enough to be useful is also sharp enough to be dangerous. Statistical tests of significance are widely misunderstood. Most researchers and developers treat them as a \"black box\": evaluation results go in and a p-value comes out. But because significance is such an important factor in determining what research directions to explore and what is published, using p-values obtained without thought can have consequences for everyone doing research in IR. Ioannidis has argued that the main consequence in the biomedical sciences is that most published research findings are false [12]; could that be the case in IR as well?","PeriodicalId":440325,"journal":{"name":"Proceedings of the 2015 International Conference on The Theory of Information Retrieval","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115375602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the 2015 International Conference on The Theory of Information Retrieval","authors":"","doi":"10.1145/2808194","DOIUrl":"https://doi.org/10.1145/2808194","url":null,"abstract":"","PeriodicalId":440325,"journal":{"name":"Proceedings of the 2015 International Conference on The Theory of Information Retrieval","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121489019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}