Shahzad Rajput, Matthew Ekstrand-Abueg, Virgil Pavlu, J. Aslam
{"title":"通过提取相关信息推断文档的相关性来构建测试集合","authors":"Shahzad Rajput, Matthew Ekstrand-Abueg, Virgil Pavlu, J. Aslam","doi":"10.1145/2396761.2396783","DOIUrl":null,"url":null,"abstract":"The goal of a typical information retrieval system is to satisfy a user's information need---e.g., by providing an answer or information \"nugget\"---while the actual search space of a typical information retrieval system consists of documents---i.e., collections of nuggets. In this paper, we characterize this relationship between nuggets and documents and discuss applications to system evaluation. In particular, for the problem of test collection construction for IR system evaluation, we demonstrate a highly efficient algorithm for simultaneously obtaining both relevant documents and relevant information. Our technique exploits the mutually reinforcing relationship between relevant documents and relevant information, yielding document-based test collections whose efficiency and efficacy exceed those of typical Cranfield-style test collections, while also generating sets of highly relevant information.","PeriodicalId":313414,"journal":{"name":"Proceedings of the 21st ACM international conference on Information and knowledge management","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":"{\"title\":\"Constructing test collections by inferring document relevance via extracted relevant information\",\"authors\":\"Shahzad Rajput, Matthew Ekstrand-Abueg, Virgil Pavlu, J. Aslam\",\"doi\":\"10.1145/2396761.2396783\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The goal of a typical information retrieval system is to satisfy a user's information need---e.g., by providing an answer or information \\\"nugget\\\"---while the actual search space of a typical information retrieval system consists of documents---i.e., collections of nuggets. In this paper, we characterize this relationship between nuggets and documents and discuss applications to system evaluation. In particular, for the problem of test collection construction for IR system evaluation, we demonstrate a highly efficient algorithm for simultaneously obtaining both relevant documents and relevant information. Our technique exploits the mutually reinforcing relationship between relevant documents and relevant information, yielding document-based test collections whose efficiency and efficacy exceed those of typical Cranfield-style test collections, while also generating sets of highly relevant information.\",\"PeriodicalId\":313414,\"journal\":{\"name\":\"Proceedings of the 21st ACM international conference on Information and knowledge management\",\"volume\":\"3 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-10-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"15\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 21st ACM international conference on Information and knowledge management\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2396761.2396783\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 21st ACM international conference on Information and knowledge management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2396761.2396783","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Constructing test collections by inferring document relevance via extracted relevant information
The goal of a typical information retrieval system is to satisfy a user's information need---e.g., by providing an answer or information "nugget"---while the actual search space of a typical information retrieval system consists of documents---i.e., collections of nuggets. In this paper, we characterize this relationship between nuggets and documents and discuss applications to system evaluation. In particular, for the problem of test collection construction for IR system evaluation, we demonstrate a highly efficient algorithm for simultaneously obtaining both relevant documents and relevant information. Our technique exploits the mutually reinforcing relationship between relevant documents and relevant information, yielding document-based test collections whose efficiency and efficacy exceed those of typical Cranfield-style test collections, while also generating sets of highly relevant information.