Pseudo Descriptions for Meta-Data Retrieval
Tim Gollub, E. Genc, Nedim Lipka, Benno Stein
Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval, 2018-09-10
DOI: https://doi.org/10.1145/3234944.3234957
Citations: 1
Abstract
Search in meta-data is challenging due to the sparsity of the available textual information. To alleviate the sparsity problem, this paper builds on the existing document expansion paradigm and proposes pseudo-descriptions as a new paradigm. Instead of encoding paradigmatic term relations implicitly in an expansion vector, we generate an explicit, cohesive text field for each meta-data record that describes the entity associated with the record. In contrast to document expansions, pseudo-descriptions make it possible to reveal why a certain document is considered relevant even though the original meta-data does not contain the query terms. Moreover, they are easier to operationalize and facilitate the use of sophisticated retrieval features such as phrase search and query term proximity. To generate pseudo-descriptions, we propose a relevance-dependent strategy based on the search engine result pages obtained by issuing the meta-data as a search query to a designated reference search engine. To demonstrate the validity of the pseudo-description paradigm, we experiment with different TREC collections from which we withhold the content information to simulate a meta-data retrieval scenario. Though retrieval with full content information remains superior, our approach achieves retrieval performance improvements on par with document expansion.
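The generation idea described in the abstract (issue a record's meta-data as a query to a reference search engine and build a text field from the results) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the in-memory `REFERENCE_CORPUS`, the term-overlap ranking, and the helper names `search_reference`, `pseudo_description`, and `phrase_match` are hypothetical and are not taken from the paper, whose actual strategy is not specified in the abstract.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

# Hypothetical stand-in for the index of a reference search engine.
REFERENCE_CORPUS = [
    "Document expansion adds related terms to sparse records to improve recall.",
    "Phrase search requires term positions, which plain expansion vectors lack.",
    "Gardening tips for growing tomatoes in small urban balconies.",
]

def search_reference(query, corpus, k=2):
    """Toy ranking (an assumption): order snippets by shared query terms."""
    query_terms = set(tokenize(query))
    scored = []
    for snippet in corpus:
        overlap = len(query_terms & set(tokenize(snippet)))
        if overlap > 0:
            scored.append((overlap, snippet))
    # The sort is stable, so ties keep their corpus order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [snippet for _, snippet in scored[:k]]

def pseudo_description(metadata, corpus, k=2):
    """Issue the meta-data as a query and join the top-k snippets into one field."""
    return " ".join(search_reference(metadata, corpus, k))

def phrase_match(phrase, text):
    """Return True if the phrase tokens occur consecutively in the text."""
    p, t = tokenize(phrase), tokenize(text)
    return any(t[i:i + len(p)] == p for i in range(len(t) - len(p) + 1))

metadata = "Sparse records document expansion"
description = pseudo_description(metadata, REFERENCE_CORPUS)

# The phrase is absent from the raw meta-data but present in the generated field.
in_metadata = phrase_match("term positions", metadata)
in_description = phrase_match("term positions", description)
```

Because the generated field is ordinary text, a phrase query such as "term positions" can match it and the matching passage can be shown to the user, which a bag-of-terms expansion vector would not support in the same way.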