Term Discrimination Value for Cross-Language Information Retrieval
Ali Montazeralghaem, Razieh Rahimi, J. Allan
Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval, 2019-09-23
DOI: 10.1145/3341981.3344252 (https://doi.org/10.1145/3341981.3344252)
Citation count: 2
Abstract
Term discrimination value is one of the three basic heuristics exploited, directly or indirectly, in almost all ranking models for ad-hoc Information Retrieval (IR). In monolingual IR, the discrimination value of a query term is usually estimated from the term's document or collection frequency. In the query translation approach to cross-language IR (CLIR), the discrimination value of a query term must instead be estimated from the document or collection frequencies of its translations, which is more challenging. We show that existing estimation models neither estimate these values correctly nor adequately reflect the differences in discrimination power among query terms, which hurts retrieval performance. We then propose a new model for estimating the discrimination values of query terms in CLIR and empirically demonstrate that it improves CLIR performance.
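
To make the estimation problem concrete, the sketch below computes a smoothed inverse document frequency (IDF) for a source-language query term from the document frequency of its translations. It is only an illustration under stated assumptions, not the model proposed in the paper: it uses one simple option, treating all translations of a term as synonyms and counting a document once if it contains any of them. The toy collection `DOCS`, the translation table `TRANSLATIONS`, the function names, and the smoothing constants are all hypothetical.

```python
import math

# Toy target-language collection (hypothetical data): doc id -> set of terms.
DOCS = {
    "d1": {"maison", "rouge"},
    "d2": {"maison", "banque"},
    "d3": {"rive", "fleuve"},
    "d4": {"banque", "argent"},
}

# Hypothetical translation table: English query term -> French candidates.
TRANSLATIONS = {"bank": ["banque", "rive"], "house": ["maison"]}


def doc_freq(terms):
    """Number of documents containing at least one of the given terms."""
    wanted = set(terms)
    return sum(1 for words in DOCS.values() if words & wanted)


def idf(df, n_docs):
    """Smoothed IDF (assumed form); higher means more discriminative."""
    return math.log((n_docs + 1) / (df + 0.5))


def translated_idf(query_term):
    """IDF of a source-language term, using the union of its translations."""
    return idf(doc_freq(TRANSLATIONS[query_term]), len(DOCS))


if __name__ == "__main__":
    for term in TRANSLATIONS:
        print(term, round(translated_idf(term), 3))
```

In this toy setup the ambiguous term `bank` matches more documents through its two translations than `house` does through one, so it receives a lower discrimination value. This shows how the number and frequency of translations, rather than the informativeness of the query term itself, can drive a translation-based estimate.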