Hamed Bonab, Mohammad Aliannejadi, John Foley, J. Allan
Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval. Published 2019-09-23. DOI: https://doi.org/10.1145/3341981.3344251
Incorporating Hierarchical Domain Information to Disambiguate Very Short Queries
Users often express their information needs using incomplete or ambiguous queries only one or two terms long, particularly in Web environments. The ambiguity of short queries is a recognized problem for information retrieval (IR) systems. In this study, we investigate various approaches for incorporating hierarchical domain information into IR models so that the domain specification resolves the ambiguity. To this end, we develop practical models for constructing evaluation datasets from existing corpora. In terms of effectiveness, we further study the trade-off between a short query and its domain specification information. In doing so, we find that the domains with the highest number of relevant documents are not always the best ones to select. We also evaluate the utility of a domain hierarchy and find that incorporating the hierarchical structure of a collection into the retrieval model could have a high impact on short query disambiguation.