{"title":"Concept-based Web Search using Domain Prediction and Parallel Query Expansion","authors":"Rahul Joshi, Y. Aslandogan","doi":"10.1109/IRI.2006.252407","DOIUrl":null,"url":null,"abstract":"We address the problem of irrelevant results for short queries on Web search engines using latent semantic indexing in the WordSpace model and query expansion. First, we predict the potential concept topics, which are the domains for the search terms. Next, we expand the search terms in each of the predicted domains in parallel. We then submit separate queries, specialized for each domain, to a general-purpose search engine. The user is presented with categorized search results under the predicted domains. We prepared a categorized text collection (corpus) using Web directory listing to build word association models. We compare the results obtained using this corpus with those using Reuters corpus. User evaluations indicate that our approach helps the users avoid having to examine irrelevant Web search results, especially with short queries","PeriodicalId":402255,"journal":{"name":"2006 IEEE International Conference on Information Reuse & Integration","volume":"58 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2006 IEEE International Conference on Information Reuse & Integration","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IRI.2006.252407","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 5
Abstract
We address the problem of irrelevant results for short queries on Web search engines by combining latent semantic indexing in the WordSpace model with query expansion. First, we predict the potential concept topics, that is, the domains to which the search terms may belong. Next, we expand the search terms in each of the predicted domains in parallel. We then submit separate queries, each specialized for one domain, to a general-purpose search engine. The user is presented with search results categorized under the predicted domains. To build the word association models, we prepared a categorized text collection (corpus) from a Web directory listing. We compare the results obtained using this corpus with those obtained using the Reuters corpus. User evaluations indicate that our approach helps users avoid having to examine irrelevant Web search results, especially with short queries.
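The pipeline described in the abstract (predict domains, then expand the query separately in each domain) can be illustrated with a minimal sketch. It is not the authors' implementation: plain term and co-occurrence counts stand in for latent semantic indexing in the WordSpace model, and the toy corpus, the function names (`predict_domains`, `expand_query`, `expand_in_parallel`), and the parameters `max_domains` and `extra_terms` are all assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy corpus: domain -> documents. The paper builds its corpus
# from a Web directory listing; this data is invented for illustration.
CORPUS = {
    "animals": ["jaguar cat jungle predator", "jaguar cat habitat jungle"],
    "cars": ["jaguar car engine luxury", "jaguar car dealer engine"],
    "cooking": ["pasta sauce recipe", "bread oven recipe"],
}


def predict_domains(query_terms, corpus, max_domains=2):
    """Rank domains by how often the query terms occur in their documents.

    Assumption: raw term counts replace the paper's LSI-based prediction.
    """
    scores = {}
    for domain, docs in corpus.items():
        counts = Counter(word for doc in docs for word in doc.split())
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scores[domain] = score
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return ranked[:max_domains]


def expand_query(query_terms, docs, extra_terms=2):
    """Append the terms that co-occur most often with the query in one domain."""
    cooccurrence = Counter()
    for doc in docs:
        words = doc.split()
        if any(term in words for term in query_terms):
            cooccurrence.update(w for w in words if w not in query_terms)
    best = sorted(cooccurrence, key=lambda w: (-cooccurrence[w], w))[:extra_terms]
    return list(query_terms) + best


def expand_in_parallel(query_terms, corpus):
    """Expand the query once per predicted domain, using a thread pool."""
    domains = predict_domains(query_terms, corpus)
    with ThreadPoolExecutor() as pool:
        expanded = pool.map(lambda d: expand_query(query_terms, corpus[d]), domains)
        return dict(zip(domains, expanded))


queries = expand_in_parallel(["jaguar"], CORPUS)
for domain, terms in queries.items():
    # Each expanded query would then be sent to a general-purpose search engine.
    print(domain, "->", " ".join(terms))
```

In this sketch the ambiguous query "jaguar" yields one specialized query per predicted domain, so the results returned for each query could be shown to the user under that domain's heading.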