{"title":"Towards automatic generation of query taxonomy: a hierarchical query clustering approach","authors":"Shui-Lung Chuang, Lee-Feng Chien","doi":"10.1109/ICDM.2002.1183888","DOIUrl":null,"url":null,"abstract":"Most previous work on automatic query clustering generated a flat, un-nested partition of query terms. In this work, we discuss the organization of query terms into a hierarchical structure and construct a query taxonomy in an automatic way. The proposed approach is designed based on a hierarchical agglomerative clustering algorithm to hierarchically group similar queries and generate cluster hierarchies using a novel cluster partition technique. The search processes of real-world search engines are combined to obtain highly ranked Web documents as the feature source for each query term. Preliminary experiments show that the proposed approach is effective for obtaining thesaurus information for query terms, and is also feasible for constructing a query taxonomy which provides a basis for in-depth analysis of users' search interests and domain-specific vocabulary on a larger scale.","PeriodicalId":405340,"journal":{"name":"2002 IEEE International Conference on Data Mining, 2002. Proceedings.","volume":"51 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2002-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"75","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2002 IEEE International Conference on Data Mining, 2002. Proceedings.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM.2002.1183888","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 75
Abstract
Most previous work on automatic query clustering generated a flat, un-nested partition of query terms. In this work, we discuss the organization of query terms into a hierarchical structure and construct a query taxonomy in an automatic way. The proposed approach is designed based on a hierarchical agglomerative clustering algorithm to hierarchically group similar queries and generate cluster hierarchies using a novel cluster partition technique. The search processes of real-world search engines are combined to obtain highly ranked Web documents as the feature source for each query term. Preliminary experiments show that the proposed approach is effective for obtaining thesaurus information for query terms, and is also feasible for constructing a query taxonomy which provides a basis for in-depth analysis of users' search interests and domain-specific vocabulary on a larger scale.