An Incremental Algorithm for Clustering Search Results
Yongli Liu, Y. Ouyang, Hao Sheng, Z. Xiong
2008 IEEE International Conference on Signal Image Technology and Internet Based Systems, 2008-11-30
DOI: 10.1109/SITIS.2008.53 (https://doi.org/10.1109/SITIS.2008.53)
Citations: 13
Abstract
When Internet users face massive numbers of search results, document clustering techniques are very helpful. Existing clustering methods generally start with a known set of data objects, measured against a known set of attributes. However, there are numerous applications where the attribute set can only be obtained gradually, as data objects are processed incrementally. This paper presents an incremental clustering algorithm (ICA) for clustering search results, which relies on pair-wise search result similarity calculated using the Jaccard method. We use a measure, namely cluster average similarity area, to score cluster cohesiveness. Experimental results show that our algorithm requires less computation time than traditional clustering methods while achieving comparable or better clustering quality.
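To make the idea of incremental clustering over pair-wise Jaccard similarity concrete, here is a minimal sketch. It is not the paper's ICA: the abstract does not define the cluster average similarity area, so this sketch substitutes the plain average Jaccard similarity between a new result and a cluster's members. The function names, the whitespace tokenization, and the `threshold` value are all assumptions made for illustration.

```python
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; taken as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def incremental_cluster(snippets: List[str], threshold: float = 0.3) -> List[List[int]]:
    """Assign each snippet, in arrival order, to its best-matching cluster.

    A snippet joins the cluster whose members it is most similar to on
    average, provided that average reaches `threshold` (an assumed value);
    otherwise it starts a new cluster. Clusters are lists of snippet indices.
    """
    # Assumption: lowercase whitespace tokens stand in for real term extraction.
    token_sets = [set(s.lower().split()) for s in snippets]
    clusters: List[List[int]] = []
    for i, tokens in enumerate(token_sets):
        best_cluster, best_score = None, threshold
        for cluster in clusters:
            # Stand-in cohesiveness score: mean similarity to current members.
            score = sum(jaccard(tokens, token_sets[j]) for j in cluster) / len(cluster)
            if score >= best_score:
                best_cluster, best_score = cluster, score
        if best_cluster is None:
            clusters.append([i])
        else:
            best_cluster.append(i)
    return clusters


results = [
    "python clustering tutorial",
    "python clustering library tutorial",
    "cheap flights to paris",
    "cheap flights paris deals",
]
clusters = incremental_cluster(results, threshold=0.3)
print(clusters)  # [[0, 1], [2, 3]]
```

Each result is processed once as it arrives, and earlier assignments are never revisited, which is what distinguishes this style from batch methods that need the full data set up front. The trade-off is that the outcome can depend on arrival order.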