Accelerating the Divisive Information-Theoretic Clustering of Visual Words

Jianjia Zhang, Lei Wang, Lingqiao Liu, Luping Zhou, W. Li
Published in: 2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA)
Publication date: 2013-12-23
DOI: 10.1109/DICTA.2013.6691476 (https://doi.org/10.1109/DICTA.2013.6691476)

Abstract

Word clustering is an effective approach in the bag-of-words model for reducing the dimensionality of high-dimensional features. In recent years, the bag-of-words model has been successfully introduced into visual recognition and developed significantly. To adequately model complex and diverse visual patterns, a large number of visual words is often used, especially in state-of-the-art visual recognition methods. As a result, existing word clustering algorithms are no longer computationally efficient enough. They can considerably prolong processes such as model updating and parameter tuning, in which word clustering must be applied repeatedly. In this paper, we focus on divisive information-theoretic clustering, one of the most efficient word clustering algorithms in text analysis, and accelerate it to better handle a large number of visual words. We discuss the properties of its cluster membership evaluation function, the KL-divergence, in both the binary and multi-class classification cases, and develop accelerated versions in two different ways. Theoretical analysis shows that the proposed accelerated divisive information-theoretic clustering algorithm can handle a large number of visual words much more efficiently. As demonstrated on benchmark datasets in visual recognition, it can achieve a speed-up of hundreds of times while closely maintaining the clustering performance of the original algorithm.
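For orientation, the role of the KL-divergence as a cluster membership evaluation function can be sketched in code. The block below is a minimal illustration of the unaccelerated, baseline assignment loop as it is commonly described for divisive information-theoretic clustering in text analysis. It is not the accelerated method of this paper, which the abstract does not specify. All names (`kl_divergence`, `cluster_distribution`, `ditc`), the caller-supplied initial assignment, and the uniform placeholder for empty clusters are assumptions made for the example.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete class distributions."""
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0)


def cluster_distribution(words, priors, members):
    """Prior-weighted average of p(class | word) over one cluster's words."""
    total = sum(priors[i] for i in members)
    n_classes = len(words[0])
    return [sum(priors[i] * words[i][c] for i in members) / total
            for c in range(n_classes)]


def ditc(words, priors, assignment, n_clusters, n_iters=20):
    """Baseline loop: reassign each word to its KL-nearest cluster.

    words[i]      -- p(class | word i), a list summing to 1
    priors[i]     -- p(word i)
    assignment[i] -- initial cluster index of word i (assumed given)
    """
    assignment = list(assignment)
    n_classes = len(words[0])
    for _ in range(n_iters):
        centers = []
        for j in range(n_clusters):
            members = [i for i, a in enumerate(assignment) if a == j]
            if members:
                centers.append(cluster_distribution(words, priors, members))
            else:
                # Assumption: an empty cluster falls back to a uniform
                # distribution so that it can still attract words.
                centers.append([1.0 / n_classes] * n_classes)
        # Membership evaluation: one KL-divergence per (word, cluster) pair.
        new_assignment = [
            min(range(n_clusters),
                key=lambda j, w=w: kl_divergence(w, centers[j]))
            for w in words
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
    return assignment
```

Each pass evaluates one KL-divergence for every word and cluster pair, which is the cost that grows with the number of visual words and that the paper sets out to reduce.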