Spatial codebooks for image categorization

Eugene Mbanya, S. Gerke, P. Ndjiki-Nya
Published in: Proceedings of the 1st ACM International Conference on Multimedia Retrieval
Publication date: 2011-04-18
DOI: https://doi.org/10.1145/1991996.1992046
Citations: 9

Abstract

Bag-of-words approaches to image categorization are currently very popular because of their relative simplicity, robustness, and high efficiency. However, they cannot represent the spatial composition of an image. Several approaches address this drawback, with spatial pyramids being the most popular. Spatial pyramids divide an image into smaller blocks and compute a feature vector for each block. The block feature vectors are concatenated to form the feature vector of the whole image, which increases its dimension by a factor equal to the number of blocks. Consequently, computation time increases in proportion to the number of blocks. We propose extending the image feature vector with spatial features, which yields a descriptor of similar size to that of the standard bag-of-words approach. The classification performance, however, is similar to that of spatial pyramids, which use a significantly larger feature vector and are therefore more computationally expensive.
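The dimensionality argument in the abstract can be illustrated with a small sketch. This is not the authors' implementation and it does not show their proposed spatial features, which the abstract does not specify. It only compares a standard bag-of-words histogram with a single-level, block-wise concatenation in the style of a spatial pyramid. The function names, the 2 x 2 grid, the vocabulary size of 100, and the synthetic keypoints are all assumptions made for illustration.

```python
import numpy as np

def bow_histogram(word_ids, vocab_size):
    """L1-normalized histogram of visual-word indices (standard bag-of-words)."""
    hist = np.bincount(word_ids, minlength=vocab_size).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist

def spatial_pyramid_level(word_ids, xs, ys, width, height, vocab_size, grid):
    """Concatenate one bag-of-words histogram per block of a grid x grid partition."""
    # Map each keypoint to a block column and row; clip so that points on
    # the far image edge still fall into the last block.
    cols = np.minimum((xs * grid / width).astype(int), grid - 1)
    rows = np.minimum((ys * grid / height).astype(int), grid - 1)
    block_ids = rows * grid + cols
    parts = [
        bow_histogram(word_ids[block_ids == b], vocab_size)
        for b in range(grid * grid)
    ]
    return np.concatenate(parts)

# Synthetic stand-in data (assumption): random visual words at random positions.
rng = np.random.default_rng(0)
vocab_size, n_points, width, height = 100, 500, 640, 480
word_ids = rng.integers(0, vocab_size, size=n_points)
xs = rng.uniform(0, width, size=n_points)
ys = rng.uniform(0, height, size=n_points)

bow = bow_histogram(word_ids, vocab_size)
pyramid = spatial_pyramid_level(word_ids, xs, ys, width, height, vocab_size, grid=2)
print(bow.shape, pyramid.shape)  # (100,) (400,)
```

With four blocks the concatenated vector is four times as long as the plain histogram, which is the growth factor the abstract describes. A full spatial pyramid usually combines several grid resolutions, so its descriptor would be larger still.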