视觉分类的马氏编码

Q1 Computer Science
Tomoki Matsuzawa, Raissa Relator, Wataru Takei, S. Omachi, Tsuyoshi Kato
{"title":"视觉分类的马氏编码","authors":"Tomoki Matsuzawa, Raissa Relator, Wataru Takei, S. Omachi, Tsuyoshi Kato","doi":"10.2197/ipsjtcva.7.69","DOIUrl":null,"url":null,"abstract":"Nowadays, the design of the representation of images is one of the most crucial factors in the performance of visual categorization. A common pipeline employed in most of recent researches for obtaining an image representa- tion consists of two steps: the encoding step and the pooling step. In this paper, we introduce the Mahalanobis metric to the two popular image patch encoding modules, Histogram Encoding and Fisher Encoding, that are used for Bag- of-Visual-Word method and Fisher Vector method, respectively. Moreover, for the proposed Fisher Vector method, a close-form approximation of Fisher Vector can be derived with the same assumption used in the original Fisher Vector, and the codebook is built without resorting to time-consuming EM (Expectation-Maximization) steps. Experimental evaluation of multi-class classification demonstrates the effectiveness of the proposed encoding methods.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"7 1","pages":"69-73"},"PeriodicalIF":0.0000,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Mahalanobis Encodings for Visual Categorization\",\"authors\":\"Tomoki Matsuzawa, Raissa Relator, Wataru Takei, S. Omachi, Tsuyoshi Kato\",\"doi\":\"10.2197/ipsjtcva.7.69\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Nowadays, the design of the representation of images is one of the most crucial factors in the performance of visual categorization. A common pipeline employed in most of recent researches for obtaining an image representa- tion consists of two steps: the encoding step and the pooling step. In this paper, we introduce the Mahalanobis metric to the two popular image patch encoding modules, Histogram Encoding and Fisher Encoding, that are used for Bag- of-Visual-Word method and Fisher Vector method, respectively. Moreover, for the proposed Fisher Vector method, a close-form approximation of Fisher Vector can be derived with the same assumption used in the original Fisher Vector, and the codebook is built without resorting to time-consuming EM (Expectation-Maximization) steps. Experimental evaluation of multi-class classification demonstrates the effectiveness of the proposed encoding methods.\",\"PeriodicalId\":38957,\"journal\":{\"name\":\"IPSJ Transactions on Computer Vision and Applications\",\"volume\":\"7 1\",\"pages\":\"69-73\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IPSJ Transactions on Computer Vision and Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.2197/ipsjtcva.7.69\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"Computer Science\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IPSJ Transactions on Computer Vision and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2197/ipsjtcva.7.69","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"Computer Science","Score":null,"Total":0}
引用次数: 3

摘要

目前,图像表征的设计是影响视觉分类效果的关键因素之一。在最近的研究中,用于获取图像表示的常见管道包括两个步骤:编码步骤和池化步骤。本文将Mahalanobis度量引入到两种流行的图像patch编码模块中,即直方图编码和Fisher编码,这两种编码模块分别用于Bag- of- visual word法和Fisher矢量法。此外,对于所提出的Fisher向量方法,可以使用与原始Fisher向量相同的假设推导出Fisher向量的近似形式,并且无需使用耗时的EM(期望最大化)步骤构建码本。多类分类的实验评价证明了所提编码方法的有效性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Mahalanobis Encodings for Visual Categorization
Nowadays, the design of the representation of images is one of the most crucial factors in the performance of visual categorization. A common pipeline employed in most of recent researches for obtaining an image representa- tion consists of two steps: the encoding step and the pooling step. In this paper, we introduce the Mahalanobis metric to the two popular image patch encoding modules, Histogram Encoding and Fisher Encoding, that are used for Bag- of-Visual-Word method and Fisher Vector method, respectively. Moreover, for the proposed Fisher Vector method, a close-form approximation of Fisher Vector can be derived with the same assumption used in the original Fisher Vector, and the codebook is built without resorting to time-consuming EM (Expectation-Maximization) steps. Experimental evaluation of multi-class classification demonstrates the effectiveness of the proposed encoding methods.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
IPSJ Transactions on Computer Vision and Applications
IPSJ Transactions on Computer Vision and Applications Computer Science-Computer Vision and Pattern Recognition
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信