Region Classification with Markov Field Aspect Models

2007 IEEE Conference on Computer Vision and Pattern Recognition. Pub Date: 2007-06-17. DOI: 10.1109/CVPR.2007.383098

J. Verbeek, B. Triggs


Citations: 248

Abstract




Considerable advances have been made in learning to recognize and localize visual object classes. Simple bag-of-feature approaches label each pixel or patch independently. More advanced models attempt to improve the coherence of the labellings by introducing some form of inter-patch coupling: traditional spatial models such as MRFs provide crisper local labellings by exploiting neighbourhood-level couplings, while aspect models such as PLSA and LDA use global relevance estimates (global mixing proportions for the classes appearing in the image) to shape the local choices. We point out that the two approaches are complementary, combining them to produce aspect-based spatial field models that outperform both approaches. We study two spatial models: one based on averaging over forests of minimal spanning trees linking neighboring image regions, the other on an efficient chain-based Expectation Propagation method for regular 8-neighbor Markov random fields. The models can be trained using either patch-level labels or image-level keywords. As input features, they use factored observation models combining texture, color and position cues. Experimental results on the MSR Cambridge data sets show that combining spatial and aspect models significantly improves the region-level classification accuracy. In fact, our models trained with image-level labels outperform PLSA trained with pixel-level ones.
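
The abstract's remark that aspect models use global mixing proportions to shape local choices can be illustrated with a small sketch. This is not the authors' implementation. It assumes the per-patch class likelihoods are already given (the array `lik` is made-up toy data), estimates only the image-level mixing proportions by EM in the style of PLSA, and leaves out the spatial (MRF) coupling entirely. The function name `infer_mixing_proportions` is hypothetical.

```python
import numpy as np


def infer_mixing_proportions(lik, n_iter=50):
    """EM for image-level class mixing proportions (PLSA-style sketch).

    lik[i, k] is the assumed likelihood of patch i's features under class k.
    Returns (theta, post): theta has shape (n_classes,) and post has shape
    (n_patches, n_classes) with per-patch class posteriors.
    """
    n_patches, n_classes = lik.shape
    theta = np.full(n_classes, 1.0 / n_classes)  # start from uniform proportions
    for _ in range(n_iter):
        # E-step: posterior over classes for each patch, weighted by theta.
        post = lik * theta
        post /= post.sum(axis=1, keepdims=True)
        # M-step: mixing proportions are the average patch posterior.
        theta = post.mean(axis=0)
    # Final posteriors under the estimated proportions.
    post = lik * theta
    post /= post.sum(axis=1, keepdims=True)
    return theta, post


# Toy data (assumed): three patches clearly favour class 0,
# one ambiguous patch slightly favours class 1.
lik = np.array([
    [0.90, 0.10],
    [0.90, 0.10],
    [0.90, 0.10],
    [0.45, 0.55],
])

independent_labels = lik.argmax(axis=1)   # each patch labelled on its own
theta, post = infer_mixing_proportions(lik)
aspect_labels = post.argmax(axis=1)       # labels shaped by global proportions
```

In this toy case, independent labelling assigns the ambiguous patch to class 1, while the estimated image-level proportions favour class 0 strongly enough to pull that patch to class 0. The spatial models described in the abstract would add neighbourhood-level coupling on top of this global effect.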
