TEX-Nets: Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition

Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval. Pub Date: 2017-06-06. DOI: 10.1145/3078971.3079001

R. Anwer, F. Khan, Joost van de Weijer, Jorma T. Laaksonen


Citations: 9

Abstract

Recognizing materials and textures in realistic imaging conditions is a challenging computer vision problem. For many years, orderless representations based on local features were a dominant approach for texture recognition. Recently, deep local features extracted from the intermediate layers of a Convolutional Neural Network (CNN) have been used as filter banks. These dense local descriptors from a deep model, when encoded with Fisher Vectors, have been shown to provide excellent results for texture recognition. The CNN models employed in such approaches take RGB patches as input and are trained on a large number of labeled images. We show that CNN models, which we call TEX-Nets, trained using mapped coded images with explicit texture information provide complementary information to the standard deep models trained on RGB patches. We further investigate two deep architectures, namely early and late fusion, to combine the texture and color information. Experiments on benchmark texture datasets clearly demonstrate that TEX-Nets provide complementary information to the standard RGB deep network. Our approach provides large accuracy gains of 4.8%, 3.5%, 2.6% and 4.1% on the DTD, KTH-TIPS-2a, KTH-TIPS-2b and Texture-10 datasets, respectively, compared to the standard RGB network of the same architecture. Further, our final combination leads to consistent improvements over the state of the art on all four datasets.
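The ideas named in the abstract (binary-pattern coded images, early fusion, late fusion) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not specify how the coded images are produced or mapped, so the sketch assumes a plain 8-neighbour local binary pattern (LBP) and omits any mapping step. The function names `lbp_codes`, `early_fusion_input` and `late_fusion_features` are hypothetical, and the CNN streams themselves are left out.

```python
import numpy as np


def lbp_codes(gray):
    """Basic 8-neighbour local binary pattern code for each interior pixel.

    Assumption: plain LBP with a 3x3 neighbourhood; the abstract does not
    state which binary-pattern variant or mapping the paper uses.
    """
    g = np.asarray(gray, dtype=np.float64)
    h, w = g.shape
    center = g[1:-1, 1:-1]
    # Neighbour offsets (dy, dx), clockwise starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = g[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes


def early_fusion_input(rgb, texture):
    """Early fusion (hypothetical helper): stack colour and texture channels
    into a single multi-channel input for one network."""
    rgb = np.asarray(rgb, dtype=np.float64)[1:-1, 1:-1, :]  # crop to the LBP map size
    tex = np.asarray(texture, dtype=np.float64)[..., None]
    return np.concatenate([rgb, tex], axis=-1)


def late_fusion_features(rgb_features, texture_features):
    """Late fusion (hypothetical helper): join the feature vectors produced
    by two separately evaluated streams."""
    return np.concatenate([np.asarray(rgb_features), np.asarray(texture_features)])
```

In this sketch, early fusion merges color and texture before any network sees them, while late fusion keeps two streams apart and merges only their output features. That is the general distinction between the two terms; the exact layers at which the paper fuses are not given in the abstract.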

