Bilinear CNN Models for Food Recognition

2017 International Conference on Digital Image Computing: Techniques and Applications (DICTA). Pub Date: 2017-11-01. DOI: 10.1109/DICTA.2017.8227411

Hesen Chen, Jingyu Wang, Q. Qi, Yujian Li, Haifeng Sun

Citations: 12

Abstract

Because food types are highly diverse and different dishes often differ only slightly, classifying food images has become a new challenge in computer vision. Recent efforts to tackle this problem have focused on designing hand-crafted features or on extracting features automatically with deep convolutional neural networks. Although these methods have reported a series of successes, their general-purpose architectures fail to adequately capture the fine-grained features that distinguish similar dishes. Inspired by bilinear CNN models from fine-grained classification, we adopt a similar structure in which two deep convolutional networks serve as feature extractors and their outputs are fused to obtain fine-grained features. These features are then used to train the food classifier. We conducted experiments on three public benchmark food datasets to evaluate the proposed architecture. The experiments show that our method is comparable to existing approaches.
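The abstract does not say how the two networks' outputs are fused. As a rough illustration only, the sketch below assumes the pooling commonly used in bilinear CNN models for fine-grained classification: an outer product of the two feature maps at every spatial location, averaged over locations, followed by a signed square root and L2 normalisation. The function name `bilinear_pool`, the channel counts, and the random arrays standing in for CNN outputs are all hypothetical and not taken from the paper.

```python
import numpy as np


def bilinear_pool(feat_a, feat_b, eps=1e-12):
    """Fuse two feature maps of shape (channels, height, width).

    Assumed fusion: average of per-location outer products,
    then signed square root and L2 normalisation.
    """
    c_a, h, w = feat_a.shape
    c_b = feat_b.shape[0]
    if feat_b.shape[1:] != (h, w):
        raise ValueError("both feature maps must share the same spatial size")

    # Flatten the spatial grid so each column is one image location.
    a = feat_a.reshape(c_a, h * w)
    b = feat_b.reshape(c_b, h * w)

    # a @ b.T sums the outer products a_i b_i^T over all locations i.
    fused = (a @ b.T) / (h * w)
    fused = fused.reshape(-1)

    # Signed square root, then scale to unit L2 norm.
    fused = np.sign(fused) * np.sqrt(np.abs(fused))
    return fused / (np.linalg.norm(fused) + eps)


# Random arrays stand in for the outputs of the two feature extractors.
rng = np.random.default_rng(0)
feat_a = rng.standard_normal((8, 4, 4))
feat_b = rng.standard_normal((6, 4, 4))
descriptor = bilinear_pool(feat_a, feat_b)
print(descriptor.shape)  # (48,)
```

The resulting vector has one entry per pair of channels (8 x 6 = 48 here), so it records which features from the two extractors fire together at the same location. A vector of this kind is what would be passed to the food classifier under the assumption above.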



