Pushing the Accuracy of Thai Food Image Classification with Transfer Learning

AlKhawarizmi Engineering Journal, Pub Date: 2022-10-31, DOI: 10.4186/ej.2022.26.10.57

Sirawit Ittisoponpisan, Chayanat Kaipan, Somporn Ruang-On, Rattayagon Thaiphan, Kritaphat Songsri-in



Abstract

Food image classification is a challenging problem whose solution could benefit many real-world applications, such as nutrition and allergy estimation. Most previous studies have tackled the problem with variations of convolutional neural networks. However, because few annotated food image datasets are available, there is still room for improvement, especially in accuracy and speed. In general, neural networks trained for image classification on a small dataset benefit from reusing the weights of networks pre-trained on a large image classification dataset such as ImageNet. In this paper, we compare the trade-offs between training networks from scratch, deploying pre-trained networks as feature extractors, and fine-tuning the networks for Thai food image classification. By applying transfer learning with EfficientNetV1, we achieve higher accuracy for Thai food image classification on the largest publicly available Thai food image dataset, THFOOD-50. In particular, our proposed method improves on the accuracy of the previous state-of-the-art method, from 84.06% to 91.49%, while maintaining prediction speed at 103 ms and 1205 ms on GPU and CPU, respectively.
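The three training regimes compared in the abstract differ in two respects: whether the backbone starts from pre-trained weights, and whether the backbone is updated during training. The following is a minimal, framework-free sketch of that distinction, not the authors' implementation. The layer-group names (`stem`, `block1`, ..., `classifier`), the strategy labels, and the helper functions are hypothetical and chosen only for illustration; they do not reflect the actual EfficientNet layer structure.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical layer groups, for illustration only (not real EfficientNet names).
BACKBONE_LAYERS = ["stem", "block1", "block2", "block3"]
HEAD_LAYER = "classifier"  # new output layer for the target food classes


@dataclass(frozen=True)
class LayerPlan:
    name: str
    init: str        # "random" or "pretrained"
    trainable: bool  # whether gradient updates are applied to this layer


def plan_training(strategy: str) -> List[LayerPlan]:
    """Return a per-layer plan for one of the three regimes."""
    if strategy == "from_scratch":
        # No transfer: every weight starts random and is trained.
        backbone_init, backbone_trainable = "random", True
    elif strategy == "feature_extractor":
        # Pre-trained backbone is frozen; only the new head learns.
        backbone_init, backbone_trainable = "pretrained", False
    elif strategy == "fine_tune":
        # Pre-trained backbone is used as a starting point and also updated.
        backbone_init, backbone_trainable = "pretrained", True
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    plan = [
        LayerPlan(name, backbone_init, backbone_trainable)
        for name in BACKBONE_LAYERS
    ]
    # The head is always new and trainable, because the pre-training
    # label set differs from the target label set.
    plan.append(LayerPlan(HEAD_LAYER, "random", True))
    return plan


def trainable_layers(strategy: str) -> List[str]:
    """Names of the layers that receive gradient updates under a strategy."""
    return [p.name for p in plan_training(strategy) if p.trainable]


if __name__ == "__main__":
    for strategy in ("from_scratch", "feature_extractor", "fine_tune"):
        print(strategy, trainable_layers(strategy))
```

In this sketch, the feature-extractor regime trains the fewest parameters, while fine-tuning trains as many as training from scratch but starts from pre-trained weights. In a real framework the same plan would typically be expressed by loading pre-trained weights and toggling each layer's trainable flag.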
