Pushing the Accuracy of Thai Food Image Classification with Transfer Learning
Authors: Sirawit Ittisoponpisan, Chayanat Kaipan, Somporn Ruang-On, Rattayagon Thaiphan, Kritaphat Songsri-in
DOI: 10.4186/ej.2022.26.10.57 (https://doi.org/10.4186/ej.2022.26.10.57)
Journal: AlKhawarizmi Engineering Journal, volume 17 1
Publication type: Journal Article
Published: 2022-10-31
Citation count: 3
Abstract
Food image classification is a challenging problem whose solution would benefit many real-world applications, such as nutrition and allergy estimation. Most previous studies have proposed variations of convolutional neural networks to tackle it. However, because annotated food image datasets are limited in number, there is still room for improvement, especially in accuracy and speed. In general, neural networks trained for image classification on a small dataset benefit from reusing the weights of networks pre-trained on a large image classification dataset such as ImageNet. In this paper, we compare the trade-offs between training networks from scratch, deploying pre-trained networks as feature extractors, and fine-tuning the networks for Thai food image classification. By applying transfer learning with EfficientNetV1, we achieve higher accuracy on THFOOD-50, the largest publicly available Thai food image dataset. In particular, our proposed method improves on the accuracy of the previous state-of-the-art method, from 84.06% to 91.49%, while maintaining prediction times of 103 ms on GPU and 1205 ms on CPU.
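The three training regimes compared in the abstract differ in two choices: whether the backbone starts from pre-trained weights, and whether the optimiser is allowed to update it. The sketch below is a framework-agnostic illustration of that distinction only, not the authors' implementation. The names `ParamGroup`, `configure`, and `trainable_count` are hypothetical, and the sizes are illustrative assumptions: a 4,000,000-weight backbone, a 1280-wide feature vector, and 50 output classes (inferred from the dataset name THFOOD-50).

```python
from dataclasses import dataclass


@dataclass
class ParamGroup:
    """A named block of weights with its initialisation and update policy."""
    name: str
    num_params: int
    pretrained: bool  # True: start from weights learned on a large dataset
    trainable: bool   # True: the optimiser updates these weights


def configure(regime: str, backbone_params: int, head_params: int) -> list[ParamGroup]:
    """Return the parameter groups for one of the three training regimes."""
    policies = {
        # regime: (backbone pretrained?, backbone trainable?)
        "scratch": (False, True),
        "feature_extractor": (True, False),
        "fine_tune": (True, True),
    }
    if regime not in policies:
        raise ValueError(f"unknown regime: {regime!r}")
    pretrained, trainable = policies[regime]
    return [
        ParamGroup("backbone", backbone_params, pretrained, trainable),
        # The classification head is always new and always trained,
        # because the target classes differ from the pre-training classes.
        ParamGroup("head", head_params, pretrained=False, trainable=True),
    ]


def trainable_count(groups: list[ParamGroup]) -> int:
    """Number of weights the optimiser would update."""
    return sum(g.num_params for g in groups if g.trainable)


# Illustrative sizes only (assumptions, not taken from the paper):
# a 4,000,000-weight backbone and a linear head (weights plus biases)
# mapping 1280 features to 50 classes.
BACKBONE = 4_000_000
HEAD = 1280 * 50 + 50

for regime in ("scratch", "feature_extractor", "fine_tune"):
    groups = configure(regime, BACKBONE, HEAD)
    print(f"{regime:>17}: {trainable_count(groups):,} trainable weights")
```

Under these assumed sizes, the feature-extractor regime trains only the small head, which is why it is usually the cheapest to train, while fine-tuning updates as many weights as training from scratch but starts from a pre-trained initialisation.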