Food Recognition: Can Deep Learning or Bag-of-Words Match Humans?
Author: P. Furtado
Journal: Bioimaging (Bristol. Print)
Publication date: 2020-04-18
DOI: https://doi.org/10.5220/0008893301020108
Citation count: 2
Abstract
Automated smartphone-based food recognition is a useful basis for dietary assessment applications, and dish recognition is a necessary step in that process. Two possible approaches are deep learning-based recognition and bag-of-words-based classification. Deep learning has increasingly become the preferred approach for this and other image classification tasks. However, if humans are better at recognizing the dish, the automated approach is useless: it is less error-prone for the user to identify the dish than to capture a photo. We compare three alternatives: Deep Learning (DL), Bag-of-Words (BoW), and Humans (H). The best deep learner beats humans when there are few food categories, but loses when it has to learn many more food categories, which is what is expected in real contexts. We describe the approaches, analyze the results, draw conclusions, and outline further work to evaluate and improve the approaches.