Automated estimation of food type and amount consumed from body-worn audio and motion sensors

Mark Mirtchouk, Christopher A. Merck, Samantha Kleinberg
{"title":"Automated estimation of food type and amount consumed from body-worn audio and motion sensors","authors":"Mark Mirtchouk, Christopher A. Merck, Samantha Kleinberg","doi":"10.1145/2971648.2971677","DOIUrl":null,"url":null,"abstract":"Determining when an individual is eating can be useful for tracking behavior and identifying patterns, but to create nutrition logs automatically or provide real-time feedback to people with chronic disease, we need to identify both what they are consuming and in what quantity. However, food type and amount have mainly been estimated using image data (requiring user involvement) or acoustic sensors (tested with a restricted set of foods rather than representative meals). As a result, there is not yet a highly accurate automated nutrition monitoring method that can be used with a variety of foods. We propose that multi-modal sensing (in-ear audio plus head and wrist motion) can be used to more accurately classify food type, as audio and motion features provide complementary information. Further, we propose that knowing food type is critical for estimating amount consumed in combination with sensor data. To test this we use data from people wearing audio and motion sensors, with ground truth annotated from video and continuous scale data. With data from 40 unique foods we achieve a classification accuracy of 82.7% with a combination of sensors (versus 67.8% for audio alone and 76.2% for head and wrist motion). Weight estimation error was reduced from a baseline of 127.3% to 35.4% absolute relative error. Ultimately, our estimates of food type and amount can be linked to food databases to provide automated calorie estimates from continuously-collected data.","PeriodicalId":303792,"journal":{"name":"Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing","volume":"25 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"94","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2971648.2971677","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 94

Abstract

Determining when an individual is eating can be useful for tracking behavior and identifying patterns, but to create nutrition logs automatically or provide real-time feedback to people with chronic disease, we need to identify both what they are consuming and in what quantity. However, food type and amount have mainly been estimated using image data (requiring user involvement) or acoustic sensors (tested with a restricted set of foods rather than representative meals). As a result, there is not yet a highly accurate automated nutrition monitoring method that can be used with a variety of foods. We propose that multi-modal sensing (in-ear audio plus head and wrist motion) can be used to more accurately classify food type, as audio and motion features provide complementary information. Further, we propose that knowing food type is critical for estimating amount consumed in combination with sensor data. To test this we use data from people wearing audio and motion sensors, with ground truth annotated from video and continuous scale data. With data from 40 unique foods we achieve a classification accuracy of 82.7% with a combination of sensors (versus 67.8% for audio alone and 76.2% for head and wrist motion). Weight estimation error was reduced from a baseline of 127.3% to 35.4% absolute relative error. Ultimately, our estimates of food type and amount can be linked to food databases to provide automated calorie estimates from continuously-collected data.
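To make the two-stage idea in the abstract concrete, the sketch below shows (1) feature-level fusion of audio and motion features for food-type classification and (2) weight regression that conditions on the (predicted) food type, evaluated with the absolute relative error quoted above. This is a minimal illustration, not the authors' implementation: the feature dimensions, random-forest models, one-hot food-type encoding, and synthetic data are all assumptions made for the example; scikit-learn is used only for convenience.

```python
# Sketch of multi-modal classification + type-conditioned amount estimation.
# All data and model choices here are hypothetical stand-ins, not the paper's pipeline.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical per-intake feature vectors: in-ear audio features plus head/wrist motion features.
n_intakes, n_audio, n_motion, n_foods = 500, 20, 12, 40
X_audio = rng.normal(size=(n_intakes, n_audio))
X_motion = rng.normal(size=(n_intakes, n_motion))
y_type = rng.integers(0, n_foods, size=n_intakes)   # food-type label per intake
y_grams = rng.uniform(1.0, 30.0, size=n_intakes)    # ground-truth weight per intake (grams)

# (1) Feature-level fusion: concatenate audio and motion features before classifying.
X_fused = np.hstack([X_audio, X_motion])
Xf_tr, Xf_te, yt_tr, yt_te, yg_tr, yg_te = train_test_split(
    X_fused, y_type, y_grams, test_size=0.3, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(Xf_tr, yt_tr)
type_pred = clf.predict(Xf_te)
print("food-type accuracy: {:.1%}".format((type_pred == yt_te).mean()))

# (2) Amount estimation that uses food type: append a one-hot type encoding to the
# sensor features, training on true labels and testing on predicted labels.
onehot = np.eye(n_foods)
Xr_tr = np.hstack([Xf_tr, onehot[yt_tr]])
Xr_te = np.hstack([Xf_te, onehot[type_pred]])

reg = RandomForestRegressor(n_estimators=200, random_state=0).fit(Xr_tr, yg_tr)
grams_pred = reg.predict(Xr_te)

# Absolute relative error, the metric quoted in the abstract (127.3% baseline vs. 35.4%).
abs_rel_err = np.abs(grams_pred - yg_te) / yg_te
print("mean absolute relative error: {:.1%}".format(abs_rel_err.mean()))
```

On synthetic random data the accuracy and error figures are meaningless; the sketch only illustrates how fused features and a food-type encoding would flow through such a pipeline.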