Prediction of Snacking Behavior Involving Snacks Having High Levels of Saturated Fats, Salt, or Sugar Using Only Information on Previous Instances of Snacking: Survey- and App-Based Study.

IF 3.1 | Region 3 (Medicine) | JCR Q2 (Medical Informatics)

JMIR Medical Informatics Pub Date : 2025-04-23 DOI:10.2196/57530

Shaima Dammas, Tillman Weyde, Katy Tapper, Gerasimos Spanakis, Anne Roefs, Emmanuel M Pothos

JMIR Medical Informatics, volume 13, e57530. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12059507/pdf/


Abstract

Background: Consuming high amounts of foods or beverages with high levels of saturated fats, salt, or sugar (HFSS) can be harmful for health. Many snacks fall into this category (HFSS snacks). However, the palatability of these snacks means that people can sometimes struggle to reduce their intake. Machine learning algorithms could help in predicting the likely occurrence of HFSS snacking so that just-in-time adaptive interventions can be deployed. However, HFSS snacking data have certain characteristics, such as sparseness and incompleteness, which make snacking prediction a challenge for machine learning approaches. Previous attempts have employed several potential predictor variables and have achieved considerable success. Nevertheless, collecting information from several dimensions requires several potentially burdensome user questionnaires, and thus, this approach may be less acceptable for the general public.

Objective: Our aim was to assess the capacity of standard machine learning algorithms (not modified in any way to tailor them to the specific learning problem) to predict HFSS snacking from minimal data that can be collected in a mostly automated way: day of the week, time of day (divided into time bins), and location (divided into work, home, and other).
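
As a minimal sketch of the three inputs named in the objective, the following Python encodes one snack record as day of the week, time bin, and location category. The 4-hour bin width, the `encode_record` name, and the integer codes for locations are assumptions made for illustration; the abstract does not state the bin widths or the encoding used in the study.

```python
from datetime import datetime

# Location categories named in the abstract; the integer codes are an assumption.
LOCATIONS = {"work": 0, "home": 1, "other": 2}


def encode_record(timestamp: datetime, location: str, bin_hours: int = 4) -> tuple[int, int, int]:
    """Return (day_of_week, time_bin, location_code) for one snack record.

    bin_hours is a hypothetical bin width; it must divide 24 evenly.
    """
    if 24 % bin_hours != 0:
        raise ValueError("bin_hours must divide 24")
    day_of_week = timestamp.weekday()        # Monday = 0 ... Sunday = 6
    time_bin = timestamp.hour // bin_hours   # 0..5 when bin_hours is 4
    return day_of_week, time_bin, LOCATIONS[location]


# Hypothetical record: a snack eaten at work on a Wednesday afternoon.
features = encode_record(datetime(2025, 4, 23, 15, 30), "work")
print(features)
```

Changing `bin_hours` changes the temporal resolution of the time-of-day feature, which is one of the factors the results report as affecting prediction quality.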

Methods: A total of 111 participants in the United Kingdom were asked to record HFSS snacking occurrences and the location category over a period of 28 days, and this was considered the UK dataset. Data collection was facilitated by a purpose-specific app (Snack Tracker). Additionally, a similar dataset from the Netherlands was used (Dutch dataset). Both datasets were analyzed using machine learning methods, including random forest regressor, Extreme Gradient Boosting regressor, feed forward neural network, and long short-term memory. We additionally employed 2 baseline statistical models for prediction. In all cases, the prediction problem was the time to the next HFSS snack from the current one, and the evaluation metric was the mean absolute error.
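
The prediction target and evaluation metric described in the methods can be sketched as follows. This is an illustration only: the timestamps are made up, the function names are hypothetical, and the mean-gap baseline is an assumed stand-in, since the abstract does not specify the two baseline models.

```python
from datetime import datetime
from statistics import mean


def gaps_in_minutes(timestamps: list[datetime]) -> list[float]:
    """Time from each snack to the next one, in minutes (the regression target)."""
    ordered = sorted(timestamps)
    return [(b - a).total_seconds() / 60 for a, b in zip(ordered, ordered[1:])]


def mean_absolute_error(actual: list[float], predicted: list[float]) -> float:
    """Average of |actual - predicted| over all predictions."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Made-up snack log for one participant (not study data).
log = [
    datetime(2025, 1, 6, 10, 0),
    datetime(2025, 1, 6, 15, 0),
    datetime(2025, 1, 6, 21, 0),
    datetime(2025, 1, 7, 10, 0),
    datetime(2025, 1, 7, 16, 0),
]
gaps = gaps_in_minutes(log)        # four gaps from five snacks
train, test = gaps[:3], gaps[3:]   # chronological split, earliest gaps first

# Assumed baseline: always predict the mean of the training gaps.
baseline_prediction = mean(train)
mae = mean_absolute_error(test, [baseline_prediction] * len(test))
print(f"baseline MAE: {mae:.1f} minutes")
```

Any of the regressors named in the methods would take the place of `baseline_prediction` here, with the same mean absolute error computed on held-out gaps.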

Results: The ability of machine learning methods to predict the time of the next HFSS snack was assessed. The quality of the prediction depended on the dataset, temporal resolution, and machine learning algorithm employed. In some cases, the mean absolute error was as low as 17 minutes. In general, machine learning methods outperformed the baseline models, but no machine learning method was clearly better than the others. The feed forward neural network showed a very marginal advantage.

Conclusions: The prediction of HFSS snacking using sparse data is possible with reasonable accuracy. Our findings offer a foundation for further exploring how machine learning methods can be used in health psychology and provide directions for further research.


Source journal

JMIR Medical Informatics (Medicine: Health Informatics)

CiteScore: 7.90

Self-citation rate: 3.10%

Articles published: 173

Review time: 12 weeks

About the journal: JMIR Medical Informatics (JMI, ISSN 2291-9694) is a top-rated, tier A journal that focuses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, and eHealth infrastructures and implementation. It focuses on applied, translational research and has a broad readership that includes clinicians, CIOs, engineers, and industry and health informatics professionals. It is published by JMIR Publications, publisher of the Journal of Medical Internet Research (JMIR), the leading eHealth/mHealth journal (Impact Factor 2016: 5.175). JMIR Med Inform has a slightly different scope (placing more emphasis on applications for clinicians and health professionals than on consumers and citizens, who are the focus of JMIR), publishes even faster, and also accepts papers that are more technical or more formative than those published in the Journal of Medical Internet Research.