Prediction of Snacking Behavior Involving Snacks Having High Levels of Saturated Fats, Salt, or Sugar Using Only Information on Previous Instances of Snacking: Survey- and App-Based Study.
Shaima Dammas, Tillman Weyde, Katy Tapper, Gerasimos Spanakis, Anne Roefs, Emmanuel M Pothos
JMIR Medical Informatics, volume 13, e57530. Published 2025-04-23. DOI: 10.2196/57530 (https://doi.org/10.2196/57530). Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12059507/pdf/
Citations: 0
Abstract
Background: Consuming large amounts of foods or beverages with high levels of saturated fats, salt, or sugar (HFSS) can be harmful to health. Many snacks fall into this category (HFSS snacks). However, because these snacks are highly palatable, people can struggle to reduce their intake. Machine learning algorithms could help predict when HFSS snacking is likely to occur so that just-in-time adaptive interventions can be deployed. However, HFSS snacking data have certain characteristics, such as sparseness and incompleteness, which make snacking prediction a challenge for machine learning approaches. Previous attempts have employed several potential predictor variables and have achieved considerable success. Nevertheless, collecting information on several dimensions requires several potentially burdensome user questionnaires, so this approach may be less acceptable to the general public.
Objective: Our aim was to assess the capacity of standard machine learning algorithms (that is, algorithms not modified in any way to tailor them to the specific learning problem) to predict HFSS snacking from the following minimal data, which can be collected in a mostly automated way: day of the week, time of day (divided into time bins), and location (divided into work, home, and other).
Methods: A total of 111 participants in the United Kingdom were asked to record HFSS snacking occurrences and the location category over a period of 28 days; these records formed the UK dataset. Data collection was facilitated by a purpose-specific app (Snack Tracker). Additionally, a similar dataset from the Netherlands was used (Dutch dataset). Both datasets were analyzed using machine learning methods, including a random forest regressor, an Extreme Gradient Boosting regressor, a feedforward neural network, and a long short-term memory network. We additionally employed 2 baseline statistical models for prediction. In all cases, the prediction target was the time from the current HFSS snack to the next one, and the evaluation metric was the mean absolute error.
Results: The ability of machine learning methods to predict the time of the next HFSS snack was assessed. The quality of the prediction depended on the dataset, the temporal resolution, and the machine learning algorithm employed. In some cases, predictions were accurate to within 17 minutes on average. In general, machine learning methods outperformed the baseline models, but no machine learning method was clearly better than the others. The feedforward neural network showed a very marginal advantage.
Conclusions: The prediction of HFSS snacking using sparse data is possible with reasonable accuracy. Our findings offer a foundation for further exploring how machine learning methods can be used in health psychology and provide directions for further research.
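The prediction setup described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the snack log, the 4-hour bin width, the train/test split, and the mean-gap baseline are all assumptions made for illustration (the abstract does not specify the bin size or which 2 baseline models were used). The sketch only shows how the three minimal features and the time-to-next-snack target could be derived from timestamped records, and how the mean absolute error is computed.

```python
from datetime import datetime
from statistics import mean

# Hypothetical snack log for one participant: (timestamp, location category).
events = [
    (datetime(2024, 1, 1, 10, 0), "work"),
    (datetime(2024, 1, 1, 15, 30), "work"),
    (datetime(2024, 1, 1, 21, 0), "home"),
    (datetime(2024, 1, 2, 11, 0), "other"),
    (datetime(2024, 1, 2, 16, 0), "home"),
]

BIN_HOURS = 4  # assumed time-bin width; the abstract does not state it


def build_examples(log):
    """Pair each snack's features with the minutes until the next snack."""
    examples = []
    for (t, loc), (t_next, _) in zip(log, log[1:]):
        features = {
            "day_of_week": t.weekday(),      # 0 = Monday
            "time_bin": t.hour // BIN_HOURS,  # index of the time-of-day bin
            "location": loc,                  # work, home, or other
        }
        target_minutes = (t_next - t).total_seconds() / 60
        examples.append((features, target_minutes))
    return examples


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return mean(abs(a - b) for a, b in zip(y_true, y_pred))


examples = build_examples(events)
targets = [y for _, y in examples]

# Assumed baseline: always predict the mean gap observed in the training part.
train, test = targets[:3], targets[3:]
baseline = mean(train)
mae = mean_absolute_error(test, [baseline] * len(test))
print(f"baseline prediction: {baseline:.0f} min, MAE: {mae:.0f} min")
```

In a fuller pipeline, the feature dictionaries would be encoded numerically and passed to a regressor, with the same error function used to compare each model against the baseline.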
About the journal:
JMIR Medical Informatics (JMI, ISSN 2291-9694) is a top-rated, tier A journal that focuses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, and eHealth infrastructures and implementation. It focuses on applied, translational research and has a broad readership that includes clinicians, CIOs, engineers, and industry and health informatics professionals.
Published by JMIR Publications, publisher of the Journal of Medical Internet Research (JMIR), the leading eHealth/mHealth journal (Impact Factor 2016: 5.175), JMIR Med Inform has a slightly different scope (placing more emphasis on applications for clinicians and health professionals than on consumers/citizens, who are the focus of JMIR), publishes even faster, and also accepts papers that are more technical or more formative than those published in the Journal of Medical Internet Research.