Gavin Long, Georgiana Nica-Avram, John Harvey, Evgeniya Lukinova, Roberto Mansilla, Simon Welham, Gregor Engelmann, Elizabeth Dolan, Kuzivakwashe Makokoro, Michelle Thomas, Edward Powell, James Goulding
Food Policy, Volume 131, Article 102826. Published 2025-02-01. DOI: 10.1016/j.foodpol.2025.102826
https://www.sciencedirect.com/science/article/pii/S0306919225000302
Impact factor: 6.8. JCR: Q1 (Agricultural Economics & Policy).
Citations: 0
Machine learning on national shopping data reliably estimates childhood obesity prevalence and socio-economic deprivation
Abstract
Deprivation pushes people to choose cheap, calorie-dense foods instead of nutritious but expensive alternatives. Diseases, such as obesity, cardiovascular disease, and diabetes, resulting from these poor dietary choices place a significant burden on public health systems. Measuring nutritional insecurity is difficult to achieve at scale and so the ability to study the relationship between nutritional outcomes and deprivation at a national level is very challenging. This makes it difficult to understand the effect of new policies or track changes over time. To address this challenge, we develop a machine learning approach using massive anonymised transactional data (4 million members and 2.5 billion transactions) in partnership with the retailer The Co-operative Group UK. We engineer a series of variables related to obesogenic diets, including a new measure called ‘Calorie-oriented purchasing’. These variables help illustrate how large-scale transactional data can discriminate between neighbourhoods most affected by deprivation and childhood obesity. Through comparative assessment of machine learning approaches, we find better performance from tree-based models (Random Forest, XGBoost) with the best-achieving accuracy of 0.88 for predicting deprivation and an accuracy of 0.79 for childhood obesity. Calorie-oriented purchasing emerges as a robust predictor of deprivation and childhood obesity at the census area level. Results show this approach can help summarise nutritional insecurity, and support its spatio-temporal monitoring. We conclude with policy implications and recommend retailers adopt new measures for measuring national nutrition insecurity.
About the journal:
Food Policy is a multidisciplinary journal publishing original research and novel evidence on issues in the formulation, implementation, and evaluation of policies for the food sector in developing, transition, and advanced economies.
Our main focus is on the economic and social aspect of food policy, and we prioritize empirical studies informing international food policy debates. Provided that articles make a clear and explicit contribution to food policy debates of international interest, we consider papers from any of the social sciences. Papers from other disciplines (e.g., law) will be considered only if they provide a key policy contribution, and are written in a style which is accessible to a social science readership.