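Linking physical food outlets to online platforms: A cross-sectional machine learning approach to analysing socioeconomic variations in Great Britain.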

Health & Place, volume 95, article 103524 (Journal Article). Impact factor: 4.1
Pub Date: 2025-09-01. Epub Date: 2025-08-07. DOI: 10.1016/j.healthplace.2025.103524
Jody C Hoenink, Yuru Huang, Jean Adams
Citations: 0

Abstract


Objectives: Physical food outlets are increasingly offering delivery through Online Food Delivery Service (OFDS) platforms, but the scale of this expansion remains unclear due to the labour-intensive process of manually matching outlets to online platforms. Understanding the share of outlets offering delivery is important, as it impacts food availability and thus potentially influences dietary behaviours. This paper demonstrates how a machine learning model can efficiently match physical to online outlets. We also analysed how the proportion of physical outlets listed online and online-only outlets varies by area-level deprivation.

Methods: The physical locations of outlets selling food in Great Britain were obtained from a centrally held register of food hygiene data, while online outlet data were collected by web scraping an OFDS platform. We calculated string distances based on outlet names and postcodes, which were then used to train a Random Forest model to match outlets from the two lists. Area-level deprivation was assessed using the Index of Multiple Deprivation.
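As a rough illustration of the feature step described in the Methods, the sketch below computes normalised string similarities for outlet names and postcodes. It is not the authors' implementation: the abstract does not say which string distance was used, so Levenshtein edit distance is an assumption here, and the function names, record layout, and example outlets are all invented for illustration. The resulting feature vectors are the kind of input a Random Forest classifier could be trained on; the training step itself is not shown.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalise(text: str) -> str:
    """Lower-case the text and keep only letters and digits."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical after normalisation."""
    a, b = normalise(a), normalise(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def pair_features(physical: dict, online: dict) -> list[float]:
    """Feature vector for one candidate (register, platform) pair."""
    return [
        similarity(physical["name"], online["name"]),
        similarity(physical["postcode"], online["postcode"]),
    ]


# Hypothetical records, invented for this example only.
register_row = {"name": "Luigi's Pizza Ltd", "postcode": "CB2 1TN"}
platform_row = {"name": "Luigis Pizza", "postcode": "cb21tn"}
other_row = {"name": "Golden Dragon", "postcode": "LS1 4AP"}

features_match = pair_features(register_row, platform_row)
features_other = pair_features(register_row, other_row)
print(features_match, features_other)
```

Normalising before comparison means that differences in case, spacing, and punctuation do not count as edits, which matters for postcodes that are often written with or without the internal space.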

Results: The Random Forest classifier achieved an F1 score of 90%, a recall of 98%, and a precision of 83%. Overall, the median percentage of physical outlets also listed online was 14% (IQR 0-23), and the median percentage of online-only outlets was also 14% (IQR 0-27). The proportions of physical outlets listed online and of online-only outlets were highest in more deprived areas. For example, compared to the least deprived areas, the most deprived areas were associated with a 6% greater proportion of physical food outlets listed online (95% CI 5%-6%) and a 3% greater proportion of online-only outlets (95% CI 1%-4%).
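As a quick consistency check on the reported classifier metrics, the F1 score can be recomputed from the stated precision and recall using its standard definition as their harmonic mean. The symbols P and R are introduced here for precision and recall and do not appear in the abstract.

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.83 \times 0.98}{0.83 + 0.98}
    = \frac{1.6268}{1.81}
    \approx 0.899
```

This rounds to the reported 90%, so the three figures are mutually consistent.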

Conclusion: This study demonstrates the potential of machine learning techniques to efficiently match physical and online food outlets. This automated approach can provide insights into the relationship between physical and online food availability. Researchers and policymakers can use this method to better understand inequalities in food outlet availability and monitor the expansion of online delivery services.
