Linking physical food outlets to online platforms: A cross-sectional machine learning approach to analysing socioeconomic variations in Great Britain
Authors: Jody C Hoenink, Yuru Huang, Jean Adams
Journal: Health & Place, volume 95, article 103524 (journal article; published 2025-09-01, Epub 2025-08-07)
DOI: https://doi.org/10.1016/j.healthplace.2025.103524
Objectives: Physical food outlets are increasingly offering delivery through Online Food Delivery Service (OFDS) platforms, but the scale of this expansion remains unclear because manually matching outlets to online platforms is labour-intensive. Understanding the share of outlets offering delivery is important because it affects food availability and may therefore influence dietary behaviours. This paper demonstrates how a machine learning model can efficiently match physical outlets to online outlets. We also analysed how the proportion of physical outlets listed online and the proportion of online-only outlets vary by area-level deprivation.
Methods: The physical locations of outlets selling food in Great Britain were obtained from a centrally held register of food hygiene data, while online outlet data were collected by web scraping an OFDS platform. We calculated string distances between outlet names and between postcodes, and used these distances to train a Random Forest model to match outlets across the two lists. Area-level deprivation was assessed using the Index of Multiple Deprivation.
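The matching features described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which string distance was used, so Levenshtein edit distance is an assumption here, as are the helper names (`levenshtein`, `normalise`, `similarity`, `pair_features`) and the made-up example records.

```python
from typing import List, Tuple

Outlet = Tuple[str, str]  # (outlet name, postcode); a hypothetical record layout


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalise(text: str) -> str:
    """Lower-case and drop spaces and punctuation before comparing."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1], where 1.0 means identical after normalising."""
    a, b = normalise(a), normalise(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def pair_features(physical: Outlet, online: Outlet) -> List[float]:
    """Feature vector for one candidate (physical, online) pair."""
    return [similarity(physical[0], online[0]),   # name similarity
            similarity(physical[1], online[1])]   # postcode similarity


# Made-up example records, for illustration only.
register_entry = ("Luigi's Pizza", "CB2 1TN")
platform_entry = ("Luigis Pizza - Cambridge", "cb21tn")
print(pair_features(register_entry, platform_entry))
```

Feature vectors like these, paired with hand-labelled match or non-match outcomes, are the kind of input a classifier such as scikit-learn's `RandomForestClassifier` could be trained on. That library choice is also an assumption; the abstract states only that a Random Forest model was used.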
Results: The Random Forest classifier achieved an F1 score of 90%, a recall of 98%, and a precision of 83%. Overall, the median percentage of physical outlets also listed online was 14% (IQR 0-23), and the median percentage of online-only outlets was also 14% (IQR 0-27). The proportion of physical outlets listed online and the proportion of online-only outlets were both highest in more deprived areas. For example, compared to the least deprived areas, the most deprived areas were associated with a 6% greater proportion of physical food outlets listed online (95% CI 5% to 6%) and a 3% greater proportion of online-only outlets (95% CI 1% to 4%).
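As a quick consistency check on the reported classifier metrics, the F1 score is the harmonic mean of precision and recall. The short sketch below (the function name `f1_score` is ours, not taken from the paper) recomputes it from the rounded values given above.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values as reported in the abstract.
reported = f1_score(precision=0.83, recall=0.98)
print(f"{reported:.2f}")  # prints 0.90
```

The recomputed value agrees with the reported F1 score of 90%. The high recall combined with lower precision suggests that the model misses few true matches but produces some false ones.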
Conclusion: This study demonstrates the potential of machine learning techniques to efficiently match physical and online food outlets. This automated approach can provide insights into the relationship between physical and online food availability. Researchers and policymakers can use this method to better understand inequalities in food outlet availability and monitor the expansion of online delivery services.