Machine learning approach as an early warning system to prevent foodborne Salmonella outbreaks in northwestern Italy

IF 3.7 | Zone 1, Agricultural and Forestry Sciences | JCR Q1, Veterinary Sciences
Aitor Garcia-Vozmediano, Cristiana Maurella, Leonardo A. Ceballos, Elisabetta Crescio, Rosa Meo, Walter Martelli, Monica Pitti, Daniela Lombardi, Daniela Meloni, Chiara Pasqualini, Giuseppe Ru
Journal: Veterinary Research. Published: 2024-06-05. DOI: 10.1186/s13567-024-01323-9 (https://doi.org/10.1186/s13567-024-01323-9). Source platform: Semantic Scholar.
Citations: 0

Abstract

Machine learning approach as an early warning system to prevent foodborne Salmonella outbreaks in northwestern Italy
Salmonellosis, one of the most common foodborne infections in Europe, is monitored by food safety surveillance programmes, resulting in the generation of extensive databases. By leveraging tree-based machine learning (ML) algorithms, we exploited data from food safety audits to predict spatiotemporal patterns of salmonellosis in northwestern Italy. Data on human cases confirmed in 2015–2018 (n = 1969) and food surveillance data collected in 2014–2018 were used to develop ML algorithms. We integrated the monthly municipal human incidence with 27 potential predictors, including the observed prevalence of Salmonella in food. We applied the tree regression, random forest and gradient boosting algorithms considering different scenarios and evaluated their predictivity in terms of the mean absolute percentage error (MAPE) and R². Using a similar dataset from the year 2019, spatiotemporal predictions and their relative sensitivities and specificities were obtained. Random forest and gradient boosting (R² = 0.55, MAPE = 7.5%) outperformed the tree regression algorithm (R² = 0.42, MAPE = 8.8%). Salmonella prevalence in food; spatial features; and monitoring efforts in ready-to-eat milk, fruits and vegetables, and pig meat products contributed the most to the models’ predictivity, reducing the variance by 90.5%. Conversely, the number of positive samples obtained for specific food matrices minimally influenced the predictions (2.9%). Spatiotemporal predictions for 2019 showed sensitivity and specificity levels of 46.5% (due to the lack of some infection hotspots) and 78.5%, respectively. This study demonstrates the added value of integrating data from human and veterinary health services to develop predictive models of human salmonellosis occurrence, providing early warnings useful for mitigating foodborne disease impacts on public health.
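The abstract reports model quality as MAPE, R², sensitivity and specificity but does not give the formulas. The sketch below uses the standard textbook definitions in plain Python. The function names and toy numbers are illustrative assumptions, not taken from the study, and the handling of zero-incidence values is likewise an assumption.

```python
from typing import Sequence, Tuple


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Assumption: every actual value is non-zero. The abstract does not
    say how zero-incidence observations were treated.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


def sensitivity_specificity(
    actual_pos: Sequence[bool], predicted_pos: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN), specificity = TN / (TN + FP).
    """
    if len(actual_pos) != len(predicted_pos):
        raise ValueError("inputs must be of equal length")
    pairs = list(zip(actual_pos, predicted_pos))
    tp = sum(1 for a, p in pairs if a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    tn = sum(1 for a, p in pairs if not a and not p)
    fp = sum(1 for a, p in pairs if not a and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy values only (hypothetical, not data from the paper).
toy_actual = [2.0, 4.0, 5.0, 8.0]
toy_predicted = [2.2, 3.6, 5.5, 7.6]
toy_actual_pos = [True, True, False, False, False]
toy_predicted_pos = [True, False, False, True, False]

print("MAPE (%):", mape(toy_actual, toy_predicted))
print("R^2:", r_squared(toy_actual, toy_predicted))
print("sens, spec:", sensitivity_specificity(toy_actual_pos, toy_predicted_pos))
```

Read against the abstract, a lower MAPE and a higher R² indicate better regression fit, while sensitivity and specificity describe how well predicted occurrences match observed ones once predictions are turned into a yes/no outcome. How the study did that conversion is not described in the abstract.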
Source journal: Veterinary Research (Agricultural and Forestry Sciences, Veterinary Science)
CiteScore: 7.00
Self-citation rate: 4.50%
Articles published: 92
Review time: 3 months
Journal description: Veterinary Research is an open access journal that publishes high-quality, novel research and review articles on all aspects of infectious diseases and host-pathogen interaction in animals.