Efficacy of machine learning models for the prediction of death occurrence and counts associated with foodborne illnesses and hospitalizations in the United States
Mohammed Rashad Baker, Selim Buyrukoğlu, Gonca Buyrukoğlu, Juan Moreira, Zeynal Topalcengiz
Microbial Risk Analysis, Volume 30, Article 100351. Published 2025-08-05.
DOI: 10.1016/j.mran.2025.100351
URL: https://www.sciencedirect.com/science/article/pii/S2352352225000118
Citations: 0
Abstract
Foodborne outbreak data released through national surveillance systems provide essential information about the results of investigations. This study evaluates the efficacy of machine learning (ML) models for predicting the occurrence and counts of deaths associated with foodborne illnesses and hospitalizations in the United States. Data on confirmed foodborne outbreaks were obtained from the Centers for Disease Control and Prevention's National Outbreak Reporting System (NORS). Foodborne pathogens causing at least 10 deaths in total were selected for analysis. Models were evaluated by binary classification performance (accuracy, %) and by prediction efficacy for death counts (mean absolute error, MAE). A total of 10,069 foodborne outbreaks with confirmed single etiology resulted in 275,827 illnesses, 18,579 hospitalizations, and 458 deaths. Salmonella was the leading causative agent (54.23 %) of bacterial foodborne outbreaks, followed by pathogenic Escherichia coli (12.13 %). Norovirus (96.69 %) and Cyclospora cayetanensis (60.76 %) represented major causes of viral and protozoan/parasite foodborne outbreaks, respectively. The classification performance of ML models ranged from 88.9 to 94.5 % for the overall prediction of death occurrence associated with foodborne illnesses and hospitalizations. For death counts, the MAE of the ML models remained below 0.9, except for Listeria monocytogenes, which had an average MAE of 134.1 ± 11.1. This study indicates the potential use and performance of ML algorithms for predicting death occurrence or counts caused by foodborne etiological agents, based on the numbers of illnesses and hospitalizations, to improve public health safety.
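The two evaluation metrics named in the abstract, classification accuracy (%) and mean absolute error (MAE), can be illustrated with a minimal sketch. This is not the authors' code: the function names and all of the numbers below are hypothetical toy values, not NORS data, and the abstract does not say which models or software were used.

```python
from typing import Sequence


def accuracy_percent(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of outbreaks whose death occurrence (0/1) was predicted correctly, in %."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return 100.0 * correct / len(y_true)


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted death counts."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy values for eight outbreaks (assumed for illustration only).
death_occurred_true = [0, 0, 1, 0, 1, 0, 0, 1]   # 1 = at least one death
death_occurred_pred = [0, 0, 1, 0, 0, 0, 0, 1]   # one outbreak misclassified
death_count_true = [0, 0, 2, 0, 1, 0, 0, 3]
death_count_pred = [0.0, 0.5, 1.5, 0.0, 0.0, 0.0, 0.5, 2.0]

acc = accuracy_percent(death_occurred_true, death_occurred_pred)
mae = mean_absolute_error(death_count_true, death_count_pred)
print(f"accuracy = {acc:.1f} %, MAE = {mae:.4f}")
```

Accuracy measures how often the binary outcome (any death versus none) is predicted correctly, while MAE is expressed in the same unit as the target, here deaths per outbreak, so an MAE below 0.9 means the predicted count is off by less than one death on average.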
About the journal:
The journal Microbial Risk Analysis accepts articles dealing with the study of risk analysis applied to microbial hazards. Manuscripts should cover at least one of the components of risk assessment (risk characterization, exposure assessment, etc.), risk management and/or risk communication in any microbiology field (clinical, environmental, food, veterinary, etc.). The journal also accepts articles dealing with predictive microbiology, quantitative microbial ecology, mathematical modeling, risk studies applied to microbial ecology, quantitative microbiology for epidemiological studies, statistical methods applied to microbiology, and laws and regulatory policies aimed at lessening the risk of microbial hazards. Work focusing on risk studies of viruses, parasites, microbial toxins, antimicrobial-resistant organisms, genetically modified organisms (GMOs), and recombinant DNA products is also acceptable.