Nuno M. Rodrigues, Vanessa F. Fonseca, Renato Mamede, Susanne E. Tanner, Bernardo Duarte, Sara Silva
Food Control, Volume 181, Article 111682. Published 2025-09-20. DOI: 10.1016/j.foodcont.2025.111682
Impact factor: 6.3. JCR: Q1 (Food Science & Technology).
https://www.sciencedirect.com/science/article/pii/S0956713525005511
Integrating elemental and total reflection X-ray spectral fingerprints with machine learning for advanced seafood traceability
Rising seafood consumption and demand have led to more food fraud and to more seafood traceability regulations intended to ensure sustainable marine resources and food safety. Cephalopods are valuable commercial seafood resources that are highly susceptible to mislabelling. The common octopus Octopus vulgaris has a high market value in southern Europe, including Portugal. We collected 450 O. vulgaris individuals across three sampling events and five fishing areas along the Portuguese Atlantic coast, and assessed elemental fingerprints and spectral data from edible muscle tissue as inputs to machine learning models for provenance assessment. We trained several machine learning models on multitemporal data, showing trade-offs between performance and interpretability. The analysis revealed clear distribution shifts between seasons, which are likely associated with the complex life cycle of the species. Model performance varied significantly with the seasonal data used: models trained on October data, September data, or their combination (Autumn) significantly outperformed those trained on April data. This trend held for both data types, elemental fingerprints and spectral data, although elemental fingerprint data generally yielded statistically superior results across most models. The September and October models, which come from similar seasons, showed a slight preference for spectral data, whereas the Autumn combination and April showed a clear preference for elemental fingerprint data. Irrespective of season, boosted tree models consistently ranked as the top-performing methods, although they were not statistically significantly superior in most instances.
Analysis of the results shows that most observations made for the cross-validation data remain valid. April data continue to produce the least effective results, with fingerprint and spectral data differing by only 0.02%. The best-performing model came from the September data, where Random Forests trained on the full fingerprint achieved an F1 score of 73%. The developed models achieved very good predictive performance with high generalization capability, and the analysis of feature importance using Shapley Additive Explanations (SHAP) showed that model decisions can be biologically validated, with arsenic (As) being a key element in all sampling moments for differentiating capture areas.
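The abstract reports model quality as an F1 score for a five-area classification problem but does not say how the per-area scores were averaged. The sketch below assumes macro averaging (the unweighted mean of per-class F1), which is only one possible choice and is not confirmed by the source. The function name `macro_f1` and the area labels "A", "B", "C" are hypothetical, and the labels are toy values, not data from the study.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true.

    Assumption: macro averaging. The source does not state which
    averaging scheme was used for its reported F1 scores.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1          # correct assignment to this area
        else:
            fp[pred] += 1           # predicted area received a wrong sample
            fn[truth] += 1          # true area lost one of its samples

    scores = []
    for label in sorted(set(y_true)):
        # F1 = 2TP / (2TP + FP + FN), equivalent to the harmonic mean
        # of precision and recall for this class.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with three hypothetical capture areas (not study data).
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "B", "B", "B", "C", "A"]
print(round(macro_f1(y_true, y_pred), 3))  # per-class F1: A=0.5, B=0.8, C=2/3
```

Macro averaging weights every capture area equally, so a poorly classified area lowers the score as much as a well-sampled one would. A support-weighted average would instead follow the larger classes.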
About the journal:
Food Control is an international journal that provides essential information for those involved in food safety and process control.
Food Control covers the following areas, as they relate to food process control or to the safety of human foods:
• Microbial food safety and antimicrobial systems
• Mycotoxins
• Hazard analysis, HACCP and food safety objectives
• Risk assessment, including microbial and chemical hazards
• Quality assurance
• Good manufacturing practices
• Food process systems design and control
• Food packaging technology and materials in contact with foods
• Rapid methods of analysis and detection, including sensor technology
• Codes of practice, legislation and international harmonization
• Consumer issues
• Education, training and research needs.
The scope of Food Control is comprehensive and includes original research papers, authoritative reviews, short communications, comment articles that report on new developments in food control, and position papers.