Identification of geographical origin and adulteration of Northeast China soybeans by mid-infrared spectroscopy and spectra augmentation
Yuhui Xiao, Honghao Cai, Hui Ni
Journal of Consumer Protection and Food Safety, vol. 19, no. 1, pp. 99-111. Published 2023-12-19.
DOI: 10.1007/s00003-023-01471-8
https://link.springer.com/article/10.1007/s00003-023-01471-8
Impact factor: 1.4 (JCR Q4, Food Science & Technology)
Citations: 0
Abstract
Mathematical models based on infrared spectroscopy and machine learning have been successfully used to trace the origin of soybeans. However, as previous research has reported, achieving optimal accuracy during model training requires spectral data that have undergone multiple pre-processing operations, and the resulting models can only predict samples that received identical spectral pre-processing. Baseline correction in particular must be applied to each spectrum individually, which demands substantial time and human resources for a large dataset. In this study, a spectra augmentation technique based on the theory of data augmentation was proposed to simplify or even eliminate the pre-processing steps for the prediction dataset. The technique trained models on a combination of standard pre-processed spectra and “boost” data, that is, data without the standard spectral pre-processing: 180 spectra in total, comprising 90 pre-processed standard spectra and 90 “boost” spectra. On a prediction dataset without the standard spectral pre-processing, the model trained with spectra augmentation achieved an accuracy of 0.91 in recognizing Northeast China soybeans, whereas the model trained with the method frequently reported in previous studies reached only 0.71, demonstrating that the model trained with the proposed technique was more robust and generalized better. The spectra augmentation technique can therefore maintain high accuracy while simplifying the spectral pre-processing of prediction data, providing a more efficient and faster method for practical food traceability and authentication.
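The training-set construction described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic spectra, the label names, the straight-line baseline subtraction standing in for the unspecified pre-processing, the nearest-centroid classifier, and the choice to use the raw versions of the same 90 samples as the "boost" spectra.

```python
import random

def make_spectrum(label, rng, n_points=50):
    """Synthetic spectrum (hypothetical): one peak whose position depends on
    the class, sitting on a random sloped baseline with a little noise."""
    center = 15 if label == "northeast" else 35
    slope = rng.uniform(-0.02, 0.02)
    offset = rng.uniform(0.0, 0.5)
    return [
        offset + slope * i + (1.0 if abs(i - center) < 3 else 0.0) + rng.gauss(0, 0.01)
        for i in range(n_points)
    ]

def baseline_correct(spectrum):
    """Subtract the straight line joining the first and last points.
    This is only a stand-in for the paper's pre-processing, which the
    abstract does not specify."""
    n = len(spectrum)
    first, last = spectrum[0], spectrum[-1]
    return [y - (first + (last - first) * i / (n - 1)) for i, y in enumerate(spectrum)]

def augment(raw_spectra, labels):
    """Return the pre-processed spectra followed by untouched 'boost' copies,
    with the labels repeated to match."""
    processed = [baseline_correct(s) for s in raw_spectra]
    boost = [list(s) for s in raw_spectra]
    return processed + boost, list(labels) + list(labels)

def fit_centroids(X, y):
    """Nearest-centroid 'model': the mean spectrum of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

rng = random.Random(0)
labels = ["northeast" if i % 2 == 0 else "other" for i in range(90)]
raw = [make_spectrum(label, rng) for label in labels]

# 90 pre-processed + 90 "boost" spectra = 180 training spectra
X_aug, y_aug = augment(raw, labels)
centroids = fit_centroids(X_aug, y_aug)

# A new spectrum is classified without any pre-processing
new_spectrum = make_spectrum("northeast", rng)
pred = predict(centroids, new_spectrum)
```

The point of the sketch is the shape of the training set: because the model sees both corrected and uncorrected versions during training, prediction-time spectra can be passed in raw. It makes no claim about reproducing the reported accuracies.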
About the journal:
The JCF publishes peer-reviewed original Research Articles and Opinions that are of direct importance to Food and Feed Safety. This includes Food Packaging, Consumer Products as well as Plant Protection Products, Food Microbiology, Veterinary Drugs, Animal Welfare and Genetic Engineering.
All published peer-reviewed articles should be devoted to improving Consumer Health Protection. Reviews and discussions are welcomed that address legal and/or regulatory decisions with respect to risk assessment and management of Food and Feed Safety issues on a scientific basis. The journal addresses an international readership of scientists, risk assessors and managers, and other professionals active in the field of Food and Feed Safety and Consumer Health Protection.
Manuscripts – preferably written in English but also in German – are published as Research Articles, Reviews, Methods and Short Communications and should cover aspects including, but not limited to:
· Factors influencing Food and Feed Safety
· Factors influencing Consumer Health Protection
· Factors influencing Consumer Behavior
· Exposure science related to Risk Assessment and Risk Management
· Regulatory aspects related to Food and Feed Safety, Food Packaging, Consumer Products, Plant Protection Products, Food Microbiology, Veterinary Drugs, Animal Welfare and Genetic Engineering
· Analytical methods and method validation related to food control and food processing.
The JCF also presents important News, as well as Announcements and Reports about administrative surveillance.