Enhancing Fourier Transform Near-infrared Spectroscopy with Explainable Ensemble Learning Methods for Detecting Mineral Oil Contamination in Corn Oil
Authors: Jihong Deng, Hui Jiang, Quansheng Chen
Journal: Journal of Food Composition and Analysis, volume 143, Article 107594
Publication date: 2025-04-08
Publication type: Journal Article
DOI: 10.1016/j.jfca.2025.107594
URL: https://www.sciencedirect.com/science/article/pii/S0889157525004090
Impact factor: 4.0; JCR: Q2 (Chemistry, Applied); Region 2 (Agricultural and Forestry Sciences)
Corn oil is rich in unsaturated fatty acids and antioxidants, which contribute to cardiovascular health, and it is therefore widely used in food processing and cooking. During production, transportation, and storage, however, corn oil can be exposed to mineral oil contamination, so ensuring its quality and safety is crucial. At the same time, there is growing interest in rapid, environmentally friendly analytical tools for screening edible oils for impurities. This study introduced a method that combines explainable artificial intelligence with Fourier Transform Near-Infrared Spectroscopy (FT-NIR) to detect mineral oil contaminants in corn oil. Five types of mineral oil were selected as potential contaminants, and spectral data were collected from contaminated and uncontaminated corn oil samples. Partial Least Squares Discriminant Analysis (PLS-DA) and four ensemble learning methods (AdaBoost, XGBoost, LightGBM, and CatBoost) were applied to address two qualitative objectives. PLS-DA effectively captured the spectral differences between normal and contaminated samples, achieving 100% classification accuracy. The four ensemble classifiers were then developed on spectral variables selected by Competitive Adaptive Reweighted Sampling (CARS) to identify the specific contaminant in corn oil. LightGBM performed best, achieving 100% accuracy, precision, recall, and F1 score across all contaminant categories. The Shapley Additive Explanations (SHAP) algorithm was also used to enhance model interpretability; it identified the key spectral wavelengths contributing to the classification of each contaminant category. The findings demonstrate that combining FT-NIR, feature selection, and explainable models provides a fast, accurate, and environmentally friendly method for assessing the quality of corn oil. This approach improves contamination detection and can enhance consumer confidence in edible oil products. Future work should focus on extending the method to other edible oil safety applications and integrating it into real-time, on-site monitoring systems for edible oil production.
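The abstract reports accuracy, precision, recall, and F1 score for each contaminant category. As a reminder of how these metrics are defined for a multi-class problem, here is a minimal Python sketch. It is not the authors' evaluation code: the function name `per_class_metrics`, the class labels `MO1` to `MO3`, and the toy predictions are hypothetical and chosen only for illustration.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return overall accuracy and per-class precision, recall, and F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class received a false positive
            fn[truth] += 1   # true class received a false negative

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, report


# Hypothetical toy labels (not data from the study): one MO1 sample is
# misclassified as MO2.
y_true = ["MO1", "MO1", "MO2", "MO2", "MO3"]
y_pred = ["MO1", "MO2", "MO2", "MO2", "MO3"]
accuracy, report = per_class_metrics(y_true, y_pred)
print(accuracy)
for label, scores in report.items():
    print(label, scores)
```

A result of 100% on all four metrics, as reported for LightGBM, corresponds to the case where every class has zero false positives and zero false negatives on the evaluated samples.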
About the journal:
The Journal of Food Composition and Analysis publishes manuscripts on scientific aspects of data on the chemical composition of human foods, with particular emphasis on actual data on composition of foods; analytical methods; studies on the manipulation, storage, distribution and use of food composition data; and studies on the statistics, use and distribution of such data and data systems. The Journal's basis is nutrient composition, with increasing emphasis on bioactive non-nutrient and anti-nutrient components. Papers must provide sufficient description of the food samples, analytical methods, quality control procedures and statistical treatments of the data to permit the end users of the food composition data to evaluate the appropriateness of such data in their projects.
The Journal does not publish papers on: microbiological compounds; sensory quality; aromatics/volatiles in food and wine; essential oils; organoleptic characteristics of food; physical properties; or clinical papers and pharmacology-related papers.