Authors: Yifei Zhou, Jianping Tian, Xinjun Hu, Haili Yang, Liangliang Xie, Yuexiang Huang, Yuanyuan Xia, Jianheng Peng, Dan Huang
Journal: Proceedings - Soil Science Society of America, vol. 89 (4)
Published: 2025-07-08
DOI: 10.1002/saj2.70095
URL: https://onlinelibrary.wiley.com/doi/10.1002/saj2.70095
Hyperspectral imaging combined with optimized SPA-GA and CPO-SVR machine learning models for the rapid determination of the total nitrogen content in cellar soil
The fermentation of Baijiu grains in the cellar is significantly influenced by the quality of the cellar soil, which contains a diverse range of microorganisms and physicochemical components. Among these, the total nitrogen content (TNC) is a critical indicator of soil quality and thus requires real-time monitoring to ensure quality control of the Baijiu. In this study, we developed two optimized machine learning algorithms, successive projection algorithm-genetic algorithm (SPA-GA) and crown porcupine optimization (CPO), to achieve rapid and accurate detection of the TNC in cellar soil using hyperspectral imaging (HSI). The feature wavelengths were selected by combining the SPA with the GA. The support vector machine regression (SVR) algorithm was then optimized using the CPO algorithm to establish a prediction model for the TNC. Comparative analysis of the various models demonstrated that the CPO-SVR model based on the feature-wavelength spectral data extracted by the SPA-GA exhibited the best performance ($R_p^2$ = 0.9958, root-mean-square error of prediction [RMSEP] = 0.0073 g/100 g). Compared to the same model built using the full-wavelength spectral data, this model reduced the number of wavelengths by 86.16%, increased the $R_p^2$ by 0.3014, and decreased the RMSEP by 0.0566. These results indicated that the GA significantly enhanced the feature extraction capability of the SPA, improving the model accuracy while cutting the number of wavelengths and hence the computational load. Furthermore, CPO was introduced to optimize the SVR, yielding the optimal parameter combination, which further improved the accuracy of the prediction model while mitigating the instability of manual parameter tuning. HSI, in conjunction with the optimization algorithms, offers a novel method for the rapid, non-destructive detection of total nitrogen and other components in cellar soil.
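The two figures of merit quoted in the abstract can be made concrete with a small sketch. The Python below is an illustration, not the authors' code. It assumes the conventional definitions, $R_p^2 = 1 - SS_{res}/SS_{tot}$ and RMSEP as the square root of the mean squared error on the prediction set. The function names and the toy values are hypothetical. The last two lines back-compute the full-wavelength baseline implied by the reported differences, assuming those differences are absolute and that the RMSEP difference is in the same units as the RMSEP itself.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmsep(y_true, y_pred):
    """Root-mean-square error of prediction, in the units of y."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical prediction-set values (g/100 g), for illustration only;
# these are not data from the study.
y_true = [0.50, 0.60, 0.70, 0.80]
y_pred = [0.51, 0.59, 0.71, 0.79]
toy_r2 = r2_score(y_true, y_pred)
toy_rmsep = rmsep(y_true, y_pred)

# Full-wavelength baseline implied by the reported differences
# (assumption: the abstract's differences are absolute, not relative).
baseline_r2 = round(0.9958 - 0.3014, 4)
baseline_rmsep = round(0.0073 + 0.0566, 4)
```

Under these assumptions, the full-wavelength CPO-SVR model would have an $R_p^2$ of about 0.6944 and an RMSEP of about 0.0639, which shows how much of the reported gain is attributed to the SPA-GA wavelength selection.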