Neural Network and Extreme Gradient Boosting in Near Infrared Spectroscopy

2022 International Conference on Innovations and Development of Information Technologies and Robotics (IDITR). Pub Date: 2022-05-01. DOI: 10.1109/IDITR54676.2022.9796490

K. Chia, Nur Aisyah Syafinaz Suarin


Citations: 3

Abstract

Near infrared spectroscopy is a secondary measurement approach that aims to quantitatively or qualitatively estimate the components of interest from an acquired near infrared spectrum using computational methods, e.g., machine learning algorithms. After decades of investigation, the neural network has been accepted as a nonlinear benchmark model in near infrared spectroscopy. Although a recent work reported that Extreme Gradient Boosting (XGBoost) outperformed a neural network in groundwater level prediction, the optimization process and the learning algorithm of the neural network were not reported. This implies that the neural network might not have been optimal. Thus, this study aims to compare the performance of an optimal Bayesian regularized neural network and XGBoost in a regression application using more than one thousand near infrared spectra acquired across different years. The regression models were established to predict the dry matter content (DMC) of mangoes from the respective spectral data. Results show that even though XGBoost achieved a satisfactory accuracy, with RMSEV, RMSEP, R2V, and R2P of 1.16%, 1.22%, 0.73, and 0.80, respectively, the Bayesian regularized neural network achieved substantially better RMSEV, RMSEP, R2V, and R2P of 0.83%, 0.86%, 0.86, and 0.90, respectively. Thus, testing a Bayesian regularized neural network is recommended when more than one thousand near infrared spectra are available.
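
The abstract reports root mean square errors (RMSEV, RMSEP) and coefficients of determination (R2V, R2P) without defining them. The suffixes presumably refer to the validation and prediction sets. The sketch below shows how such figures are commonly computed, assuming the standard definitions of RMSE and of R2 as 1 - SS_res / SS_tot; the paper may use a different definition (for example, the squared correlation coefficient). The function names and the DMC values are hypothetical and are not taken from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error, in the same units as y (here, % DMC)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up reference and predicted DMC values (%), for illustration only.
dmc_reference = [14.0, 16.0, 18.0, 20.0]
dmc_predicted = [15.0, 15.0, 19.0, 19.0]

print("RMSE:", rmse(dmc_reference, dmc_predicted))
print("R2:", r_squared(dmc_reference, dmc_predicted))
```

Applying the same two functions separately to a validation set and to an independent prediction set would yield the four kinds of figures quoted above for each model.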
