{"title":"NIR and MIR spectral feature information fusion strategy for multivariate quantitative analysis of tobacco components","authors":"Honghong Wang , Qiong Wu , Wuye Yang , Jie Yu , Ting Wu , Zhixin Xiong , Yiping Du","doi":"10.1016/j.chemolab.2024.105222","DOIUrl":"10.1016/j.chemolab.2024.105222","url":null,"abstract":"<div><p>The determination of total nicotine, total sugar, reducing sugar and total nitrogen contents in tobacco is of great significance to tobacco quality evaluation and formulation design. To quickly determine the content of these four components, near-infrared (NIR) and mid-infrared (MIR) spectral data from 129 solid samples of tobacco powder provided by Shanghai Tobacco Group Co., Ltd. are used to study two NIR-MIR spectral fusion techniques. Fusion technology 1 establishes a model by fusing the feature variables obtained after variable selection on each spectrum. Fusion technology 2 first fuses the NIR-MIR spectral data and then selects the variables to establish the model. Both fusion technologies use the successive projections algorithm (SPA), competitive adaptive reweighted sampling (CARS), backward interval PLS (biPLS), forward interval PLS (fiPLS), synergy interval PLS (siPLS), and interval interaction moving window partial least squares (iMWPLS) algorithms to filter wavelength variables. The results showed that for total nicotine and total sugar, the PLSR model established by fusion technology 2 combined with the iMWPLS algorithm is the best: its RMSEP decreases from 0.2314 and 1.3225 to 0.0821 and 0.8079, respectively, compared with the full-spectrum fusion method, and it is superior to the single NIR and MIR models and to NIR-MIR fusion technology 1. For reducing sugar, the simple full-spectrum fusion model has the best analytical ability and the lowest RMSEP, which is superior to the single NIR-MIR models and all models established by the two spectral fusion techniques combined with the six wavelength selection algorithms. 
For total nitrogen, the prediction effect of fusion technology 1 combined with iMWPLS algorithm model was significantly improved compared with single NIR and MIR models and NIR-MIR fusion technology 2, and its RMSEP was 0.0634. The results showed that the two NIR-MIR spectral fusion techniques made full use of the complementary information provided by NIR and MIR spectroscopy, and successfully applied them to the rapid detection of total nicotine, total sugar, reducing sugar and total nitrogen content in tobacco, which provided a new method and idea for the rapid detection of tobacco components.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"253 ","pages":"Article 105222"},"PeriodicalIF":3.7,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142077360","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint state and process inputs estimation for state-space models with Student’s t-distribution","authors":"Hang Ci, Chengxi Zhang, Shunyi Zhao","doi":"10.1016/j.chemolab.2024.105220","DOIUrl":"10.1016/j.chemolab.2024.105220","url":null,"abstract":"<div><p>This paper proposes a joint state and unknown inputs (UIs) discrete-time estimation method for industrial processes, represented by a state-space model. To cope with the outliers in process data, the measurement noise is characterized by the Student’s t-distribution. The identification of UIs is accomplished through the recursive expectation–maximization (REM) approach. Specifically, in the E-step, a recursively calculated Q-function is formulated by the maximum likelihood criterion, and the states and the variance scale factor are estimated iteratively. In the M-step, the UIs are updated analytically, while the degrees of freedom are updated approximately. The effectiveness of the proposed algorithm is validated using a quadruple water tank process and a continuous stirred tank reactor. It shows that the proposed method significantly enhances the robustness and estimation accuracy of state and UIs in industrial processes, effectively handling outliers and reducing computational demands for real-time applications.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"253 ","pages":"Article 105220"},"PeriodicalIF":3.7,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142077361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combining algorithm techniques with mechanical and acoustic profiles for the prediction of apples sensory attributes","authors":"Riccardo Ricci , Annachiara Berardinelli , Flavia Gasperi , Isabella Endrizzi , Farid Melgani , Eugenio Aprea","doi":"10.1016/j.chemolab.2024.105217","DOIUrl":"10.1016/j.chemolab.2024.105217","url":null,"abstract":"<div><p>The research work shows the potential of advanced linear and nonlinear learning algorithms in the prediction of apple texture sensory attributes such as “hardness”, “crunchiness”, “flouriness”, “fibrousness”, and “graininess”. Starting from the information contained in the entire mechanical and acoustic curves acquired during sample compression tests, the prediction performances of five different statistical tools, such as Partial Least Squares regression (PLS), Multilayer Perceptron (MLP), Support Vector Regression (SVR) and Gaussian Process Regression (GPR), are shown and discussed.</p><p>All predictive model validations show the best accuracies for the texture sensory attributes “hardness” and “crunchiness” and, in general, for the GPR learning algorithm. By combining mechanical and acoustic profiles, 5-fold cross-validations produce coefficients of determination R<sup>2</sup> up to 0.885 (GPR) and 0.840 (GPR) for “hardness” and “crunchiness”, respectively. These results, comparable to those obtained by considering a large number of mechanical and acoustic parameters extracted from the acquired profiles as predictive factors, point to a new and reliable way of predicting the texture sensory attributes of apples. 
The proposed approach can overcome the necessity to define, in advance, number and type of features to be calculated from instrumental texture profiles and can be easily implemented in an automatic process.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"253 ","pages":"Article 105217"},"PeriodicalIF":3.7,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142049080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combination of machine learning and COSMO-RS thermodynamic model in predicting solubility parameters of coformers in production of cocrystals for enhanced drug solubility","authors":"Wael A. Mahdi , Ahmad J. Obaidullah","doi":"10.1016/j.chemolab.2024.105219","DOIUrl":"10.1016/j.chemolab.2024.105219","url":null,"abstract":"<div><p>In this study, we develop predictive models for three target variables, denoted as <span><math><mrow><msub><mi>δ</mi><mi>d</mi></msub></mrow></math></span>, <span><math><mrow><msub><mi>δ</mi><mi>p</mi></msub></mrow></math></span>, and <span><math><mrow><msub><mi>δ</mi><mi>h</mi></msub></mrow></math></span>, using a dataset with 86 features and 181 samples. The response parameters, which are Hansen solubility parameters, were correlated to input parameters via several machine learning techniques. The input features are molecular descriptors of coformers, which are calculated based on the COSMO-RS thermodynamic model and a group contribution approach. The analysis includes outlier detection via Cook's distance, normalization with a min-max scaler, and feature selection through L1-based methods. Three regression models—Gaussian Process Regression (GPR), Passive Aggressive Regression (PAR), and Polynomial Regression (PR)—are employed, with hyperparameter optimization achieved using Transient Search Optimization (TSO). The results indicate that for <span><math><mrow><msub><mi>δ</mi><mi>d</mi></msub></mrow></math></span>, the PAR model outperforms others with an R<sup>2</sup> score of 0.885, RMSE of 0.607, MAE of 0.524, and a maximum error of 1.294. The GPR model shows slightly lower performance with an R<sup>2</sup> of 0.872, RMSE of 0.816, MAE of 0.579, and a maximum error of 2.755 for <span><math><mrow><msub><mi>δ</mi><mi>d</mi></msub></mrow></math></span>. 
The PR model predicts <span><math><mrow><msub><mi>δ</mi><mi>d</mi></msub></mrow></math></span> with an R<sup>2</sup> of 0.814, RMSE of 0.923, MAE of 0.597, and a maximum error of 2.814. For <span><math><mrow><msub><mi>δ</mi><mi>p</mi></msub></mrow></math></span>, the GPR model provides the best performance, achieving an R<sup>2</sup> score of 0.821, RMSE of 1.693, MAE of 1.391, and a maximum error of 3.457. The PAR model predicts <span><math><mrow><msub><mi>δ</mi><mi>p</mi></msub></mrow></math></span> with an R<sup>2</sup> of 0.740, RMSE of 2.025, MAE of 1.980, and a maximum error of 6.609. Also, the PR model predicts <span><math><mrow><msub><mi>δ</mi><mi>p</mi></msub></mrow></math></span> with an R<sup>2</sup> of 0.7, RMSE of 2.329, MAE of 2.02, and a maximum error of 6.366. Similarly, for <span><math><mrow><msub><mi>δ</mi><mi>h</mi></msub></mrow></math></span>, the GPR model again shows superior performance with an R<sup>2</sup> score of 0.983, RMSE of 1.243, MAE of 1.005, and a maximum error of 2.577. The PAR model also accurately predicts <span><math><mrow><msub><mi>δ</mi><mi>h</mi></msub></mrow></math></span> with an R<sup>2</sup> of 0.924, RMSE of 2.713, MAE of 2.416, and a maximum error of 6.307. Additionally, the PR model predicts <span><math><mrow><msub><mi>δ</mi><mi>h</mi></msub></mrow></math></span> with an R<sup>2</sup> of 0.927, RMSE of 2.757, MAE of 2.334, and a maximum error of 8.064. 
These results highlight the efficacy of the chosen models and optimization techniques in accurately p","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"253 ","pages":"Article 105219"},"PeriodicalIF":3.7,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142087063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Model development using hybrid method for prediction of drug release from biomaterial matrix","authors":"Mohammed Alqarni , Shaimaa Mohammed Al Harthi , Mohammed Abdullah Alzubaidi , Ali Abdullah Alqarni , Bandar Saud Shukr , Hassan Talat Shawli","doi":"10.1016/j.chemolab.2024.105216","DOIUrl":"10.1016/j.chemolab.2024.105216","url":null,"abstract":"<div><p>A comprehensive multi-scale computational strategy was developed in this study based on mass transfer and machine learning for simulation of drug concentration distribution in a biomaterial matrix. The controlled release was modeled and validated via the hybrid model. Mass transfer equations along with kinetics models were solved numerically and the results were then used for machine learning models. We investigated the performance of three regression models, namely Decision Tree (DT), Random Forest (RF), and Extra Tree (ET) in predicting medicine concentration (C) based on r and z data. Hyper-parameter optimization is conducted using Glowworm Swarm Optimization (GSO). Results revealed high predictive accuracy across all models, with ET demonstrating superior performance, achieving a coefficient of determination value (R<sup>2</sup>) of 0.99854, an RMSE of 1.1446E-05, and a maximum error of 6.49087E-05. DT and RF also exhibit notable performance, with coefficients of determination equal to 0.99571 and 0.99655, respectively. 
These results highlight the effectiveness of ensemble tree-based methods in accurately predicting chemical concentrations, with Extra Tree (ET) Regression emerging as the most promising model for this specific dataset.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"253 ","pages":"Article 105216"},"PeriodicalIF":3.7,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142077358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust baseline correction for Raman spectra by constrained Gaussian radial basis function fitting","authors":"Sungwon Park, Hongjoong Kim","doi":"10.1016/j.chemolab.2024.105205","DOIUrl":"10.1016/j.chemolab.2024.105205","url":null,"abstract":"<div><p>Accurate baseline correction is a fundamental requirement for extracting meaningful spectral information and enabling precise quantitative analysis using Raman spectroscopy. Although numerous baseline correction techniques have been developed, they often require meticulous parameter adjustments and yield inconsistent results. To address these challenges, we have introduced a novel approach, namely constrained Gaussian radial basis function fitting (CGF). Our method involves solving a curve-fitting problem using Gaussian radial basis functions under specific constraints. To ensure stability and efficiency, we developed a linear programming algorithm for the proposed approach. We evaluated the performance of CGF using simulated Raman spectra and demonstrated its robustness across various scenarios, including changes in data length and noise levels. In contrast to standard methods, which frequently require complicated parameter adjustments and may exhibit varying errors, our approach provides a simple parameter search and consistently achieves low errors. We further assessed CGF using real Raman spectra, leading to enhanced accuracy in the quantitative analysis of the Raman spectra of chemical warfare agents. 
Our results emphasize the potential of CGF as a valuable tool for Raman spectroscopy data analysis, significantly advancing sophisticated analytical techniques.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"253 ","pages":"Article 105205"},"PeriodicalIF":3.7,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142049079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Supervised and penalized baseline correction","authors":"Erik Andries , Ramin Nikzad-Langerodi","doi":"10.1016/j.chemolab.2024.105200","DOIUrl":"10.1016/j.chemolab.2024.105200","url":null,"abstract":"<div><p>Spectroscopic measurements can show distorted spectral shapes arising from a mixture of absorbing and scattering contributions. These distortions (or baselines) often manifest themselves as non-constant offsets or low-frequency oscillations. As a result, these baselines can adversely affect analytical and quantitative results. Baseline correction is an umbrella term where one applies pre-processing methods to obtain baseline spectra (the unwanted distortions) and then removes the distortions by differencing. However, current state-of-the-art baseline correction methods do not utilize analyte concentrations even if they are available, or even if they contribute significantly to the observed spectral variability. We modify a class of state-of-the-art methods (<em>penalized baseline correction</em>) that easily admits the incorporation of a priori analyte concentrations such that predictions can be enhanced. This modified approach will be deemed <em>supervised and penalized baseline correction</em> (SPBC). Performance will be assessed on two near infrared data sets across both classical penalized baseline correction methods (without analyte information) and modified penalized baseline correction methods (leveraging analyte information). There are cases of SPBC that provide useful baseline-corrected signals such that they outperform state-of-the-art penalized baseline correction algorithms such as AIRPLS. 
In particular, we observe that performance is conditional on the correlation between separate analytes: the analyte used for baseline correction and the analyte used for prediction—the greater the correlation between the analyte used for baseline correction and the analyte used for prediction, the better the prediction performance.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"253 ","pages":"Article 105200"},"PeriodicalIF":3.7,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142087043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Novel investigation on adsorption analysis of safranal interacting with boron nitride and aluminum nitride fullerene-like cages: Drug delivery system","authors":"Saad M Alshahrani","doi":"10.1016/j.chemolab.2024.105206","DOIUrl":"10.1016/j.chemolab.2024.105206","url":null,"abstract":"<div><p>This study illustrates the effective control of COVID-19 infection through the adsorption of safranal (SAF) on B<sub>16</sub>N<sub>16</sub> and Al<sub>16</sub>N<sub>16</sub> fullerene-like cages. The SAF adsorption onto the B<sub>16</sub>N<sub>16</sub> and Al<sub>16</sub>N<sub>16</sub> surfaces in gas, water (H<sub>2</sub>O), and chloroform (CHCl<sub>3</sub>) environments was assessed using density functional theory (DFT) and time-dependent (TD) density functional theory methods, analyzing the substrates and their complexes. The Al<sub>16</sub>N<sub>16</sub>/SAF complex exhibited the most negative binding energy and structural stability in the water phase compared to the B<sub>16</sub>N<sub>16</sub>/SAF complex at the PBE0-D3 level. The thermodynamic parameters indicated that the adsorption of SAF onto the fullerene-like cages is exothermic, particularly for the Al<sub>16</sub>N<sub>16</sub>/SAF complex. Additionally, the interaction of SAF with the fullerene-like cages in the water phase is more pronounced than in gas and chloroform environments. The complexes' energy gap (Eg) decreases in all three environments compared to the perfect systems, with a significant reduction of over 21 % in all phases. This substantial decrease in the energy gap suggests that the complexes have increased reactivity and sensitivity to SAF, likely due to a significant change in electronic conductivity. The results of molecular docking indicate that the Al<sub>16</sub>N<sub>16</sub>/SAF complex in the water phase exhibited a strong binding affinity compared to the other compounds studied. 
These findings suggest that the Al<sub>16</sub>N<sub>16</sub>/SAF complex holds promise as a potential inhibitor for COVID-19 and as a valuable material for biomedical applications and drug delivery systems.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"254 ","pages":"Article 105206"},"PeriodicalIF":3.7,"publicationDate":"2024-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142151153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimation of quality variables in a continuous train of reactors using recurrent neural networks-based soft sensors","authors":"Mariano M. Perdomo , Luis A. Clementi , Jorge R. Vega","doi":"10.1016/j.chemolab.2024.105204","DOIUrl":"10.1016/j.chemolab.2024.105204","url":null,"abstract":"<div><p>The first stage in the industrial production of Styrene-Butadiene Rubber (SBR) typically consists in obtaining a latex from a train of continuous stirred tank reactors. Accurate real-time estimation of some key process variables is of paramount importance to ensure the production of high-quality rubber. Monitoring the mass conversion of monomers in the last reactor of the train is particularly important. To this effect, various soft sensors (SS) have been proposed; however, they have not addressed the underlying complex dynamic relationships existing among the process variables. In this work, an SS based on recurrent neural networks (RNN) is developed to estimate the mass conversion in the last reactor of the train. The main challenge is to obtain an adequate estimate of the conversion both in its usual steady-state operation and during its frequent transient operating phases. Three RNN architectures, Elman, GRU (Gated Recurrent Unit), and LSTM (Long Short-Term Memory), are compared to critically evaluate their performances. Moreover, a comprehensive analysis is conducted to assess the ability of these models to represent different operational modes of the train. The results reveal that the GRU network exhibits the best performance for estimating the mass conversion of monomers. Then, the performance of the proposed model is compared with a previously developed SS, which was based on a linear estimation model with a Bayesian bias adaptation mechanism and the use of Control Charts for decision-making. The model proposed here proved to be more efficient for estimating the mass conversion of monomers, particularly during transient operating phases. 
Finally, to evaluate the methodology utilized for designing the SS, the same RNN architectures were trained to online estimate another quality variable: the mass fraction of Styrene bound to the copolymer. The obtained results were also acceptable.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"253 ","pages":"Article 105204"},"PeriodicalIF":3.7,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142039656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 1D-CNN model for the early detection of citrus Huanglongbing disease in the sieve plate of phloem tissue using micro-FTIR","authors":"Biyun Yang , Zhiling Yang , Yong Xu , Wei Cheng , Fenglin Zhong , Dapeng Ye , Haiyong Weng","doi":"10.1016/j.chemolab.2024.105202","DOIUrl":"10.1016/j.chemolab.2024.105202","url":null,"abstract":"<div><p>Among the most frequently diagnosed diseases in citrus, citrus Huanglongbing (HLB) disease has caused severe economic losses to the citrus industry worldwide, since there is no cure and it spreads quickly. As callose accumulation in phloem is one of the early response events to Asian species <em>Candidatus</em> Liberibacter asiaticus (<em>C</em>Las) infection, the dynamic perception of the sieve plate region can be used as an indicator for the early diagnosis of citrus HLB disease. In this study, one-dimensional convolutional neural network (1D-CNN) models were established to achieve early detection of HLB disease based on spectral information in the sieve plate region using a Fourier transform infrared microscopy (micro-FTIR) spectrometer. Partial least squares regression (PLSR) and least squares support vector machine regression (LS-SVR) models are used for the prediction of callose based on the micro-FTIR information in the sieve plate region of the citrus midrib. Furthermore, an improved data augmentation method that superimposes Gaussian noise was proposed to expand the spectral amplitude. The proposed method achieved 98.65 % classification accuracy, which was higher than that of other traditional algorithms such as the logistic model tree (LMT), linear discriminant analysis (LDA), Bayes (BS), support vector machine (SVM) and k-nearest neighbors (kNN), and also higher than that of the molecular detection method qPCR (quantitative real-time polymerase chain reaction). 
Finally, the early detection model established with laboratory samples can also be used to detect citrus HLB in complex field samples by using model updating methods, and its overall detection accuracy reached 91.21 %. Our approach has potential for the early diagnosis of citrus HLB disease at the microscopic scale, which would provide useful and precise guidelines to prevent and control citrus HLB disease.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"252 ","pages":"Article 105202"},"PeriodicalIF":3.7,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141992929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}