Taylor R. Grimm, Kathryn B. Newhart, Amanda S. Hering
{"title":"Nonparametric Threshold Estimation of Autocorrelated Statistics in Multivariate Statistical Process Monitoring","authors":"Taylor R. Grimm, Kathryn B. Newhart, Amanda S. Hering","doi":"10.1002/cem.70004","DOIUrl":"https://doi.org/10.1002/cem.70004","url":null,"abstract":"<div><p>Multivariate statistical process monitoring is commonly used to detect abnormal process behavior in real time. Multiple process variables are monitored simultaneously, and alarms are issued when monitoring statistics exceed a predetermined threshold. Traditional approaches use a parametric threshold based on the assumptions of independence and multivariate normality of the process data, which are often violated in complex processes with high sampling frequencies, leading to excessive false alarms. Some approaches for improved threshold selection have been proposed, but they assume independence of the monitoring statistics, which are often autocorrelated. In this paper, we compare the performance of nonparametric estimators for computing thresholds from autocorrelated monitoring statistics through simulation. The false alarm rate and in-control average run length of each estimator under different distributions, sample sizes, and autocorrelation levels and types are found. Estimator performance is found to depend on sample size and the strength of autocorrelation. The class of kernel density estimation (KDE) methods tends to perform better than estimators that use bootstrapping, and the proposed adjusted KDE methods that account for autocorrelation are recommended for general use. 
A case study to monitor a wastewater treatment facility further illustrates the performance of nonparametric and parametric thresholds when applied to real-world systems.</p></div>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"39 2","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143248476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Naresh Pavurala, Chikkathur N. Madhavarao, Jaeweon Lee, Jayanti Das, Muhammad Ashraf, Thomas O'Connor
{"title":"Cell Culture Media and Raman Spectra Preprocessing Procedures Impact Glucose Chemometrics","authors":"Naresh Pavurala, Chikkathur N. Madhavarao, Jaeweon Lee, Jayanti Das, Muhammad Ashraf, Thomas O'Connor","doi":"10.1002/cem.70005","DOIUrl":"https://doi.org/10.1002/cem.70005","url":null,"abstract":"<div><p>Deployment of process analytical technology tools such as Raman or IR spectroscopy and associated multivariate calibration models for process monitoring and control plays an important role in process automation and advanced manufacturing of pharmaceuticals. Preprocessing or preparation of the spectroscopic data is an important step in developing a multivariate calibration model. There are several ways available to preprocess the data, and each may influence the calibration model performance differently. Here we investigated the influence of preprocessing procedures on the development and performance of the chemometric models to predict the glucose concentration in a bioreactor. A Box–Behnken design of experiments (DOE) was used to generate the Raman spectroscopy data. Four factors were considered critical in the DOE: glucose, glutamine, glutamic acid, and antifoam concentration. Raman spectroscopy data were collected both with and without aeration, independently from three cell culture media. For each medium, the data consisted of a calibration set (27 conditions) and a separate model validation set (9 conditions). Additionally, Raman data were collected for certain DOE runs with increasing cell densities ranging from 0.5 × 10<sup>6</sup>/mL to 30 × 10<sup>6</sup>/mL under aerating conditions. Data from the three cell culture media were used separately to develop calibration models that used four different preprocessing procedures, namely, baseline correction (BLC), Savitzky–Golay smoothing (SGS), Savitzky–Golay derivative (SGD), and orthogonal signal correction (OSC). 
The preprocessing procedures were applied individually and in combinations to evaluate the calibration model parameters and the performance metrics. We further developed glucose calibration models based on partial least squares (PLS) regression with 1–3 principal components. The models developed with the OSC procedure gave superior performance metrics with just one principal component across all three media. Models developed with other preprocessing procedures required two or more principal components to give comparable performance. Overall, the choice of preprocessing procedures affected the model performance.</p></div>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"39 2","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143248478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Alignment-Agnostic Methodology for the Analysis of Designed Separations Data","authors":"Michael Sorochan Armstrong, José Camacho","doi":"10.1002/cem.70002","DOIUrl":"https://doi.org/10.1002/cem.70002","url":null,"abstract":"<div><p>Chemical separations data are typically analyzed in the time domain using methods that integrate the discrete elution bands. Integrating the same chemical components across several samples must account for retention time drift over the course of an entire experiment as the physical characteristics of the separation are altered through several cycles of use. Failure to consistently integrate the components within a matrix of <i>M</i> × <i>N</i> samples and variables creates artifacts that have a profound effect on the analysis and interpretation of the data. This work presents an alternative where the raw separations data are analyzed in the frequency domain to account for the offset of the chromatographic peaks as a matrix of complex Fourier coefficients. 
We present a generalization of the factorization, permutation testing, and visualization steps in ANOVA-simultaneous component analysis (ASCA) to handle complex matrices and use this method to analyze a synthetic dataset with known significant factors and compare the interpretation of a real dataset via its peak table and frequency domain representations.</p></div>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"39 2","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143119021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Greener, Safer, and More Understandable AI for Natural Science and Technology","authors":"Harald Martens","doi":"10.1002/cem.3643","DOIUrl":"https://doi.org/10.1002/cem.3643","url":null,"abstract":"<p>More rational, open-minded use of quantitative Big Data in Science and Technology is required for better real-world problem solving as well as for the stabilization of shared belief structures in society. Modern instrumentation gives informative but overwhelming data streams. A thermal video camera with suitable spatiotemporal subspace modeling allows us to detect surface temperature changes of, for example, engines, that can reveal something going on inside. An RGB video camera responds to both motions and color changes in nature, often with spatiotemporal change patterns that we can discover and describe mathematically, validate statistically, interpret graphically, and then use for sensible things. A hyperspectral Vis./NIR satellite camera with hundreds of wavelengths reveals changes in clouds and at each earth location, again and again. Today we know how to decode such overwhelming streams of high-dimensional data into physical and chemical causalities by minimalistic hybrid multivariate subspace models. We thereby combine prior knowledge with the ability to discover new, reliable variation patterns. Minimalistic subspace models handle such data. These “open-ended” multivariate linear hybrid models are computationally fast, statistically safe, and graphically understandable. The minimalistic subspace models are therefore suitable for both data modeling (based on multivariate measurements) and metamodeling (based on input–output simulation results for nonlinear mechanistic models' behavioral repertoire). That makes it easier to combine high-dimensional streams of real-world measurements and complicated, slow mechanistic models. 
Implemented as minimalistic foundation models with hierarchies of extended subspace models, this can form a basis for faster discovery and problem solving in Natural Science & Technology.</p>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"39 2","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3643","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143116402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of Annual Storage on Online Released Compounds in Artemisia argyi Smoke With GC-MS-Based Untargeted Metabolomics Coupled With Chemometric Method of AntDAS-GCMS","authors":"Meng Zhai, Jia-Nan Liu, Long-He Wang, Yan-Jin Wen, Hui Ma, Ping-Ping Liu, Guo-Bi Chai, Qi-Dong Zhang, Ji Ma, Yong-Jie Yu","doi":"10.1002/cem.3644","DOIUrl":"https://doi.org/10.1002/cem.3644","url":null,"abstract":"<div><p><i>Artemisia argyi</i> smoke, generated from the combustion of <i>A. argyi</i>, is widely utilized in traditional Chinese medicine for moxibustion and fumigation therapies. The smoke released during the combustion of <i>A. argyi</i> contains a large number of compounds and can be affected by the storage period. However, a comprehensive understanding of the chemical composition of the released smoke is lacking, and the effects of annual storage on the released smoke remain unclear. Herein, a strategy that integrated gas chromatography–mass spectrometry (GC-MS) with advanced chemometric software, AntDAS-GCMS, was developed for comprehensively characterizing tens of compounds in the released smoke of <i>A. argyi</i> and evaluating the quality variation across annual storage periods. Both particle and gas phases of the smoke released during combustion were collected for GC-MS analysis; the raw data files were then imported into our recently developed data analysis software AntDAS-GCMS for automatically retrieving underlying components. Components showing significant differences among annual storage periods were screened, yielding a total of 471 components. Finally, 61 compounds were identified. Both supervised and unsupervised chemometric methods suggest that the 2- and 4-year storage periods were close to the clinically used sample (3-year storage), whereas samples stored for too short a period (such as 1 year) or too long a period (6 years) were quite different from the 3-year storage samples. 
In conclusion, this strategy provides a novel solution for evaluating smoke samples from traditional Chinese medicine.</p></div>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"39 2","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143115100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stacking Ensemble Learning Method for Quantitative Analysis of Soluble Solid Content in Apples","authors":"Lixin Zhang, Zhensheng Huang, Xiao Zhang","doi":"10.1002/cem.3635","DOIUrl":"https://doi.org/10.1002/cem.3635","url":null,"abstract":"<div><p>The soluble solids content (SSC) in apples directly affects their quality. This study aimed to detect SSC nondestructively using hyperspectral technology combined with chemometrics. However, data generation may not follow a specific pattern, and even small perturbations in the data can have a significant impact on the constructed model. To improve the anti-interference capability of individual models, this study proposed a stacking ensemble learning method that adopted partial least squares (PLS), support vector machine (SVM), extreme gradient boosting (Xgboost), and random forest (RF) as basic-learners, with RF serving as the meta-learner. Experimental results showed that the performance of the established model on the test set was as follows: the root mean square error (RMSE) was 0.4325, the mean absolute error (MAE) was 0.3245, the mean absolute percentage error (MAPE) was 0.0271, and the coefficient of determination (<i>R</i><sup>2</sup>) was 0.9250. These results indicate that the stacking ensemble learning approach could appropriately fuse the predictive results of each basic-learner and improve the prediction accuracy of individual models. To verify the superiority of the proposed stacking ensemble learning method, the selection of its basic-learners, meta-learner, and combination strategy were compared and analyzed. 
This study not only provides a theoretical reference for the further development of related nondestructive detection equipment but also offers guidance for fusion algorithms.</p></div>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"39 1","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143114701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jimoh Olawale Ajadi, Nasir Abbas, Muhammad Riaz, Nurudeen Ayobami Ajadi, Taofeek Adeola Salami, Nurudeen A. Adegoke
{"title":"Robust Multivariate Dispersion Charts for Quality Control: Application to Sulfur Dioxide Monitoring","authors":"Jimoh Olawale Ajadi, Nasir Abbas, Muhammad Riaz, Nurudeen Ayobami Ajadi, Taofeek Adeola Salami, Nurudeen A. Adegoke","doi":"10.1002/cem.3642","DOIUrl":"https://doi.org/10.1002/cem.3642","url":null,"abstract":"<div><p>This study introduces two robust multivariate Shewhart-type control charts based on grouped observations to detect changes in the covariance matrix, with a focus on monitoring sulfur dioxide levels during quality control processes. We compute the covariance matrix of observations and apply the least absolute shrinkage and selection operator to penalize it in the in-control process. Logarithms are then applied to eigenvalues derived through singular value decomposition (SVD) of the shrunken covariance matrix, ensuring robustness to non-normality in the multivariate data. The proposed methods offer significant advantages, particularly in their ability to maintain robustness to non-normality without relying on strict distributional assumptions. Performance comparisons using the average run length demonstrate that the proposed charts exhibit superior robustness to departures from normality compared with existing methods. However, potential limitations include the computational complexity of the shrinkage and SVD processes, which may affect scalability to large datasets. An application to the white wine production process illustrates the effectiveness of the proposed methods for analyzing complex multivariate chemical data. These findings indicate that the introduced charts enhance the detection of shifts in the covariance matrix of physicochemical properties, thereby improving the reliability of quality control processes in non-normal environments. 
This study provides valuable tools for quality engineers and practitioners in industries dealing with multivariate analytical data, contributing to improved process monitoring and control, higher quality standards, and consistent product outcomes in fields such as food science and industrial chemistry.</p></div>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"39 1","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143113942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shuai Liu, Honggao Liu, Jieqing Li, Yuanzhong Wang
{"title":"Artificial and Algorithmic Screening of Infrared Spectral Feature Bands of Gastrodia elata to Achieve Rapid Identification of Its Species","authors":"Shuai Liu, Honggao Liu, Jieqing Li, Yuanzhong Wang","doi":"10.1002/cem.3641","DOIUrl":"https://doi.org/10.1002/cem.3641","url":null,"abstract":"<div><p><i>Gastrodia elata</i> is a traditional Chinese medicine with medicinal and edible values. In this paper, two kinds of datasets were acquired: partial spectra (artificially obtained peak segment spectra) and full spectra (4000–400 cm<sup>−1</sup>). The competitive adaptive reweighted sampling (CARS) algorithm and the successive projection algorithm (SPA) were utilized to extract the characteristic variables of the two datasets, and partial least squares discriminant analysis (PLS-DA), support vector machine (SVM), random forest (RF), and residual convolutional neural network (ResNet) models were established. It was found that among the PLS-DA models, whole-MSC-CARS-PLS-DA was optimal, with a root mean square error of prediction (RMSEP) of 0.0658; among the SVM models, Partial-standard normal variate (SNV)-SPA-SVM was the best, with a kernel parameter of 0.1768 and the lowest number of support vectors; among the RF models, Partial-SNV-RF was optimal, but not as effective as the first two models. The ResNet model built on the effective information has a loss value of 0.001, requires a short model building time, and directly uses the original data. 
Therefore, the ResNet model based on feature bands is the most suitable for practical application compared with the other models.</p></div>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"39 1","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143113050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Principal Component Analysis: Standardisation","authors":"Richard G. Brereton","doi":"10.1002/cem.3607","DOIUrl":"https://doi.org/10.1002/cem.3607","url":null,"abstract":"<p>Standardisation of the columns of a matrix is a common transformation prior to PCA. It can be called by different names, including autoscaling and normalisation. The latter term is confusing terminology, as it is also used for a number of other transformations, so we advise against calling this normalisation.</p><p>As standardisation is about scaling and not statistical estimation, it is best to use the definition of the population standard deviation $$ s_j = \\sqrt{\\sum_{i=1}^{I} \\left( x_{ij} - \\overline{x}_j \\right)^2 / I} $$ rather than the sample standard deviation.</p><p>We can now standardise each matrix. To save room, we just calculate one numerical value so that readers that are interested can check they can reproduce the results from this article. The standardised value for Dataset 1 <i>x</i><sub>83</sub> = 0.566 (Sample H, variable <i>x</i><sub>3</sub>).</p><p>Hence, whether standardisation prior to PCA is a useful technique depends on the nature of the data and the problem in hand. 
In some cases, it can degrade patterns, whereas in other situations it can pull out important information.</p><p>Although standardisation can make a big difference to the appearance of PC plots, in other cases, it makes little or no d","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"39 1","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3607","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143110775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leah Munyendo, Katharina Schuster, Wolfgang Armbruster, Majharulislam Babor, Daniel Njoroge, Yanyan Zhang, Almut von Wrochem, Alexander Schaum, Bernd Hitzmann
{"title":"Monitoring a Coffee Roasting Process Based on Near-Infrared and Raman Spectroscopy Coupled With Chemometrics","authors":"Leah Munyendo, Katharina Schuster, Wolfgang Armbruster, Majharulislam Babor, Daniel Njoroge, Yanyan Zhang, Almut von Wrochem, Alexander Schaum, Bernd Hitzmann","doi":"10.1002/cem.3638","DOIUrl":"https://doi.org/10.1002/cem.3638","url":null,"abstract":"<p>Roasting is a fundamental step in coffee processing, where complex reactions form chemical compounds related to the coffee flavor and its health-beneficial effects. These reactions occur on various time scales depending on the roasting conditions. To monitor the process and ensure reproducibility, the study proposes simple and fast techniques based on spectroscopy. This work uses analytical tools based on near-infrared (NIR) and Raman spectroscopy to monitor the coffee roasting process by predicting chemical changes in coffee beans during roasting. Green coffee beans of Robusta and Arabica species were roasted at 240°C for different roasting times. The spectra of the samples were taken using the spectrometers and modeled by the k-nearest neighbor regression (KNR), partial least squares regression (PLSR), and multiple linear regression (MLR) to predict concentrations from the spectral data sets. For NIR spectra, all the models provided satisfactory results for the prediction of chlorogenic acid, trigonelline, and DPPH radical scavenging activity with low relative root mean square error of prediction (pRMSEP < 9.649%) and high coefficient of determination (<i>R</i><sup>2</sup> > 0.915). The predictions for ABTS radical scavenging activity were reasonably good. On the contrary, the models poorly predicted the caffeine and total phenolic content (TPC). Similarly, all the models based on the Raman spectra provided good prediction accuracies for monitoring the dynamics of chlorogenic acid, trigonelline, and DPPH radical scavenging activity (pRMSEP < 7.849% and <i>R</i><sup>2</sup> > 0.944). 
The results for ABTS radical scavenging activity, caffeine, and TPC were similar to those of NIR spectra. These findings demonstrate the potential of Raman and NIR spectroscopy methods in tracking chemical changes in coffee during roasting. By doing so, it may be possible to control the quality of coffee in terms of its aroma, flavor, and roast level.</p>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"39 1","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3638","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143114505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}