{"title":"Flexible Trilinearity Alignment (FTA) and Shift Invariant Transformation (SIT) Constraints in Three-Way Multivariate Curve Resolution Data Analysis","authors":"Xin Zhang, Romà Tauler","doi":"10.1002/cem.3581","DOIUrl":"10.1002/cem.3581","url":null,"abstract":"<p>In this work, two alternative ways of analyzing three-way data with multivariate curve resolution alternating least squares (MCR-ALS) using the trilinearity constraint are described and compared. Different synthetic datasets and experimental three-way datasets covering different scenarios are analyzed, and the results obtained are compared. The two new different ways of applying the trilinearity constraint are named flexible trilinearity alignment (FTA) and shift invariant transformation (SIT). The effects of noise in the application of both types of constraints are investigated in detail. Results show that both approaches are particularly adequate for those cases like in gas chromatography and especially in liquid chromatography where the elution profiles of the same chemical component in different chromatographic runs are not totally reproducible because they are time shifted, although they preserve their shape. 
When strong time shifts and co-elution occur, then the “standard” trilinear model does not work, and alternative approaches should be used, such as the MCR extended bilinear model to multiset (multirun) data, or the proposed relaxation of the trilinearity constraint in the FTA and SIT methods to capture the time drift changes produced in the elution profiles of the resolved components.</p>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 11","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3581","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141928166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel One-Class Convolutional Autoencoder Combined With Excitation–Emission Matrix Fluorescence Spectroscopy for Authenticity Identification of Food","authors":"Xiaoqin Yan, Baoshuo Jia, Wanjun Long, Kun Huang, Tong Wang, Hailong Wu, Ruqin Yu","doi":"10.1002/cem.3592","DOIUrl":"10.1002/cem.3592","url":null,"abstract":"<div><p>In this work, a novel one-class classification algorithm, the one-class convolutional autoencoder (OC-CAE), was proposed for the detection of abnormal samples in the excitation–emission matrix (EEM) fluorescence spectra dataset. The OC-CAE used a boxplot to analyze the reconstruction errors and used the LOF algorithm to handle features extracted by the hidden layer in the convolutional autoencoder (CAE). The fused information provides the basis for more accurate pattern recognition, ensures flexibility in model training, and can obtain higher model specificity, which is important in the field of food quality control. To demonstrate the reliability and advantages of OC-CAE, two EEM cases related to the authentication of food, the Zhenjiang aromatic vinegar (ZAV) case and the camellia oil (CAO) case, were studied. 
The results showed that OC-CAE identified all abnormal samples in the two cases, reflecting excellent performance in the detection of abnormal samples, and that it, coupled with EEM, would be an effective tool for the authenticity identification of food.</p></div>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 11","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141933341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Multiplicative Scatter Correction Using Quantile Regression","authors":"Bahram Hemmateenejad, Nabiollah Mobaraki, Knut Baumann","doi":"10.1002/cem.3589","DOIUrl":"10.1002/cem.3589","url":null,"abstract":"<p>A robust method for multiplicative scatter correction (MSC) in infrared spectroscopy is presented. Using quantile regression, the outlier wavelengths (concentration-dependent wavelengths) that are irrelevant to the regression are identified and therefore excluded from the regression model. This new MSC method, which could be implemented in its simple or extended form, is much simpler than the recently proposed methods and has only one hyperparameter (the quantile value) to be adjusted. To achieve this, a scoring function based on residual analysis can automatically determine the correct quantile value. The method is first explained using simulation data sets and then its validation is explained by analysing some experimental data sets. It was found that our new method can perform well in the presence of strong outlying variables. On the other hand, when the data sets are not associated with outlying wavelengths, this method behaves similarly to the conventional MSC method.</p>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 11","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3589","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141933433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adjusted Pareto Scaling for Multivariate Calibration Models","authors":"Kurt Varmuza, Peter Filzmoser","doi":"10.1002/cem.3588","DOIUrl":"10.1002/cem.3588","url":null,"abstract":"<p>The performance of multivariate calibration models <i>ŷ</i> = f(<b><i>x</i></b>) for the prediction of a numerical property <i>y</i> from a set of <i>x</i>-variables depends on the type of scaling of the <i>x</i>-variables. Common scaling methods are autoscaling (dividing the centered <i>x</i> by its standard deviation <i>s</i>) and Pareto scaling (dividing the centered <i>x</i> by <i>s</i><sup><i>P</i></sup> with <i>P</i> = 0.5). The adjusted Pareto scaling presented here varies the exponent <i>P</i> between 0 (no scaling) and 1 (autoscaling) with the aim of obtaining an optimum prediction performance for <i>ŷ</i>. Related scaling methods based on the variable spread are range scaling and vast scaling; while level scaling is based on the location (central value) of the variable. These scaling methods and robust versions are compared for models created by partial least-squares (PLS) regression. The applied strategy repeated double cross validation (rdCV) evaluates the model performance for test set objects and considers its variability. Results with three data sets from chemistry show: (a) the efficacy of the different scaling methods depends on the data structure; (b) optimization of the Pareto exponent <i>P</i> is recommended; (c) range scaling or vast scaling may be better than adjusted Pareto scaling; (d) in general a heuristic search for the best scaling method is advisable. 
Overall, the consideration of different variants of scaling allows for a flexible adjustment of the variable contributions to the calibration model.</p>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 11","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3588","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141933342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigation of the Physiological and Post-training Effects of Ecdysteroid Supplementation by Multivariate Analysis of the Human Serum Metabolome","authors":"Patrizia Leogrande, Daniel Jardines, Dayamin Martinez Brito, Xavier de la Torre, Francesco Botrè, Andreas Luch, Patrick Diel, Maria Kristina Parr","doi":"10.1002/cem.3594","DOIUrl":"10.1002/cem.3594","url":null,"abstract":"<div><p>This work aims to characterize the serum profile of athletes after the administration of ecdysteroids, natural steroid hormones recently reported to enhance athletic performance. The combination of mass spectrometry and chemometric tools may make it possible to differentiate physiological effects from post-training and intake-driven effects. Serum samples were collected from 46 healthy male volunteers and divided into four groups: control (two capsules/day of Peak Ecdysone without training), placebo (two capsules without ecdysteroids with training), Ec1 (two capsules/day Peak Ecdysone with training), and Ec2 (eight capsules/day Peak Ecdysone with training). Metabolic profiling was measured using a SCIEX Triple Quadrupole LC-MS/MS system coupled with the Biocrates AbsoluteIDQ p180 kit, which allows quantitation of a large panel of metabolites that were subjected to multivariate analysis. Unsupervised analysis of the data found no significant differences between the placebo and the ecdysteroid supplementation groups. By merging Ec1 and Ec2 into a single group, coded as treated, a clear discrimination between the control and placebo groups was observed. Phosphatidylcholines were among the most significant features of ecdysteroids administration, showing a dose-dependent effect in Ec1 and Ec2 groups. 
As specific metabolic phenotypes can result from years of training, the discrimination of physiological effects from those caused by the administration of banned substances can be a valuable analytical strategy for the interpretation of adverse analytical findings in the anti-doping field.</p></div>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 11","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141881287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The MCR-ALS Trilinearity Constraint for Data With Missing Values","authors":"Adrián Gómez-Sánchez, Raffaele Vitale, Pablo Loza-Alvarez, Cyril Ruckebusch, Anna de Juan","doi":"10.1002/cem.3584","DOIUrl":"10.1002/cem.3584","url":null,"abstract":"<p>Trilinearity is a property of some chemical data that leads to unique decompositions when curve resolution or multiway decomposition methods are used. Curve resolution algorithms, such as Multivariate Curve Resolution–Alternating Least Squares (MCR-ALS), can provide trilinear models by implementing the trilinearity condition as a constraint. However, some trilinear analytical measurements, such as excitation–emission matrix (EEM) measurements, usually exhibit systematic patterns of missing data due to the nature of the technique, which imply a challenge to the classical implementation of the trilinearity constraint. In this instance, extrapolation or imputation methodologies may not provide optimal results. Recently, a novel algorithmic strategy to constrain trilinearity in MCR-ALS in the presence of missing data was developed. This strategy relies on the sequential imposition of a classical trilinearity restriction on different submatrices of the original investigated dataset, but, although effective, was found to be particularly slow and requires a proper submatrix selection criterion. In this paper, a much simpler implementation of the trilinearity constraint in MCR-ALS capable of handling systematic patterns of missing data and based on the principles of the Nonlinear Iterative Partial Least Squares (NIPALS) algorithm is proposed. This novel approach preserves the trilinearity of the retrieved component profiles without requiring data imputation or subset selection steps and, as with all other constraints designed for MCR-ALS, offers the flexibility to be applied component-wise or data block-wise, providing hybrid bilinear/trilinear models. 
Furthermore, it can be easily extended to cope with any trilinear or higher-order dataset with whatever pattern of missing values.</p>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 11","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3584","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141881346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"X-Ray Computed Tomography Meets Robust Chemometric Latent Space Modeling for Lean Meat Percentage Prediction in Pig Carcasses","authors":"Puneet Mishra, Maria Font-i-Furnols","doi":"10.1002/cem.3591","DOIUrl":"10.1002/cem.3591","url":null,"abstract":"<p>This study presents a case of processing X-ray computed tomography (CT) data for pork scans using chemometric latent space modeling. The distribution of voxel intensities is shown to exemplify a multivariate, multi-collinear signal mixture. While this concept is not novel, it is revisited here from a chemometric perspective. To extract meaningful information from such multivariate signals, latent space modeling based on partial least squares (PLS) is an ideal solution. Furthermore, a robust PLS approach is even more effective for latent space modeling, as it can extract latent spaces unaffected by outliers, thereby enhancing predictive modeling. As an example, lean meat percentage is predicted using X-ray CT data and robust PLS regression. This method is applicable to X-ray CT quantification analysis, particularly in cases where unclear, erroneous, and outlying observations are suspected in the data.</p>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 10","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3591","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141863474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Determination of Tetracaine and Oxymetazoline in Drugs and Saliva via Potentiometric Sensor Arrays Based on Fluoropolymer/Polyaniline Composites","authors":"Anna Parshina, Anastasia Yelnikova, Valeria Shimbareva, Alla Komogorova, Polina Yurova, Irina Stenina, Olga Bobreshova, Andrey Yaroslavtsev","doi":"10.1002/cem.3583","DOIUrl":"10.1002/cem.3583","url":null,"abstract":"<p>A growing interest in dental practice in intranasal anesthesia using tetracaine and oxymetazoline dictates the need for their simultaneous determination in combination drugs and human saliva. Potentiometric multisensory systems based on perfluorosulfonic acid membranes, including polyaniline-modified ones, were developed for these purposes. A change in the distribution of the sensor sensitivity to the related analytes was achieved by variation of the conditions for concentration polarization at the membrane interface with a studied solution due to a change in the intrapore volume, nature, and availability of the sorption centers, as well as the hydrophilicity of the membrane surface that were specified by the conditions for their synthesis and subsequent hydrothermal treatment. Reversibility of the analyte sorption using the chosen conditions for regeneration provided long-term stable work of both the sensors and the calibration equations established by multivariate linear regression. The membrane modification promoted their resistance to fouling. The relative errors of the simultaneous tetracaine and oxymetazoline determination in the combination drug solutions were no greater than 7% and 11%, while in the artificial saliva solutions, they were 15% and 17%, respectively, when an array of the cross-sensitive sensors based on the composite membranes prepared by different methods was used. 
The analysis errors were reduced to 3%–6% when analyzing the drug and to 0.2%–6% when analyzing the artificial saliva if an array was organized with the sensors based on the membrane with the dopant and the membrane without it, due to the decreasing correlation between their responses.</p>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 10","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141863476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prediction of Flash Point of Materials Using Bayesian Kernel Machine Regression Based on Gaussian Processes With LASSO-Like Spike-and-Slab Hyperprior","authors":"Keunhong Jeong, Ji Hyun Nam, Seul Lee, Jahyun Koo, Jooyeon Lee, Donghyun Yu, Seongil Jo, Jaeoh Kim","doi":"10.1002/cem.3586","DOIUrl":"10.1002/cem.3586","url":null,"abstract":"<p>The determination of flash points is a critical aspect of chemical safety, essential for assessing explosion hazards and fire risks associated with flammable solutions. With the advent of new chemical blends and the increasing complexity of chemical waste management, the need for accurate and reliable flash point prediction methods has become more pronounced. This study introduces a novel predictive approach using Bayesian kernel machine regression (BKMR) with Gaussian process priors, designed to meet the growing demand for precise flash point estimation in the context of chemical safety. The BKMR model, underpinned by Bayesian statistics, offers a comprehensive framework that not only quantifies prediction uncertainty but also enhances interpretability amidst experimental data variability. Our comparative analysis reveals that BKMR surpasses traditional predictive models, including support vector machines, kernel ridge regression, and Gaussian process regression, in terms of accuracy and reliability across multiple metrics. By elucidating the intricate interactions between molecular features and flash point properties, the BKMR model provides profound insights into the chemical dynamics that influence flash point determinations. 
This study signifies a methodological leap in flash point prediction, offering a valuable tool for chemical safety analysis and contributing to the development of safer chemical handling and storage practices.</p>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 10","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3586","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141872860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extreme Learning Machine Combined With Whale Optimization Algorithm for Spectral Quantitative Analysis of Complex Samples","authors":"Yuxia Liu, Hao Sun, Chunyan Zhao, Changkun Ai, Xihui Bian","doi":"10.1002/cem.3590","DOIUrl":"10.1002/cem.3590","url":null,"abstract":"<div><p>Extreme learning machine (ELM) is combined with the discretized whale optimization algorithm (WOA) for spectral quantitative analysis of complex samples. In this method, the spectral variables selected by the discretized WOA were used to build the ELM model. Before establishing the model, the activation function and the number of hidden nodes in ELM as well as the transfer function of the discretized WOA are determined. Furthermore, the predictive performance of the full-spectrum partial least squares (PLS), ELM, and WOA-ELM models was compared with four complex sample datasets: blood, light gas oil and diesel fuels, ternary mixture, and corn samples using root mean square error of prediction (RMSEP) and correlation coefficient (R). The results show that the WOA-ELM model has the best prediction accuracy compared to full-spectrum PLS and ELM models. Therefore, the proposed method provides a novel approach for the quantitative analysis of complex samples.</p></div>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 10","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141777305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}