{"title":"X-Ray Computed Tomography Meets Robust Chemometric Latent Space Modeling for Lean Meat Percentage Prediction in Pig Carcasses","authors":"Puneet Mishra, Maria Font-i-Furnols","doi":"10.1002/cem.3591","DOIUrl":"10.1002/cem.3591","url":null,"abstract":"<p>This study presents a case of processing X-ray computed tomography (CT) data for pork scans using chemometric latent space modeling. The distribution of voxel intensities is shown to exemplify a multivariate, multi-collinear signal mixture. While this concept is not novel, it is revisited here from a chemometric perspective. To extract meaningful information from such multivariate signals, latent space modeling based on partial least squares (PLS) is an ideal solution. Furthermore, a robust PLS approach is even more effective for latent space modeling, as it can extract latent spaces unaffected by outliers, thereby enhancing predictive modeling. As an example, lean meat percentage is predicted using X-ray CT data and robust PLS regression. This method is applicable to X-ray CT quantification analysis, particularly in cases where unclear, erroneous, and outlying observations are suspected in the data.</p>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 10","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3591","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141863474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Determination of Tetracaine and Oxymetazoline in Drugs and Saliva via Potentiometric Sensor Arrays Based on Fluoropolymer/Polyaniline Composites","authors":"Anna Parshina, Anastasia Yelnikova, Valeria Shimbareva, Alla Komogorova, Polina Yurova, Irina Stenina, Olga Bobreshova, Andrey Yaroslavtsev","doi":"10.1002/cem.3583","DOIUrl":"10.1002/cem.3583","url":null,"abstract":"<p>A growing interest in dental practice in intranasal anesthesia using tetracaine and oxymetazoline dictates the need for their simultaneous determination in combination drugs and human saliva. Potentiometric multisensory systems based on perfluorosulfonic acid membranes, including polyaniline-modified ones, were developed for these purposes. A change in the distribution of the sensor sensitivity to the related analytes was achieved by variation of the conditions for concentration polarization at the membrane interface with a studied solution due to a change in the intrapore volume, nature, and availability of the sorption centers, as well as the hydrophilicity of the membrane surface that were specified by the conditions for their synthesis and subsequent hydrothermal treatment. Reversibility of the analyte sorption using the chosen conditions for regeneration provided long-term stable work of both the sensors and the calibration equations established by multivariate linear regression. The membrane modification promoted their resistance to fouling. The relative errors of the simultaneous tetracaine and oxymetazoline determination in the combination drug solutions were no greater than 7% and 11%, while in the artificial saliva solutions, they were 15% and 17%, respectively, when an array of the cross-sensitive sensors based on the composite membranes prepared by different methods was used. 
The analysis errors were reduced to 3%–6% when analyzing the drug and to 0.2%–6% when analyzing the artificial saliva if an array was organized with the sensors based on the membrane with the dopant and the membrane without it, due to the decreasing correlation between their responses.</p>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 10","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141863476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prediction of Flash Point of Materials Using Bayesian Kernel Machine Regression Based on Gaussian Processes With LASSO-Like Spike-and-Slab Hyperprior","authors":"Keunhong Jeong, Ji Hyun Nam, Seul Lee, Jahyun Koo, Jooyeon Lee, Donghyun Yu, Seongil Jo, Jaeoh Kim","doi":"10.1002/cem.3586","DOIUrl":"10.1002/cem.3586","url":null,"abstract":"<p>The determination of flash points is a critical aspect of chemical safety, essential for assessing explosion hazards and fire risks associated with flammable solutions. With the advent of new chemical blends and the increasing complexity of chemical waste management, the need for accurate and reliable flash point prediction methods has become more pronounced. This study introduces a novel predictive approach using Bayesian kernel machine regression (BKMR) with Gaussian process priors, designed to meet the growing demand for precise flash point estimation in the context of chemical safety. The BKMR model, underpinned by Bayesian statistics, offers a comprehensive framework that not only quantifies prediction uncertainty but also enhances interpretability amidst experimental data variability. Our comparative analysis reveals that BKMR surpasses traditional predictive models, including support vector machines, kernel ridge regression, and Gaussian process regression, in terms of accuracy and reliability across multiple metrics. By elucidating the intricate interactions between molecular features and flash point properties, the BKMR model provides profound insights into the chemical dynamics that influence flash point determinations. 
This study signifies a methodological leap in flash point prediction, offering a valuable tool for chemical safety analysis and contributing to the development of safer chemical handling and storage practices.</p>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 10","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3586","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141872860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extreme Learning Machine Combined With Whale Optimization Algorithm for Spectral Quantitative Analysis of Complex Samples","authors":"Yuxia Liu, Hao Sun, Chunyan Zhao, Changkun Ai, Xihui Bian","doi":"10.1002/cem.3590","DOIUrl":"10.1002/cem.3590","url":null,"abstract":"<div><p>Extreme learning machine (ELM) is combined with the discretized whale optimization algorithm (WOA) for spectral quantitative analysis of complex samples. In this method, the spectral variables selected by the discretized WOA were used to build the ELM model. Before establishing the model, the activation function and the number of hidden nodes in ELM as well as the transfer function of the discretized WOA are determined. Furthermore, the predictive performance of the full-spectrum partial least squares (PLS), ELM, and WOA-ELM models was compared with four complex sample datasets: blood, light gas oil and diesel fuels, ternary mixture, and corn samples using root mean square error of prediction (RMSEP) and correlation coefficient (R). The results show that the WOA-ELM model has the best prediction accuracy compared to full-spectrum PLS and ELM models. Therefore, the proposed method provides a novel approach for the quantitative analysis of complex samples.</p></div>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 10","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141777305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nondestructive Identification of Wheat Seed Variety and Geographical Origin Using Near-Infrared Hyperspectral Imagery and Deep Learning","authors":"Apurva Sharma, Tarandeep Singh, Neerja Mittal Garg","doi":"10.1002/cem.3585","DOIUrl":"10.1002/cem.3585","url":null,"abstract":"<div><p>Seed purity assurance is an important aspect of maintaining the quality standards of wheat seeds. It relies significantly on quality parameters, like varietal classification and geographical origin identification. Hyperspectral imaging (HSI) has emerged as an advanced nondestructive technique to determine various quality parameters. In recent years, several studies have utilized HSI for varietal classification, although a limited number of varieties were considered. Additionally, no attention has been paid to determining the geographical origin of wheat seeds. To address these gaps, two separate experiments were performed for varietal classification and geographical origin identification. The seeds from 96 varieties grown across 5 different agricultural regions in India were collected. Hyperspectral images of wheat seeds were acquired in the wavelength ranging 900–1700 nm. The spectral reflectance values were obtained from the region of interest (ROI) corresponding to each seed. Subsequently, the deep learning models (convolutional neural networks [CNNs]) were established and compared with two conventional algorithms, including support vector machines (SVMs) and K-nearest neighbors (KNNs). The experimental results indicated that the proposed CNN models outperformed the SVM and KNN models, achieving an overall accuracy of 94.88% and 99.02% for varietal classification and geographical origin identification, respectively. These results demonstrate that HSI combined with deep learning has the potential to accurately classify a large number of wheat varieties. Moreover, HSI can be used to precisely identify the geographical origins of wheat seeds. 
This study provides an accurate and nondestructive method that can assist in breeding, quality evaluation, and the development of high-quality wheat seeds.</p></div>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 10","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141738629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Novelty and Similarity: Detection Using Data-Driven Soft Independent Modeling of Class Analogy","authors":"O. Y. Rodionova, N. I. Kurysheva, G. A. Sharova, A. L. Pomerantsev","doi":"10.1002/cem.3587","DOIUrl":"10.1002/cem.3587","url":null,"abstract":"<div><p>Novelty and similarity are complex concepts that have numerous applications in various fields, including biology and medicine. Novelty detection is a technique used to determine whether a dataset is different from another dataset considered as a standard. Similarity detection is a technique used to determine whether two datasets belong to the same population. Novelty and similarity are closely related concepts; however, they are not complementary. Novelty is a much more popular one, and there are many publications about it. Similarity is, in fact, a new concept that has not yet been explored in depth. Classical statistics offers a large number of tools suitable for detection of similarity, mostly in the univariate case. At the same time, this topic has been insufficiently studied in the field of machine learning. This paper suggests several principles which are important for this research and also present a method for the detection of both novelty and similarity. The method uses a one-class classifier, known as Data-Driven Soft Independent Modeling of Class Analogy (DD-SIMCA). Three examples illustrate our approach. The first one uses simulated data and demonstrates the performance of DD-SIMCA for the detection of novelty. The second example uses a real-world data and studies similarity of two groups of patients who participate in the evaluation of the effectiveness of the treatment of primary angle-closure glaucoma. The third example comes from medical diagnostics. 
This is a real-world publicly available data used for comparison of various classification algorithms.</p></div>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 10","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141738630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Permutation Strategies for Inference in ANOVA-Based Models for Nonorthogonal Designs Including Continuous Covariates","authors":"Morten A. Rasmussen, Bekzod Khakimov, Jasper Engel, Jeroen Jansen","doi":"10.1002/cem.3580","DOIUrl":"10.1002/cem.3580","url":null,"abstract":"<p>Analysis of variance and linear models is undoubtedly one of the most useful statistical contributions to experimental and observational science. With the ability to characterize a system through multivariate responses, these methods have emerged to be general tools regardless of response dimensionality. Contemporary methods for establishing statistical inference, such as ANOVA simultaneous component analysis (ASCA), are based on Monte Carlo sampling; however, a flat uniform resampling scheme may violate the structure of the uncertainty for unbalanced designs as well as for observational data. In this work, we provide permutation strategies for inferential testing for unbalanced designs including interaction models and establish nonuniform randomization based on the concept of propensity score matching. Lastly, we provide a general method for modelling continuous covariates based on kernel smoothers. All methods are characterized on their ability to provide unbiased Type I error results.</p>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 10","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3580","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141738631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comprehensive tutorial on data-driven SIMCA: Theory and implementation in web","authors":"Sergey Kucheryavskiy, Oxana Rodionova, Alexey Pomerantsev","doi":"10.1002/cem.3560","DOIUrl":"https://doi.org/10.1002/cem.3560","url":null,"abstract":"","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 7","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141597073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ATR-FTIR Spectroscopy Preprocessing Technique Selection for Identification of Geographical Origins of Gastrodia elata Blume","authors":"Hong Liu, Honggao Liu, Jieqing Li, Yuanzhong Wang","doi":"10.1002/cem.3579","DOIUrl":"10.1002/cem.3579","url":null,"abstract":"<div><p><i>Gastrodia elata</i> Blume from different regions varies in growth conditions, soil types, and climate, which directly affects the content and quality of its medicinal components. Accurately identifying the origin can effectively ensure the medicinal value of <i>G. elata</i> Bl., prevent the circulation of counterfeit products, and thus protect the interests and health of consumers. Attenuated total reflectance Fourier transform infrared (ATR-FTIR) spectroscopy is a rapid and effective method for verifying the authenticity of traditional Chinese medicines. However, the presence of scattering effects in the spectra poses challenges in establishing reliable discrimination models. Therefore, employing appropriate scattering correction techniques is crucial for improving the quality of spectral data and the accuracy of discrimination models. This study uses two ensemble preprocessing approaches; the first type is series fusion of scatter correction technologies (SCSF), and another method is sequential preprocessing through orthogonalization (SPORT). Four discriminant models were established using a single scattering correction technique and two ensemble preprocessing approaches. The results show that the data-driven version of the soft independent modeling of class analogy (DD-SIMCA) model built based on multiplicative scatter correction (MSC) preprocessing has a sensitivity of 0.98 and a specificity of 0.91, able to effectively distinguish whether a sample of <i>G. elata</i> Bl. originates from Zhaotong. 
In addition, three discriminant models including support vector machine (SVM), partial least squares discriminant analysis (PLS-DA), and three gradient boosting machine (GBM) algorithms built using the ensemble preprocessing approach have good classification and generalization capabilities. Among them, the SCSF-PLS-DA model has the best performance with 99.68% and 98.08% accuracy for the training and test sets, respectively, and F1 of 0.97; the SPORT-SVM model achieved the second-best classification ability. The results show that the ensemble preprocessing approach used can improve the success rate of <i>G. elata</i> Bl. geographical origin classification.</p></div>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 10","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141551403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Firefly Interval Selection Combined With Extreme Learning Machine for Spectral Quantification of Complex Samples","authors":"Shuyu Wang, Xudong Zhang, Prisca Mpango, Hao Sun, Xihui Bian","doi":"10.1002/cem.3578","DOIUrl":"10.1002/cem.3578","url":null,"abstract":"<div><p>Firefly algorithm (FA) combined with extreme learning machine (ELM) is developed for spectral interval selection and quantitative analysis of complex samples. The method firstly segments the spectra into a certain number of intervals. Vectors with 1 and 0, which represent the interval selected or not, are used as the inputs of the FA. The RMSEP value predicted by ELM model is used as the fitness function of the FA. The activation function and number of hidden layer nodes of ELM, number of spectral intervals, population number, environmental absorbance, and constant of FA are optimized. The predictive performance of FA-ELM is compared with full-spectrum PLS, ELM, genetic algorithm-ELM (GA-ELM), and particle swarm optimization-ELM (PSO-ELM) by one ultraviolet (UV) spectrum dataset of gasoil and three near-infrared (NIR) spectral datasets of corn, wheat, and tablet samples. 
The results show that FA-ELM has a better performance compared with its competitors in predicting monoaromatics, water, wheat kernel texture, and active pharmaceutical ingredients (APIs) in gasoil, corn, wheat, and tablet samples.</p></div>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":"38 9","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141511200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}