{"title":"Quantification of metformin in pharmaceutical formulations using Vis-SWNIR hyperspectral imaging combined with multivariate curve resolution","authors":"Zahra Bolhassani , Ali Aghaei , Hadi Parastar","doi":"10.1016/j.chemolab.2025.105469","DOIUrl":"10.1016/j.chemolab.2025.105469","url":null,"abstract":"<div><div>Hyperspectral imaging (HSI), integrated with chemometrics, is a powerful tool in process analytical technology (PAT) for accurately determining the concentration of metformin hydrochloride, the active pharmaceutical ingredient (API), during the manufacturing of extended-release tablets. This study evaluated the effectiveness of visible-short wavelength near infrared (Vis-SWNIR) HSI (400–950 nm) combined with multivariate curve resolution-alternating least squares (MCR-ALS) for determining the API in a six-component mixture, which included excipients such as hydroxypropyl methylcellulose (HPMC), polyvinylpyrrolidone (PVP), and microcrystalline cellulose. Both linear and nonlinear calibration models were assessed. Partial least squares regression (PLSR) with the API concentration profile from MCR-ALS yielded promising calibration metrics, with a root mean square error of prediction (RMSEP) of 6.1 % w/w and a coefficient of determination (R<sup>2</sup>p) of 0.94, based on a set covering an API range of 0.0–70.6 % w/w. The method also demonstrated promising figures of merit (FOMs), including a limit of detection (LOD) of 4.7 % w/w and a limit of quantification (LOQ) of 14.2 % w/w, indicating its effectiveness in detecting API levels below standard thresholds. Further improvement was achieved using support vector machine (SVM) with radial basis function (RBF), enhancing RMSEP to 5.6 % w/w and R<sup>2</sup>p to 0.98, aiming to evaluate a non-linear method as a proof of concept. 
The study concluded that Vis-SWNIR HSI combined with chemometrics provides an effective and non-destructive method for determining the correct API concentration in powder blends during blending, without the need for sample preparation.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"264 ","pages":"Article 105469"},"PeriodicalIF":3.7,"publicationDate":"2025-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144289011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wine variety traceability by data fusion of near-infrared(NIR) spectroscopy and mid-infrared (MIR) spectroscopy combined with GAF and ResNet","authors":"Fangyi Xie , Xiaoxuan Chen , Yifan Jing , Mei Li , Junhui Li , Longlian Zhao","doi":"10.1016/j.chemolab.2025.105468","DOIUrl":"10.1016/j.chemolab.2025.105468","url":null,"abstract":"<div><div>Wine varietal authentication is critical for ensuring quality and preventing market fraud. Spectroscopic methods are commonly used for rapid and nondestructive analysis of wine varieties, but they typically rely on one-dimensional (1D) spectrum inputs and conventional machine learning algorithms, which limits the capture of complex compositional features and reduces classification accuracy. To address these challenges, this study proposes a novel wine varietal tracing approach integrating NIR and MIR spectroscopy, Gramian Angular Field (GAF) image encoding, and ResNet-based deep learning. The NIR and MIR spectra of 172 wine samples from three grape varieties (Cabernet Sauvignon, Merlot, and Cabernet Gernischt) were collected. Spectral fusion and data augmentation were used to construct fused spectra and expand the dataset to 1720 samples. The 1D NIR, MIR, and fused spectra were then transformed into two-dimensional (2D) images using GAF encoding, creating Gramian Angular Difference Field (GADF) and Gramian Angular Summation Field (GASF) representations. Each set of images was fed separately into a deep residual network (ResNet) for varietal classification. The designed ResNet architecture incorporates residual blocks and attention mechanisms, significantly enhancing feature extraction and classification performance. Experimental results show that the proposed model achieves 100 % classification accuracy on the test set, outperforming traditional machine learning methods and 1D-CNN. 
The results indicate that integrating infrared spectroscopy, GAF image encoding, and ResNet is a feasible approach for wine varietal tracing, providing a novel solution for food quality tracing analysis.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"264 ","pages":"Article 105468"},"PeriodicalIF":3.7,"publicationDate":"2025-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144290569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating subspace clustering techniques for effective and efficient dimensionality reduction in hyperspectral imaging","authors":"P. Shahnas , S. Malathy","doi":"10.1016/j.chemolab.2025.105463","DOIUrl":"10.1016/j.chemolab.2025.105463","url":null,"abstract":"<div><div>Hyperspectral image processing (HSI) is a critical task in remote sensing, medical imaging, and various other fields due to its capacity to capture detailed spectral information across numerous bands. However, the high dimensionality of hyperspectral data often leads to increased computational burden and complexity. This study investigates the use of subspace clustering techniques for dimensionality reduction in hyperspectral images, focusing on methods such as sparse subspace clustering (SSC), low-rank representation (LRR), and spectral clustering. These techniques are evaluated for their ability to preserve both spectral and spatial features while reducing data dimensionality. Through detailed comparison, the study finds that SSC offers superior performance in terms of classification accuracy and computational efficiency, particularly in handling the intricate patterns in high-dimensional hyperspectral image datasets. 
The insights gained from this analysis contribute to a better understanding of the strengths and limitations of different subspace clustering methods, providing valuable guidance for future advancements in image processing and hyperspectral data analysis.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"264 ","pages":"Article 105463"},"PeriodicalIF":3.7,"publicationDate":"2025-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144289010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How to overfit","authors":"Rasmus Bro","doi":"10.1016/j.chemolab.2025.105461","DOIUrl":"10.1016/j.chemolab.2025.105461","url":null,"abstract":"<div><div>Overfitting remains a central challenge in modern data science, particularly as complex analytical tools become more accessible and widely applied in fields like chemometrics. This communication outlines a series of common pitfalls that lead to misleading and non-generalizable models – ranging from poor data quality and insufficient sample sizes to misuse of validation strategies and overly complex modeling choices. By illustrating a caricatured protocol for generating bad models, the paper emphasizes the importance of domain knowledge, appropriate experimental design, and rigorous validation. It advocates for “validity by design” as a proactive strategy to ensure robust, interpretable, and scientifically sound results.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"264 ","pages":"Article 105461"},"PeriodicalIF":3.7,"publicationDate":"2025-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144298478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel machine-learning approach to unlock technical lignin classification by NIR spectroscopy - bench to handheld","authors":"Friedrich Fink , Tomasz M. Stawski , Franziska Emmerling , Jana Falkenhagen","doi":"10.1016/j.chemolab.2025.105467","DOIUrl":"10.1016/j.chemolab.2025.105467","url":null,"abstract":"<div><div>In this research, the utilization of near-infrared (NIR) spectroscopy in conjunction with advanced machine learning methods is investigated for categorizing technical lignins obtained from different biomass sources and industrial procedures. Technical lignins, such as kraft, organosolv and lignosulfonates, have different chemical compositions, which continue to make uniform characterization and application in sustainable sectors extremely difficult. Fast, universally accessible analytics combined with data analysis is still an open question. For the first time three distinct NIR spectrometers—a high-performance benchtop system, a mid-priced compact device, and an economical handheld unit—were utilized to record NIR spectra of 31 unique lignin samples. The spectra underwent pre-processing through standard normal variate (SNV) transformation and Savitzky-Golay derivatives to amplify spectral features and decrease noise. Principal component analysis (PCA) was employed to reduce data complexity and extract crucial characteristics for classification purposes. Subsequently, four machine learning algorithms—Support Vector Machines (SVM), Gaussian Naive Bayes (GNB), Gaussian Process Classification (GPC), and Decision Tree Classification (DTC)—were implemented for the classification of the lignin samples. The DTC model exhibited the highest accuracy among them across different spectrometers. Although the benchtop spectrometer produced the most precise outcomes, the compact NeoSpectra system also displayed potential as a cost-efficient option. 
Nonetheless, the restricted spectral coverage of the handheld NIRONE spectrometer resulted in reduced classification accuracy. Our discoveries highlight the capability of NIR spectroscopy, combined with robust data analysis techniques, for the swift and non-destructive classification of technical lignins, facilitating their improved utilization in sustainable fields.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"264 ","pages":"Article 105467"},"PeriodicalIF":3.7,"publicationDate":"2025-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144298477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A mutual-guided co-attention mechanism and heterogeneous attribute graph-based framework for drug-drug interaction event prediction","authors":"Linqian Zhao , Junliang Shang , Xiaoqi Tang , Xiaotong Kong , Yan Sun , Jin-Xing Liu","doi":"10.1016/j.chemolab.2025.105440","DOIUrl":"10.1016/j.chemolab.2025.105440","url":null,"abstract":"<div><div>The combined use of multiple drugs helps alleviate patient resistance and enhances therapeutic efficacy. Nevertheless, this treatment strategy can also result in adverse side effects, which may compromise patient safety. Therefore, identifying potential drug-drug interactions (DDIs) and investigating their underlying mechanisms are of great significance. Existing methods predominantly predict whether drug pairs interact or whether drug-drug interaction events (DDIEs) occur, while few studies aim to reveal the specific risk levels of DDIEs, which are crucial for developing clinical medication strategies and personalized therapies. Based on this, we propose a DDIE risk level prediction method, named MCAHG-DDI, which integrates a mutual-guided co-attention mechanism with heterogeneous attribute graph learning. Specifically, we integrate the heterogeneous attribute graph with the SMILES sequences of drugs, leveraging a mutual-guided co-attention mechanism to extract the initial features of the drugs, which are subsequently input into a heterogeneous graph convolution network and a heterogeneous edge convolution network for advanced learning. Finally, we design a gated fusion mechanism to obtain the final embedding representations of the drugs. Experimental results demonstrate that MCAHG-DDI outperforms the baseline models in both binary and multi-class classification tasks. 
Ablation studies and case analyses further validate the superiority of the proposed model.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"264 ","pages":"Article 105440"},"PeriodicalIF":3.7,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144239970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An effective MW-PCP-based intermittent fault detection method via sparse matrix factorization and noise reduction","authors":"Jiayi Chen, Zhangming He, Juhui Wei, Jiongqi Wang, Xuanying Zhou","doi":"10.1016/j.chemolab.2025.105439","DOIUrl":"10.1016/j.chemolab.2025.105439","url":null,"abstract":"<div><div>Intermittent faults (IFs) usually have a significant impact on systems, and their detection is challenging because of their short duration and randomness. IF detection has received widespread attention, but few studies have focused on leveraging the unique characteristics of IFs. Because of their short and limited duration, IFs are sparse in the process data matrix. This paper proposes an effective Moving Window-Principal Component Pursuit (MW-PCP)-based IF detection method that exploits this sparsity to detect IFs accurately. First, the PCP method is used to decompose the process data matrix, yielding a sparse matrix that contains the IFs and sparse process noise. Second, the MW technique is combined with PCP to reduce noise interference and capture fault information accurately. Hotelling’s <span><math><msup><mrow><mi>T</mi></mrow><mrow><mn>2</mn></mrow></msup></math></span> statistic is then used to detect IFs efficiently. In addition, we provide a detectability analysis of IFs under the proposed method with a detailed proof, including the definition of detectability and the necessary and sufficient conditions. Finally, several experiments show that the MW-PCP-based method outperforms existing methods, including Principal Component Analysis (PCA), MW-PCA, and PCP. 
Specifically, it achieved Fault Detection Rates (FDRs) of 97.3<span><math><mtext>%</mtext></math></span>, 85.8<span><math><mtext>%</mtext></math></span>, and 75<span><math><mtext>%</mtext></math></span> in numerical simulation, the CSTR process, and the Cranfield Multiphase Flow Facility dataset, respectively.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"264 ","pages":"Article 105439"},"PeriodicalIF":3.7,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144263307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Smart portable electronic nose system in combination with machine learning algorithms for the intelligent discrimination of fire debris","authors":"José Luis P. Calle , Tomasz Dymerski , Marta Ferreiro-González , Miguel Palma","doi":"10.1016/j.chemolab.2025.105459","DOIUrl":"10.1016/j.chemolab.2025.105459","url":null,"abstract":"<div><div>The identification and discrimination of ignitable liquid residues (ILRs) in fire debris are crucial in forensic research for determining the intentionality of a fire. This study presents a new method using a portable sensor-based electronic nose (eNose) combined with machine learning (ML) algorithms for automated ILR classification. Six substrates (vinyl, nylon, linoleum, polyester, wood, and cotton) were burned with different ignitable liquids (gasoline, diesel, ethanol, and charcoal starter with kerosene), and samples were collected at intervals from 0 to 48 h after the fire had extinguished. Sensor responses from multiple sensors (SO<sub>2</sub>, H<sub>2</sub>S, CO, IRR, NO<sub>2</sub>, TBM, NH<sub>3</sub>, and ethanol) were collected over a duration of 140 s. The data were preprocessed using the first derivative and Savitzky-Golay filter, followed by low-level data fusion. A variable selection using the Boruta algorithm was applied, and both reduced and non-reduced matrices were used to train ML models. For detecting the presence of ILRs, random forest (RF) and support vector machine (SVM) models achieved 100 % accuracy. For discriminating between ILR types, the best performance was achieved by the RF model using the reduced matrix, correctly classifying 94.44 % of the samples. Only four sensors (SO<sub>2</sub>, H<sub>2</sub>S, CO, IRR) were necessary, indicating the potential for an optimized eNose design. This method offers advantages over traditional techniques, such as faster analysis, lower cost, and greater portability. 
Additionally, a web application was developed to allow users to automatically characterize fire debris using the algorithms.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"264 ","pages":"Article 105459"},"PeriodicalIF":3.7,"publicationDate":"2025-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144239971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Partial Least Squares models under skew-normal and skew-t settings with applications","authors":"Andrés F. Ochoa-Muñoz , Javier E. Contreras-Reyes , Jaime Mosquera , Rodrigo Salas","doi":"10.1016/j.chemolab.2025.105438","DOIUrl":"10.1016/j.chemolab.2025.105438","url":null,"abstract":"<div><div>In this work, a new Partial Least Square (PLS) model based on skew-normal (SN) and skew-<span><math><mi>t</mi></math></span> (ST) distributions is proposed. This new PLS model may be of interest for applications requiring regression with an asymmetric response variable, heavy-tails, and <span><math><mi>R</mi></math></span> support. Furthermore, like PLS, the PLS-SN and PLS-ST address the multicollinearity problem by finding the PLS components that are orthogonal to each other and maximize the covariance between the response variable and PLS components. Simulation studies were conducted to compare the goodness of fit of PLS-SN and PLS-ST models versus the PLS one, using datasets with different sample sizes. Additionally, two real-world data applications were performed, where more favorable information criteria values were found with the PLS-SN and PLS-ST models compared to the PLS one.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"264 ","pages":"Article 105438"},"PeriodicalIF":3.7,"publicationDate":"2025-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144229862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of continuous wavelet transform and a novel hybrid approach based on discrete wavelet transform with principal component analysis and fuzzy inference system for the concurrent spectrophotometric analysis of cardiovascular drugs in biological samples","authors":"Maryam Sharifi Mikal, Mahmoud Reza Sohrabi, Mandana Saber Tehrani, Saeed Mortazavi Nik","doi":"10.1016/j.chemolab.2025.105460","DOIUrl":"10.1016/j.chemolab.2025.105460","url":null,"abstract":"<div><div>In this study, chemometric-assisted UV-spectrophotometric methods using continuous wavelet transform (CWT) and discrete wavelet transform (DWT) integrated with principal component analysis (PCA) and fuzzy inference system (FIS) were developed for the simultaneous determination of losartan (LOS) and diltiazem (DIL) in binary mixtures and urine samples without any separation process. In the CWT, the best zero crossing point was obtained through the Coiflet wavelet with an order of 2 (coif2) at a wavelength of 218 nm for LOS and Daubechies wavelet with an order of 2 (db2) at a wavelength of 242 nm for DIL. The linearity range was 1–9 μg/mL for LOS, while for DIL it was 6–18 μg/mL. The LOD was 0.5924 and 1.5416 μg/mL, while the LOQ was 1.7951 and 4.5454 μg/mL for LOS and DIL, respectively. The analysis of laboratory mixtures using CWT demonstrated mean recovery values equal to 98.07 % for LOS and 99.38 % for DIL, where the root mean square error was 0.2376 for LOS and 0.2523 for DIL. In DWT, the decomposition of absorption of mixtures was performed using biorthogonal (bior1.5), db2, and Demeyer (DM) at five levels, and their outputs were dimensionally reduced via PCA to serve as the input of the FIS. The DM wavelet, with mean recoveries of 100.12 % and 100.02 % and RMSEs of 0.0075 and 0.0105 for LOS and DIL, respectively, was selected as the best wavelet. 
The analysis of LOS and DIL in biological samples using the suggested methods gave RSD < 1.6 % and mean recovery > 94 %, and the results were compared with HPLC using the ANOVA test. These spectrophotometry-based chemometric methods are economical, quick, easy, and reliable alternatives to available techniques for quality control laboratories.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"264 ","pages":"Article 105460"},"PeriodicalIF":3.7,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144212300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}