{"title":"DeepSMOTE with Laplacian matrix decomposition for imbalance instance fault diagnosis","authors":"Yuan Xu, Rui-Ze Fan, Yan-Lin He, Qun-Xiong Zhu, Yang Zhang, Ming-Qing Zhang","doi":"10.1016/j.chemolab.2025.105338","DOIUrl":"10.1016/j.chemolab.2025.105338","url":null,"abstract":"<div><div>In industrial environments, the unpredictability and irreproducibility of faults often result in insufficient sample sizes and atypical data features, significantly increasing the challenges faced by traditional fault diagnosis methods. To address these issues, this paper proposes a novel fault diagnosis approach that integrates the Borderline embedded deep synthetic minority oversampling technique (BE-DeepSMOTE) with Laplacian matrix decomposition, with the aim of tackling fault identification problems in imbalanced data scenarios. BE-DeepSMOTE employs a deep encoder–decoder framework to enable end-to-end learning and reconstruction of multi-dimensional features. It further incorporates the Borderline SMOTE technique to oversample minority class instances in the feature space, thereby enhancing their representation while ensuring statistical consistency with the original dataset to mitigate data imbalance. Furthermore, we introduce an ensemble classifier that combines Adaboost with Laplacian matrix decomposition. This ensemble classifier leverages the synergy of multiple weak classifiers to extract geometric properties and graph structure similarities from the data, while employing an adaptive weighting mechanism to improve the diagnostic accuracy. 
Experimental results from two industrial processes demonstrate that the proposed approach significantly enhances the diagnostic accuracy and stability in imbalanced instance environments.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"259 ","pages":"Article 105338"},"PeriodicalIF":3.7,"publicationDate":"2025-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143403647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved salp swarm optimization algorithm based on a robust search strategy and a novel local search algorithm for feature selection problems","authors":"Mahdieh Khorashadizade, Elham Abbasi, Seyed Abolfazl Shahzadeh Fazeli","doi":"10.1016/j.chemolab.2025.105343","DOIUrl":"10.1016/j.chemolab.2025.105343","url":null,"abstract":"<div><div>A major challenge in data science and data mining for knowledge extraction is dealing with a very large number of data dimensions, because extracting knowledge from such data becomes more complex and memory-consuming. Not only does the presence of all features fail to help the learning process, but it can also decrease the model's efficiency. To enhance the model's efficiency and reduce the problem's complexity, various feature selection algorithms have been designed and implemented. In this paper, a novel and highly effective algorithm based on the salp swarm optimization algorithm for solving complex problems is proposed for feature selection. The proposed method models an unexpected event that causes the chain to break apart (such as hitting an obstacle or the death of the chain leader), which is not taken into account in the salp swarm optimization algorithm. The exploration capability is also improved by modifying the position update of the chain leader. Additionally, an innovative local search algorithm has been embedded into the proposed algorithm to enhance its exploitation. The proposed approach is evaluated on 14 datasets, and the results are compared in terms of classification accuracy and number of selected features. Additionally, the effectiveness of the proposed method is tested on 2 widely used chemical datasets. The modifications applied to the standard salp swarm algorithm reduce the probability of getting stuck in a local optimum and simultaneously increase the diversity of solutions. 
The results show that the proposed algorithm has performed significantly better than other algorithms in solving the feature selection problem.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"258 ","pages":"Article 105343"},"PeriodicalIF":3.7,"publicationDate":"2025-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143377665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gaussian mixture model clustering allows accurate semantic image segmentation of wheat kernels from near-infrared hyperspectral images","authors":"Andreas Kartakoullis , Nicola Caporaso , Martin B. Whitworth , Ian D. Fisk","doi":"10.1016/j.chemolab.2025.105341","DOIUrl":"10.1016/j.chemolab.2025.105341","url":null,"abstract":"<div><div>In this study, an ad-hoc image processing pipeline has been developed and proposed for the purpose of semantically segmenting wheat kernel data acquired through near-infrared hyperspectral imaging (HSI). The Gaussian Mixture Model (GMM), characterized as a soft clustering method, has been employed for this task, yielding noteworthy results in both kernel and germ segmentation. A comparative analysis was conducted, wherein GMM was compared with two hard clustering methods, hierarchical clustering and k-means, as well as other common clustering algorithms prevalent in food HSI applications. Notably, GMM exhibited the highest accuracy, with a Jaccard index of 0.745, surpassing hierarchical clustering at 0.698 and k-means at 0.652. Furthermore, the spectral variations observed in wheat kernel topology can be used for semantic image segmentation, especially in the context of selecting the germ portion within the wheat kernels. 
These findings carry practical significance for professionals in the fields of hyperspectral imaging (HSI) and machine vision, particularly for food product quality assessment and real-time inspection.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"259 ","pages":"Article 105341"},"PeriodicalIF":3.7,"publicationDate":"2025-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143421366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classification of aluminum alloy using laser-induced breakdown spectroscopy combined with discriminative restricted Boltzmann machine","authors":"Yujia Dai , Qing Ma , Tingsong Zhang , Shangyong Zhao , Lu Zhou , Xun Gao , Ziyuan Liu","doi":"10.1016/j.chemolab.2025.105342","DOIUrl":"10.1016/j.chemolab.2025.105342","url":null,"abstract":"<div><div>Laser-Induced Breakdown Spectroscopy (LIBS), combined with modern machine learning tools, has emerged as a powerful technique for metal material identification, leveraging its high sensitivity and rapid response. However, the current spectral data analysis methods typically involve a two-step process of dimensionality reduction and model learning, lacking seamless integration. In this study, we address this issue by investigating a discriminative learning approach based on LIBS, utilizing the Discriminative Restricted Boltzmann Machine (DRBM). We apply LIBS technology in conjunction with DRBM for spectral feature selection and classification of five distinct small-sample aluminum alloy samples. The learned spectral latent distribution from the generative model component of DRBM effectively regularizes the discriminative process, thereby overcoming the problem of training overfitting arising from the high-dimensional small-sample limitation. This results in a stable and generalizable qualitative analysis model independent of empirical knowledge. 
The approach presented in this study achieves a 100 % accuracy, surpassing the best-performing traditional machine learning method (PCA-RF) by 13.33 % in accuracy and demonstrating a similar improvement compared to a Backpropagation Neural Network (BPNN) with the same structure.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"258 ","pages":"Article 105342"},"PeriodicalIF":3.7,"publicationDate":"2025-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143349115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A latent space-based multivariate capability index: A new paradigm for raw material supplier selection in industry 4.0","authors":"Joan Borràs-Ferrís , Carl Duchesne , Alberto Ferrer","doi":"10.1016/j.chemolab.2025.105339","DOIUrl":"10.1016/j.chemolab.2025.105339","url":null,"abstract":"<div><div>We present a novel Latent Space-based Multivariate Capability Index (<em>LSb-MC</em><sub><em>pk</em></sub>) aligned with the Quality by Design initiative and used as a criterion for ranking and selecting suppliers for a particular raw material used in a manufacturing process. The novelty of this new index is that, contrary to other multivariate capability indexes that are defined either in the raw material space or in the Critical Quality Attributes (CQAs) space of the product manufactured, this new <em>LSb-MC</em><sub><em>pk</em></sub> is defined in the latent space connecting both spaces. This endows the new index with a clear advantage over classical ones as it quantifies the capacity of each raw material supplier of providing assurance of quality with a certain confidence level for the CQAs of the manufactured product before manufacturing a single unit of the product. All we need is a rich database with historical information of several raw material properties along with the CQAs. Besides, we present a novel methodology to carry out the diagnosis for assignable causes when a supplier does not score a good capability index. 
The proposed <em>LSb-MC</em><sub><em>pk</em></sub> is based on Partial Least Squares (PLS) regression, and it is illustrated using data from both an industrial and a simulation study.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"258 ","pages":"Article 105339"},"PeriodicalIF":3.7,"publicationDate":"2025-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143386824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimising thermal performance of water-based hybrid nanofluids with magnetic and radiative effects over a spinning disc","authors":"Maddina Dinesh Kumar, Dharmaiah Gurram, Se-Jin Yook, C.S.K. Raju, Nehad Ali Shah","doi":"10.1016/j.chemolab.2025.105336","DOIUrl":"10.1016/j.chemolab.2025.105336","url":null,"abstract":"<div><h3>Research background and significance</h3><div>Hybrid nanofluids have garnered significant attention because of their capacity to enhance heat transmission in a range of technical applications; optimising their thermal performance is crucial for improving the efficiency of cooling systems, energy storage devices, and heat exchangers with rotating surfaces.</div></div><div><h3>Present study novelty and methodology</h3><div>The present study investigates heat, velocity and mass diffusion under the Rosseland approximation and magnetic effects for a ternary hybrid nanofluid, a mixture of more than two constituents in a base fluid, flowing over a spinning disc surface; the ternary hybrid nanofluid is used to speed up the heat transmission rate. The non-linear PDEs are converted to ODEs: the dimensional governing equations are made dimensionless using similarity transformations, and MATLAB's built-in bvp5c solver is then used for the numerical computation. The response surface method (RSM) with a quadratic regression model has been employed to study the impacts of independent parameters on physical parameters; surface plots are drawn using Python.</div></div><div><h3>Quantitative evaluation</h3><div>The RSM quadratic regression model <span><math><mrow><mo>(</mo><mrow><msup><mi>R</mi><mn>2</mn></msup><mo>=</mo><mn>99.51</mn><mo>%</mo></mrow><mo>)</mo></mrow></math></span> shows a good fit. 
Case 1 exhibits higher <span><math><mrow><msub><mi>C</mi><mi>f</mi></msub></mrow></math></span>, <span><math><mrow><mi>S</mi><mi>h</mi></mrow></math></span> and <span><math><mrow><mi>N</mi><mi>u</mi><mi>s</mi></mrow></math></span> transmission rates than case 2.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"258 ","pages":"Article 105336"},"PeriodicalIF":3.7,"publicationDate":"2025-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143183730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new class of unit models with a quantile regression approach applied to contamination data","authors":"Karol I. Santoro , Yolanda M. Gómez , Héctor J. Gómez , Diego I. Gallardo","doi":"10.1016/j.chemolab.2025.105322","DOIUrl":"10.1016/j.chemolab.2025.105322","url":null,"abstract":"<div><div>In this paper, we introduce a new class of unit models defined on the open unit interval. Through the reparameterization of the model, the location parameter can be interpreted as a quantile of the distribution. Furthermore, we can assess the impact of explanatory variables within the conditional quantiles of the dependent variable, offering an alternative to the Kumaraswamy quantile regression model. We engage in quantile regression and apply it to two instances of environmental data. We evaluate the effectiveness of the newly introduced models in scenarios both with and without covariates, drawing comparisons with results yielded by the Kumaraswamy regression model. The proposed method has been implemented in an R package.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"258 ","pages":"Article 105322"},"PeriodicalIF":3.7,"publicationDate":"2025-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143349659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design of Poly(lactic-co-glycolic acid) nanoparticles in drug delivery by artificial intelligence methods to find the conditions of nanoparticles synthesis","authors":"Bader Huwaimel, Saad Alqarni","doi":"10.1016/j.chemolab.2025.105335","DOIUrl":"10.1016/j.chemolab.2025.105335","url":null,"abstract":"<div><div>Poly(lactic-co-glycolic acid) (PLGA) is one of the most commonly used polymers for drug delivery owing to its biodegradability. Producing PLGA particles at the nanoscale is of great importance for exploiting the properties of this polymer in nano-based drug delivery. This work explores machine learning methods for two PLGA regression tasks in the synthesis process: prediction of particle size (nm) and of Zeta potential (mV). Using a comprehensive dataset with categorical inputs (PLGA type and anti-solvent type) and numerical inputs (PLGA concentration and anti-solvent concentration), the research applies Isolation Forest for outlier detection, and Min-Max Normalization and One-Hot Encoding for preprocessing. Several regression models, including LASSO, Polynomial Regression (PR), and Support Vector Regression (SVR), were combined with Bagging Ensemble methods for enhanced predictive performance. Glowworm Swarm Optimization (GSO) was applied for hyperparameter tuning. The results indicate that BAG-SVR attained the highest test R<sup>2</sup> of 0.9422 for particle size prediction. 
For Zeta potential prediction, BAG-PR outperformed other models, achieving a test R<sup>2</sup> score of 0.98881.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"258 ","pages":"Article 105335"},"PeriodicalIF":3.7,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143183733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic spectral fitting for LIBS and Raman spectra by boosted deconvolution method","authors":"M.A. Meneses-Nava","doi":"10.1016/j.chemolab.2025.105334","DOIUrl":"10.1016/j.chemolab.2025.105334","url":null,"abstract":"<div><div>This study introduces a spectral analysis method known as Boosted Deconvolution Fitting (BDF) to process spectroscopic data. The BDF method enhances spectral resolution and precisely adjusts spectra by integrating boosted deconvolution for determining band profile parameters, and a multicomponent analysis technique for minor adjustments in band intensity. This technique seeks to address the shortcomings of conventional methods like the Levenberg-Marquardt algorithm (LMA), especially in terms of improving spectral resolution, accurately determining parameters of overlapping bands, and reducing sensitivity to initial conditions. The efficacy of the BDF method is affected by various factors, including the chosen band profile type (Gaussian or Lorentzian), the signal-to-noise ratio (SNR) of the dataset, and the separation and relative intensities of the spectral bands.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"258 ","pages":"Article 105334"},"PeriodicalIF":3.7,"publicationDate":"2025-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143183653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reconstructing spectral shapes with GAN models: A data-driven approach for high-resolution spectra from low-resolution spectrometers","authors":"Min-Hsu Tai, Cheng-Che Hsu","doi":"10.1016/j.chemolab.2025.105333","DOIUrl":"10.1016/j.chemolab.2025.105333","url":null,"abstract":"<div><div>This study presents the development of a generative adversarial network (GAN) to generate high-resolution (HR) spectra from low-resolution (LR) spectra. Plasma emissions from the second positive system of nitrogen are used for demonstration. Specair™ is used to generate pairs of HR and LR spectra as training data, covering rotational temperatures (T<sub>rot</sub>) from 300 to 1200 K and vibrational temperatures (T<sub>vib</sub>) from 2000 to 6500 K. Optical emission spectra from low-pressure and atmospheric-pressure plasmas are used as testing data to show the feasibility of the model for generating HR spectra from spectra acquired using LR spectrometers. Feature matching is used during the training stage to tackle instability issues. The distributions of the discriminator scores are used as an initial criterion to monitor the training procedure. The results show a weighted coefficient of determination (<span><math><mrow><msup><mover><mi>R</mi><mo>‾</mo></mover><mn>2</mn></msup></mrow></math></span>) greater than 0.9999 between the simulated and generated HR spectra. The fitting errors for T<sub>rot</sub> and T<sub>vib</sub> between generated HR spectra and experimental HR spectra acquired from an HR spectrometer are mostly below 5 %. 
The results indicate that this GAN serves as an efficient approach to obtain HR spectra when HR spectrometers are not available.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"258 ","pages":"Article 105333"},"PeriodicalIF":3.7,"publicationDate":"2025-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143183721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}