Dongyin Yang , Qing Tao , Ziqian Wang , Yuanhui Li , Xiaorong Luo , Xinhao Wan , Mengxin Huang , Xiang Wang , Xuecheng Wang , Zhenfeng Wu
{"title":"Data fusion strategy for rapid prediction of critical quality attributes in JianWeiXiaoShi extract during pulsed vacuum drying process based on FT-NIR and Vis/NIR-HSI","authors":"Dongyin Yang , Qing Tao , Ziqian Wang , Yuanhui Li , Xiaorong Luo , Xinhao Wan , Mengxin Huang , Xiang Wang , Xuecheng Wang , Zhenfeng Wu","doi":"10.1016/j.chemolab.2025.105451","DOIUrl":"10.1016/j.chemolab.2025.105451","url":null,"abstract":"<div><div>This study explored the feasibility of using two optical sensing methods - Fourier Transform Near-Infrared Spectroscopy (FT-NIR) and Visible/Near-Infrared Hyperspectral Imaging (Vis/NIR-HSI) - to quantitatively predict the critical quality attributes (CQAs) of JianWeiXiaoShi extract during the pulsed vacuum drying (PVD) process. Additionally, a data fusion strategy was implemented to integrate the two spectral datasets, aiming to enhance the prediction accuracy and robustness of the quantitative models. Comparative analysis revealed that the FT-NIR model demonstrated higher accuracy in predicting moisture content, narirutin, and hesperidin levels, while the Vis/NIR-HSI model performed better in predicting color changes during the drying process of the extract. In addition to moisture content, the prediction model established by integrating the two spectral datasets through the data fusion strategy demonstrated more accurate predictive performance compared to single-spectrum models. Therefore, integrating FT-NIR and Vis/NIR-HSI spectral datasets through the data fusion strategy for online monitoring of quality changes during PVD of the extract represents a rapid, non-destructive, and accurate approach to predicting CQAs of materials. 
This study also provides essential technical support and valuable insights for advancing non-destructive analytical technologies in drying processes.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"264 ","pages":"Article 105451"},"PeriodicalIF":3.7,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144222922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Samuel Verdú , Samuel Furones , Raúl Grau , José M. Barat , Alberto Ferrer , J.M. Prats-Montalbán
{"title":"A non-contact methodology based on imaging analysis, chemometrics, and machine learning to predict the lethality of stressors on C. elegans populations in liquid culture","authors":"Samuel Verdú , Samuel Furones , Raúl Grau , José M. Barat , Alberto Ferrer , J.M. Prats-Montalbán","doi":"10.1016/j.chemolab.2025.105450","DOIUrl":"10.1016/j.chemolab.2025.105450","url":null,"abstract":"<div><div>This work developed an objective, reproducible and non-destructive methodology to predict the lethality of stressors on <em>C. elegans</em> populations in liquid culture media. It addresses, from a numerical point of view, the difficulties that such media pose for imaging analysis by applying chemometric and machine learning procedures to imaging data obtained with a basic imaging device and basic processing. The experiment was carried out by recording videos of nematode populations exposed to different conditions of three stressors (hydrogen peroxide, heat and UV radiation). The processed video datasets were used as predictors in different configurations of regression methods. The dimensionality reduction approach improved the prediction capacity of the imaging information compared to the raw dataset. 
Moreover, the best result was achieved with a super learner model, demonstrating the synergistic effect of combining results from models with lower prediction capacity to develop a meta-model with high prediction capabilities.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"264 ","pages":"Article 105450"},"PeriodicalIF":3.7,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144204724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gaoyong Shi , Ruifang Yang , Nanjing Zhao , Gaofang Yin , Wenqing Liu
{"title":"Multifactorial analysis of fluorescence detection for soil total petroleum hydrocarbons using random forest and multiple linear regression","authors":"Gaoyong Shi , Ruifang Yang , Nanjing Zhao , Gaofang Yin , Wenqing Liu","doi":"10.1016/j.chemolab.2025.105444","DOIUrl":"10.1016/j.chemolab.2025.105444","url":null,"abstract":"<div><div>This study combined random forest (RF) and multiple linear regression (MLR) approaches to analyze the influence of various factors on the fluorescence detection of total petroleum hydrocarbons (TPH) in soil. We considered the effects of soil moisture, organic matter, and minerals, and tested samples of three common soil types and varying concentrations of soil petroleum hydrocarbons using a self-developed fluorescence imaging technology. The fluorescence signals are greatly influenced by moisture, organic matter, and minerals, exhibiting distinct effects depending on the soil types and hydrocarbon concentrations. The RF model improves accuracy and consistency by constructing decision trees, making it appropriate for non-linear and high-dimensional data scenarios, although it underperformed in our study. The MLR model provides a comprehensive understanding of the linear relationships between variables, displaying better statistical performance and consistency in most cases of our experiment, with a coefficient of determination (R<sup>2</sup>) above 0.8, and Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE) all lower than those of the RF. 
Our research provides an important scientific basis for monitoring, evaluating, and managing soil petroleum hydrocarbon pollution, aiding in the formulation of effective soil pollution prevention strategies, and offers a foundation for further research into environmental risk assessment and soil remediation.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"264 ","pages":"Article 105444"},"PeriodicalIF":3.7,"publicationDate":"2025-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144212466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chemometric analysis of UV–visible spectral data for the differentiation of Dalbergia latifolia and Dalbergia sissoo woods","authors":"Rohit Sharma, Rakesh Kumar","doi":"10.1016/j.chemolab.2025.105448","DOIUrl":"10.1016/j.chemolab.2025.105448","url":null,"abstract":"<div><div><strong><em>Dalbergia latifolia</em></strong> and <strong><em>Dalbergia sissoo</em></strong> woods are economically valuable due to their high-quality timber. However, the overexploitation of <strong><em>D. latifolia</em></strong> has led to the inclusion of <strong><em>D. sissoo</em></strong> along with <strong><em>D. latifolia</em></strong> in the <strong>CITES</strong> (Convention on International Trade in Endangered Species of Wild Fauna and Flora) <strong>list,</strong> which mandates regulated trade. Traditional wood identification methods, such as anatomical analysis, often fail to distinguish between these species. This study investigates the use of UV–visible spectroscopy combined with chemometric techniques - specifically principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA) and linear discriminant analysis (LDA) - for the rapid and accurate differentiation of these two species. UV–visible spectral analysis of methanol extracts revealed distinct absorption peaks that facilitated the differentiation. The PCA, PLS-DA and LDA models demonstrated the effectiveness of this approach in distinguishing the two species' woods. This method offers a promising alternative for differentiating these <em>Dalbergia</em> species, providing a balance between speed, cost, and reliability. It is particularly valuable in situations where DNA barcoding or other high-precision techniques are impractical. 
The findings highlight the potential of UV–visible spectroscopy combined with multivariate analysis for timber differentiation and trade monitoring, contributing to conservation efforts.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"264 ","pages":"Article 105448"},"PeriodicalIF":3.7,"publicationDate":"2025-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144204725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Editorial of the XI Colloquium Chemiometricum Mediteraneum (CCM2023)","authors":"D. Ballabio, P. Facco, F. Marini","doi":"10.1016/j.chemolab.2025.105447","DOIUrl":"10.1016/j.chemolab.2025.105447","url":null,"abstract":"","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"263 ","pages":"Article 105447"},"PeriodicalIF":3.7,"publicationDate":"2025-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144212361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Camilla Menozzi , José Manuel Prats-Montalbán , Rosalba Calvini , Alessandro Ulrici
{"title":"Comparison of colour and texture feature extraction methods to predict anthocyanins content in Sangiovese grapes","authors":"Camilla Menozzi , José Manuel Prats-Montalbán , Rosalba Calvini , Alessandro Ulrici","doi":"10.1016/j.chemolab.2025.105446","DOIUrl":"10.1016/j.chemolab.2025.105446","url":null,"abstract":"<div><div>Colour and texture are the two main sources of information contained in RGB images of food products. Different image-level approaches are available to analyse the image properties based on the extraction of colour and texture features, and the selection of the most appropriate method is a critical point, since it could significantly impact the outcomes. The present study has three main objectives. Firstly, we propose an innovative data dimensionality reduction method to extract and codify the texture features of an RGB image into a one-dimensional signal, named texturegram (TXG). Then, the TXG approach is compared with different image-level feature extraction methods, such as colourgrams (CLG), Soft Colour Texture Descriptors (SCTD) and Grey Level Co-occurrence Matrices (GLCM). These techniques were used to analyse a benchmark dataset of RGB images already considered in a previous study to build Partial Least Squares (PLS) models and relate the image features to the anthocyanin content of red grape samples. We also investigated the possible advantages of combining the colour and texture information brought by the different image-level techniques using data fusion. PLS models were calculated considering different partitions of the RGB image dataset into training and test sets. The performances of the different models were statistically evaluated by means of Analysis of Variance (ANOVA) and Principal Component Analysis (PCA). 
Overall, the results suggested an interesting, even if slight, improvement of the model performances when fusing CLG and TXG, but also highlighted the hybrid nature of TXG to simultaneously explore colour and texture properties.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"263 ","pages":"Article 105446"},"PeriodicalIF":3.7,"publicationDate":"2025-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144170238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applications of artificial intelligence and machine learning in combination with surface-enhanced Raman spectroscopy (SERS)","authors":"Hashim Jabbar , Inass Abdulah Zgair , Kamran Heydaryan , Shaymaa Awad Kadhim , Saeideh Mehmandoust , Vahid Eskandari , Hossein Sahbafar","doi":"10.1016/j.chemolab.2025.105445","DOIUrl":"10.1016/j.chemolab.2025.105445","url":null,"abstract":"<div><div>Surface-enhanced Raman spectroscopy (SERS) offers exceptional sensitivity for identifying and detecting a wide range of compounds by greatly enhancing Raman signals from molecules on metal surfaces. The application of SERS has been further transformed by the integration of artificial intelligence (AI) and machine learning (ML), which automates spectrum interpretation, enhances identification accuracy, and optimizes experimental settings. This paper examines current developments in the synergistic use of AI and ML with SERS in a variety of disciplines, including environmental monitoring, food safety, pathogen detection, and disease diagnosis. Published studies show that these models can distinguish between analytes such as bacteria, viruses, cancer cells, and chemical substances with accuracy above 95 %. Some studies even reported 100 % accuracy in sample identification, which illustrates the promise of these methods. Food safety, environmental monitoring, and clinical diagnostics might all be revolutionized by SERS-ML techniques because of their great sensitivity, specificity, and reliability. Future research should focus on extending clinical applications, improving substrate capabilities and detection limits, incorporating sophisticated machine learning techniques, and broadening the range of applications. In order to improve the robustness and practicality of these methodologies, further validation in larger cohorts and real-world contexts is also emphasized. 
The study demonstrates how combining AI/ML with SERS offers the potential to fundamentally change the fields of materials research, environmental monitoring, diagnostics, and other related fields.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"263 ","pages":"Article 105445"},"PeriodicalIF":3.7,"publicationDate":"2025-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144185730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SMILES-driven machine learning for high-throughput investigation of anti-corrosion materials","authors":"Muhamad Akrom , Harun Al Azies , Wise Herowati , Totok Sutojo , Supriadi Rustad , Hermawan Kresno Dipojono , Hideaki Kasai","doi":"10.1016/j.chemolab.2025.105441","DOIUrl":"10.1016/j.chemolab.2025.105441","url":null,"abstract":"<div><div>This investigation examines the viability of a machine learning (ML) approach that uses the simplified molecular input line entry system (SMILES) representation as the sole input feature for predicting the corrosion inhibition efficiency (CIE) of pyridine-quinoline compounds, replacing various quantum chemical properties (QCP). Employing the molecular access system (MACCS) fingerprint technique simplifies the processing of molecular structures, enhancing data efficiency. The ML algorithm, notably the gradient boosting (GB) model, showcases superior predictive capabilities, as evidenced by R<sup>2</sup> and RMSE values of 0.92 and 0.07, respectively. This outcome is akin to predictions employing 20 QCP features, yielding R<sup>2</sup> and RMSE values of 0.90 and 0.08, respectively. The study substantiates SMILES as a robust single feature for accurate CIE prediction, revealing a moderate correlation between SMILES-represented structures and CIE values. This underscores the effectiveness of SMILES-based ML in assessing corrosion inhibition potential, thereby advancing predictive modeling in corrosion science. 
Integrating machine learning and SMILES notation presents an efficient approach for evaluating the corrosion inhibition capacity of diverse molecular structures.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"263 ","pages":"Article 105441"},"PeriodicalIF":3.7,"publicationDate":"2025-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144137760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hamza Moussa , Farid Dahmoune , Yassine Noui , Amal Mameri , Hichem Tahraoui , Salma Menasria , Souhila Abbas , Sarah Hamid , Nourelimane Benzitoune , Abdeltif Amrane
{"title":"Optimizing dehydrated soups with advanced algorithms: A novel application of D-Optimal mixture design and NSGA-II for healthier formulations","authors":"Hamza Moussa , Farid Dahmoune , Yassine Noui , Amal Mameri , Hichem Tahraoui , Salma Menasria , Souhila Abbas , Sarah Hamid , Nourelimane Benzitoune , Abdeltif Amrane","doi":"10.1016/j.chemolab.2025.105443","DOIUrl":"10.1016/j.chemolab.2025.105443","url":null,"abstract":"<div><div>This study explores the optimization of dehydrated soup formulations to enhance their total phenolic content (TPC) and total flavonoid content (TFC) using a D-optimal mixture design and the NSGA-II (Non-dominated Sorting Genetic Algorithm II). A D-optimal mixture design with 15 experimental runs was employed to study the effect of the selected ingredients, highlighting their significance for TPC and TFC. The NSGA-II algorithm generated 19 optimal solutions for maximizing TPC and TFC. For the best solution, the model predicted a TPC of 8.15 mg <sub>GAE</sub>/g <sub>dw</sub> and a TFC of 1.30 mg <sub>QE</sub>/g <sub>dw</sub>; experimental validation gave values of 10.29 mg <sub>GAE</sub>/g <sub>dw</sub> and 1.38 mg <sub>QE</sub>/g <sub>dw</sub>, confirming the model's accuracy. The optimized soup comprised 47.70 % broccoli, 20 % celery, 20 % onion, 10 % vegetable mix, and 2.29 % salt. It exhibited significantly higher TPC and better antioxidant activity, as measured by DPPH<sup>•</sup> and ABTS<sup>•+</sup> assays. Molecular docking identified bioactive compounds with strong binding affinities to KEAP1 and BCL2 proteins, suggesting potential therapeutic applications for oxidative stress. 
A MATLAB interface was developed to facilitate practical application in the food industry, demonstrating the effective use of optimization techniques to create high-quality, enriched dehydrated soups and providing a foundation for future research in food formulation.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"263 ","pages":"Article 105443"},"PeriodicalIF":3.7,"publicationDate":"2025-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144170239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhiyi Ji, Chunhua Yang, Jingxiu He, Yonggang Li, Dong Li
{"title":"Deep matrix factorization considering dynamic constraints to complete missing data of complex industrial processes","authors":"Zhiyi Ji, Chunhua Yang, Jingxiu He, Yonggang Li, Dong Li","doi":"10.1016/j.chemolab.2025.105433","DOIUrl":"10.1016/j.chemolab.2025.105433","url":null,"abstract":"<div><div>In complex industrial processes, data loss is an unavoidable issue. Due to the lengthy process flow and complex reaction mechanisms, traditional data completion methods fail to deliver satisfactory results when data loss occurs. To address this challenge, this paper proposes deep matrix factorization considering dynamic constraints (DMFDC). This algorithm combines traditional matrix factorization with artificial neural networks, leveraging the strengths of neural networks to approximate nonlinear mappings in latent variable models and utilizing all available information to minimize discrepancies between raw and generated data. Additionally, DMFDC accounts for the dynamic characteristics of complex industrial systems, employing differential operations to transform irregularly changing industrial data into a more stable sequence, thereby enabling the model to better capture data evolution patterns. This approach allows DMFDC to address the issue of missing dynamic data in complex industrial processes and to predict missing values more accurately. To evaluate its effectiveness, we conducted case studies under various missing data conditions based on a digestion dataset collected from actual alumina production sites. 
The results indicate that DMFDC achieves higher data completion accuracy than other methods, confirming the applicability of our approach in diverse situations involving missing data.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"263 ","pages":"Article 105433"},"PeriodicalIF":3.7,"publicationDate":"2025-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144131160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}