Chemometrics and Intelligent Laboratory Systems: Latest Articles

Full-spectrum LIBS quantitative analysis based on heterogeneous ensemble learning model
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems Pub Date : 2025-01-13 DOI: 10.1016/j.chemolab.2025.105321
Xinyue Fang, Haoyang Yu, Qian Huang, Zhaohui Jiang, Dong Pan, Weihua Gui
Abstract: Laser-induced breakdown spectroscopy (LIBS) technology is widely used in fields such as analytical chemistry, materials science, and environmental monitoring. Modeling the quantitative relationship between component contents and spectral data is a key step in LIBS analysis. However, traditional regression methods commonly use a single regression model, which struggles to use the information in the spectra comprehensively and reasonably, resulting in limitations in full-spectrum multicomponent regression. This paper proposes a heterogeneous ensemble learning (HEL) model, selecting four heterogeneous sub-models (CNN, Lasso, Boosting, and FNN) for full-spectrum LIBS quantitative regression analysis. HEL can fully leverage the strengths of the different models by using a Bayesian weighting strategy, thereby improving the performance of LIBS quantitative analysis. Experimental results show that the proposed HEL regression model has better accuracy and stability than existing models.
Volume 257, Article 105321
Citations: 0
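The abstract above does not say how the Bayesian weighting is computed, so the following is only a minimal sketch of one common option: Bayesian-model-averaging style weights under a uniform prior and Gaussian noise. The function names `bma_weights` and `ensemble_predict`, the noise scale `sigma`, and the residual values are all hypothetical and are not taken from the paper.

```python
import math


def bma_weights(residuals_per_model, sigma=1.0):
    """Posterior model weights, assuming a uniform prior and Gaussian noise.

    Every model must be scored on the same validation samples, so the
    constant terms of the Gaussian log-likelihood cancel out.
    """
    log_liks = [
        -sum(r * r for r in residuals) / (2.0 * sigma ** 2)
        for residuals in residuals_per_model
    ]
    top = max(log_liks)  # subtract the maximum for numerical stability
    unnormalised = [math.exp(ll - top) for ll in log_liks]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def ensemble_predict(predictions, weights):
    """Weighted average of the sub-model predictions for one sample."""
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical validation residuals for two sub-models.
weights = bma_weights([[0.1, -0.1], [0.5, 0.5]], sigma=0.5)
combined = ensemble_predict([10.0, 12.0], weights)
```

The sub-model with the smaller validation residuals receives the larger weight, so the combined prediction lies closer to its output.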
Optimum RBM encoded SVM model with ensemble feature Extractor-based plant disease prediction
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems Pub Date : 2025-01-11 DOI: 10.1016/j.chemolab.2025.105319
Piyush Sharma, Devi Prasad Sharma, Sulabh Bansal
Abstract: In agricultural technology, accurate and speedy plant disease identification is essential to maintain optimum crop quality and output. This research proposes a system that can automatically diagnose diseases in apple fruit and apple trees using machine learning (ML) image processing. It offers a novel approach for accurate plant disease prediction by combining an Ensemble Feature Extractor with an Optimum Restricted Boltzmann Machine (RBM) Encoded Support Vector Machine (SVM) model. The model uses RBM-encoded features and SVM classification, enhanced by several feature extraction techniques. Experiments on the PDD271 dataset, with 220,592 images and 271 categories, demonstrate the model's outstanding classification performance, stressing its potential to advance agricultural technology and enable early disease diagnosis for better crop management. The reported accuracy, precision, recall, and F1 score are 98 %, 98 %, 89.7 %, and 97.8 %, respectively.
Volume 258, Article 105319
Citations: 0
Fused LassoNet: Sequential feature selection for spectral data with neural networks
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems Pub Date : 2025-01-11 DOI: 10.1016/j.chemolab.2024.105315
Chaeyun Yeo, Namjoon Suh, Younghoon Kim
Abstract: Feature selection for high-dimensional spectral data is critical to improve the accuracy and interpretability of chemometric models. Various methods for feature selection have been introduced in chemometrics; however, achieving explainable sequential feature selection while conducting nonlinear classification simultaneously remains challenging. To address the challenge, this study proposes a fused least absolute shrinkage and selection operator network (LassoNet) that integrates the regularization principles of both the LassoNet and fused Lasso within the framework of a neural network. The fused Lasso method facilitates continuous feature selection by considering the sequence between features, whereas the LassoNet method enables nonlinear modeling using neural networks. We solve the fused LassoNet problem with proximal gradient descent, and the optimality of the proximal operator is mathematically proved. This study analyzes the performances of Lasso, fused Lasso, LassoNet, and fused LassoNet in classifying two groups using nine spectral datasets. The fused LassoNet demonstrates superior performance in terms of classification accuracy and sequential feature selection. These results demonstrate that the proposed method enhances the predictive accuracy and interpretability of chemometric models using spectral data.
Volume 257, Article 105315
Citations: 0
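To illustrate why a fused penalty favours contiguous wavelength bands, here is a minimal sketch of the classical fused Lasso penalty only. It is not the paper's fused LassoNet objective or its proximal operator; the function name, the two regularization strengths, and the example coefficient vectors are assumptions made for illustration.

```python
def fused_lasso_penalty(weights, lam_sparse, lam_fuse):
    """Classical fused Lasso penalty on an ordered coefficient vector.

    lam_sparse scales the L1 norm of the coefficients (sparsity), and
    lam_fuse scales the L1 norm of differences between neighbours
    (smoothness along the feature order, e.g. adjacent wavelengths).
    """
    sparsity = sum(abs(w) for w in weights)
    fusion = sum(abs(b - a) for a, b in zip(weights, weights[1:]))
    return lam_sparse * sparsity + lam_fuse * fusion


# Three selected features in one contiguous band versus scattered.
contiguous = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0]
scattered = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]

penalty_contiguous = fused_lasso_penalty(contiguous, 1.0, 1.0)
penalty_scattered = fused_lasso_penalty(scattered, 1.0, 1.0)
```

Both vectors select three features with the same magnitudes, but the scattered one pays a larger fusion term, which is the mechanism that pushes the selection toward sequential blocks.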
Advanced feature analysis for enhancing cocrystal prediction
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems Pub Date : 2025-01-10 DOI: 10.1016/j.chemolab.2025.105318
Alessandro Cossard, Chiara Sabena, Gianluca Bianchini, Emanuele Priola, Roberto Gobetto, Andrea Aramini, Michele R. Chierotti
Abstract: The design of novel pharmaceutical crystal forms, including molecular salts and cocrystals, has gained significant attention from pharmaceutical companies due to their ability to modulate key physicochemical and biopharmaceutical properties. The selection of appropriate coformers for cocrystallization, however, remains a challenge, typically relying on labor-intensive trial-and-error methods. This study introduces FeatureMaster, a tool designed to evaluate the representativeness of training sets relative to test sets, thereby enhancing the reliability of machine learning models in predicting cocrystallization outcomes. We employed four key algorithms (feature overlap, quartiles, Cohen's D, and p-value analysis) to a priori assess the predictive accuracy. The efficacy of these methods was evaluated on two systems: piracetam (PRC) and pyridoxine (PN). The test set data were collected from in-house experiments: the PRC and PN test sets were experimentally created with a series of coformers (20 for PRC and 14 for PN) using different synthetic techniques. The experimental tests led to the formation of 3 new cocrystals for PRC (with quercetin, 2-ketoglutaric acid, and malic acid) and 7 new molecular salts for PN (with 2-ketoglutaric acid, pimelic acid, cinnamic acid, gallic acid, N-acetylcysteine, and caffeic acid). Training sets were collected from literature and features calculated using Hansen Solubility Parameters (HSP), Hydrogen Bond Energy (HBE), Molecular Complementarity (MC), and Quantitative Structure-Activity Relationship (QSAR) methods. Models were developed using the Random Forest algorithm, known for its robustness in handling complex datasets. Our results demonstrate that statistical analyses using overlap, Cohen's D and p-values are fundamental for improving the prediction and for providing a priori insights into the model's reliability. This approach reduces the experimental tests and resource consumption in the cocrystal screening process, offering a promising strategy for future pharmaceutical development.
Volume 257, Article 105318 (open access)
Citations: 0
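As a rough illustration of one of the statistics named in the abstract above, the sketch below computes Cohen's d between the values of a single feature in a training set and a test set, using the standard pooled-standard-deviation definition. How FeatureMaster actually computes or thresholds this quantity is not stated in the abstract; the function name and the feature values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: mean difference divided by the pooled sample std. dev."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical values of one descriptor in the training and test sets.
train_feature = [1.0, 2.0, 3.0, 4.0, 5.0]
test_feature = [2.0, 3.0, 4.0, 5.0, 6.0]

effect_size = cohens_d(train_feature, test_feature)
```

An effect size near zero suggests the two sets have similar distributions for that feature, while a large absolute value flags a feature on which the training set may not represent the test set well.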
GCNFG-DTA: Screening natural medicinal components of Cyperus esculentus targeting kinases with AIDD methods
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems Pub Date : 2025-01-04 DOI: 10.1016/j.chemolab.2025.105317
Haiqing Sun, Xuecong Tian, Zhuman Wen, Sizhe Zhang, Yaxuan Yang, Yixian Tu, Xiaoyi Lv
Abstract: Screening bioactive molecules from natural plant compounds is currently a common approach in the field of drug discovery. Cyperus esculentus, a multipurpose crop primarily used for food, is highly valued in certain countries or regions for its unique medicinal properties. Although there is a foundational understanding of its components and pharmacological effects, exploration of its effective targets, especially kinase targets, remains insufficient. Our study integrates Artificial Intelligence-Assisted Drug Design (AIDD) by utilizing the KIBA and BindingDB datasets to train the GCNFG-DTA deep learning model for predicting the kinase target affinity of 152 active compounds from Cyperus esculentus. By screening for high-affinity molecule-kinase target pairs and employing molecular docking and molecular dynamics simulations, the study successfully identified pairs of the most promising active molecule-target combinations. Our prediction results demonstrate that the GCN-GAT-FG model, with its excellent predictive ability (achieving a low MSE of 0.131 and a high CI of 0.896), significantly accelerates the discovery process of bioactive molecules. Further molecular docking validated that 15 high-affinity molecule-kinase target pairs had docking energy scores below −5 kJ/mol. Among these, 14 pairs exhibited stable conformations during 100 ns molecular dynamics simulations. Notably, Cyanidin chloride, N-Feruloyltyramine, and Imbricatonol were identified as the most promising molecules, demonstrating high conformational stability when targeting the MAP3K8, CLK4 and FGR kinase targets, respectively. These findings provide a scientific basis for further exploring the medicinal potential of Cyperus esculentus. Overall, the deep learning method used in our study offers new insights into the field of drug discovery related to natural compounds by rapidly and effectively predicting the specific medicinal value components of Cyperus esculentus.
Volume 257, Article 105317
Citations: 0
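The abstract above reports MSE and CI. Assuming CI denotes the concordance index commonly used for drug-target affinity regression (the abstract does not expand the abbreviation), both metrics can be sketched as follows. The affinity values are invented for illustration and have no connection to the paper's data.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the truth.

    A pair is comparable when the true values differ; ties in the
    predictions count as half-correct.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(y_true)):
        for j in range(len(y_true)):
            if y_true[i] > y_true[j]:
                comparable += 1
                if y_pred[i] > y_pred[j]:
                    concordant += 1.0
                elif y_pred[i] == y_pred[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs: all true values are equal")
    return concordant / comparable


# Hypothetical measured and predicted affinities.
measured = [5.0, 7.0, 6.0, 8.0]
predicted = [5.2, 6.8, 7.0, 8.1]

mse = mean_squared_error(measured, predicted)
ci = concordance_index(measured, predicted)
```

A CI of 1.0 means every pair is ranked correctly and 0.5 corresponds to random ordering, which is why a high CI together with a low MSE is read as good ranking and good calibration.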
Smartphone based app development with machine learning using Hibiscus sabdariffa L. extract for pH estimation
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems Pub Date : 2025-01-02 DOI: 10.1016/j.chemolab.2024.105310
Ömer Faruk Aydın, Merve Aydın, Melisa Caliskan Demir, Sibel Kahraman
Abstract: This study presents a novel approach for pH estimation in buffer solutions using images of solutions prepared with Hibiscus sabdariffa L. as a natural pH indicator. The images of the solutions, each displaying distinctive colours indicative of their pH levels, were transformed into standardized 200×200-pixel images through the application of image processing techniques. Following this, a pH prediction model was constructed using the Adaptive Boosting regressor algorithm. The pH values of the training data were distributed irregularly between 0 and 14. The models were trained with 94 pictures and 1880 experimental values. In addition, a reliable pre-processing step based on image processing techniques was built into the model, allowing test data to be obtained in any desired environment. The training and test data were separated from noise parameters that negatively affect the prediction results. A smartphone application based on the model has been developed and made available to everyone. This innovative methodology bridges the gap between traditional pH measurement techniques and computer vision, offering a more accessible and eco-friendly means of pH assessment. The practical applications of this research extend to various fields, including environmental monitoring, agriculture, and educational settings.
Volume 257, Article 105310
Citations: 0
UV–Vis spectralprint-based discrimination and quantification of sugar syrup adulteration in honey using the Successive Projections Algorithm (SPA) for variable selection
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems Pub Date : 2024-12-26 DOI: 10.1016/j.chemolab.2024.105314
Luana Leal de Souza, Dâmaris Naara Chaves Candeias, Edilene Dantas Telles Moreira, Paulo Henrique Gonçalves Dias Diniz, Valeria Haydée Springer, David Douglas de Sousa Fernandes
Abstract: This work developed, for the first time, an improved analytical strategy for discriminating and quantifying honey adulteration by adding corn and agave syrups using the Successive Projections Algorithm (SPA) for variable selection in UV–Vis spectral analysis. Sample preparation involved dilution in water alone for obtaining the spectralprint data. By applying the first derivative Savitzky-Golay smoothing to spectra and interval selection by SPA, the iSPA-PLS-DA algorithm (Partial Least Squares - Discriminant Analysis) correctly classified all test samples (i.e., 100 % sensitivity, specificity, and accuracy) selecting 4 out of 15 intervals. Additionally, the quantification of honey adulteration using the iSPA-PLS algorithm achieved the lowest relative error of prediction (REP) and limit of detection (LOD) values of only 5.89 % and 7.02 mg g⁻¹, respectively, selecting 10 out of 20 intervals. The proposed method aligns with White and Green Analytical Chemistry principles, being simple, quick, affordable, and eco-friendly. It also aids in developing future protocols and legislation for honey quality.
Volume 257, Article 105314
Citations: 0
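The Savitzky-Golay first-derivative preprocessing mentioned in the abstract above can be sketched with the textbook 5-point quadratic filter, whose convolution coefficients are (-2, -1, 0, 1, 2) / 10. The window length and polynomial order used in the paper are not given in the abstract, so the 5-point quadratic choice, the function name, and the toy signal are assumptions.

```python
def savgol_first_derivative_5pt(signal, spacing=1.0):
    """First derivative from a 5-point quadratic Savitzky-Golay filter.

    Returns values for interior points only; the two points at each
    edge are dropped because the window does not fit there.
    """
    coefficients = (-2.0, -1.0, 0.0, 1.0, 2.0)
    derivative = []
    for centre in range(2, len(signal) - 2):
        window = signal[centre - 2:centre + 3]
        weighted = sum(c * v for c, v in zip(coefficients, window))
        derivative.append(weighted / (10.0 * spacing))
    return derivative


# Toy "spectrum": a parabola sampled at x = 0, 1, ..., 6.
toy_signal = [float(x * x) for x in range(7)]
toy_derivative = savgol_first_derivative_5pt(toy_signal)
```

Because the filter fits a local quadratic, it recovers the derivative of a parabola exactly, here 2x at x = 2, 3, 4. In practice a library routine such as `scipy.signal.savgol_filter` with `deriv=1` would normally be used instead of a hand-written loop.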
Statistical estimation of mean Lorentzian line width in spectra by Gaussian processes
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems Pub Date : 2024-12-24 DOI: 10.1016/j.chemolab.2024.105307
Erik Kuitunen, Matthew T. Moores, Teemu Härkönen
Abstract: We propose a statistical approach for estimating the mean line width in spectra comprising Lorentzian, Gaussian, or Voigt line shapes. Our approach uses Gaussian processes in two stages to jointly model a spectrum and its Fourier transform. We generate statistical samples for the mean line width by drawing realizations for the Fourier transform and its derivative using Markov chain Monte Carlo methods. In addition to being fully automated, our method enables well-calibrated uncertainty quantification of the mean line width estimate through Bayesian inference. We validate our method using a simulation study and apply it to an experimental Raman spectrum of β-carotene.
Volume 257, Article 105307 (open access)
Citations: 0
DD-SIMCA as an alternative tool to assess the short-term stability of a marine sediment reference material candidate
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems Pub Date : 2024-12-22 DOI: 10.1016/j.chemolab.2024.105312
Clícia A. Gomes, Carlos José M. da Silva, Maria Tereza W.D. Carneiro, Jefferson R. de Souza, Cibele Maria S. de Almeida
Abstract: Stability tests are essential in the production of a reference material (RM). They can be classified as short-term (transport conditions) and long-term (shelf time). The evaluation of the stability test is carried out using the regression method, as indicated by ISO Guide 35. However, some studies have highlighted the use of multivariate methods in the evaluation of tests performed in RM production. Therefore, this work presents the data-driven soft independent modeling of class analogy (DD-SIMCA) method as a viable alternative for evaluating data from the short-term stability test of a candidate reference material for metal determination with marine sediment matrix. The test was performed isochronously for one month at a temperature of 60 °C. The samples were decomposed (in triplicate) by the EPA 3051 A method and analyzed by inductively coupled plasma mass spectrometry (ICP-MS) and inductively coupled plasma optical emission spectrometry (ICP OES). Mass fractions of samples stored at standard temperature (−20 °C) were (mg kg⁻¹): 70.4 ± 4.5 for Ba, 12.0 ± 0.7 for Co, 17.9 ± 1.0 for Cu, 60.7 ± 3.3 for Zn, 49137 ± 4790 for Al, and 60021 ± 3090 for Fe. These values were compared with the mass fractions of samples subjected to the test condition (60 °C) for four weeks, which were (mg kg⁻¹): 70.0 ± 4.0 for Ba, 12.1 ± 0.5 for Co, 17.4 ± 0.8 for Cu, 60.6 ± 2.9 for Zn, 48388 ± 3424 for Al, and 58049 ± 1886 for Fe. A comparison was made between the mass fractions from the standard and test conditions by the regression method. The model applied in the DD-SIMCA method was constructed using two principal components, an alpha value and confidence interval of 0.05, and the instrumental quintuplicates of the samples stored at −20 °C. The samples subjected to 60 °C fit the constructed model, indicating that there was no significant difference between the properties of these samples and those that were maintained at the reference temperature. The RM candidate was considered stable at a temperature of 60 °C for a period of one month, both by the regression method and by the DD-SIMCA method. The multivariate method DD-SIMCA was considered a possible alternative and confirmatory tool in evaluating the results of short-term stability testing carried out during RM production.
Volume 257, Article 105312
Citations: 0
Assessing robust prediction models without test datasets: A causal discovery approach on near-infrared spectra
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems Pub Date : 2024-12-21 DOI: 10.1016/j.chemolab.2024.105313
Minh-Quan Nguyen, Mizuki Tsuta, Mito Kokawa
Abstract: Machine learning prediction models calibrated with spectral data use correlations between variables without considering causation. The absence of genuine cause–effect relations hinders the ability to ensure methodical prediction reproducibility. Therefore, tools supporting causal-based discovery are essential in spectroscopy and chemometrics to enhance robustness. Accordingly, this study invokes causal inference theory to establish the causal discovery index (CDI) to distinguish datasets with reliable causal structures from those prone to spurious correlations. This framework was applied to seven simulated near-infrared spectral causal structures. Simulated near-infrared spectra were utilized to ensure that the framework performance was optimized and verified appropriately in a generalized methodology. Reliable structures were confirmed to be differentiated by the differences in the mean and standard deviation of bootstrapped CDI indices. Distinctive thresholds for the mean and standard deviation were established at sample sizes of 1000 and 10,000. The framework consistently performed well with multiple spectral preprocessing methods such as derivation and dimension reduction. It was also robust with variations, surpassing the conventional test-set validation method without the use of additional independent datasets. This would benefit the applicability of the novel framework in practical situations where dataset collection can be limited. Moreover, it can be extended to various sensor-based data, encompassing only seven possible causal structures.
Volume 257, Article 105313
Citations: 0