Chemometrics and Intelligent Laboratory Systems: Latest Articles

Algae content prediction based on transfer learning and mean impact value
IF 3.7 · Zone 2 · Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 254, Article 105244. Pub Date: 2024-10-11. DOI: 10.1016/j.chemolab.2024.105244
Haonan Zhang, Xiaojing Ping, Haiying Wan, Xiaoli Luan, Fei Liu
Abstract: To improve the accuracy of algae content prediction in different water bodies, this paper proposes a chlorophyll-a prediction method based on transfer learning (TL) and the mean impact value (MIV) algorithm. First, the data collected from the Huai River are preprocessed by removing missing data and standardizing the remaining data. Then, the MIV algorithm is used to reduce the dimensionality of the data and determine the input variables of the model. Based on the selected input variables, the TL algorithm is introduced to establish the chlorophyll-a prediction model. The developed method effectively enhances prediction accuracy, especially when the number of samples is small. Simulation results verify the effectiveness and feasibility of the proposed prediction model.
Citations: 0
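The variable-selection step in the algae-prediction abstract can be illustrated with a small sketch. It assumes the commonly used MIV formulation, in which each input is perturbed by ±10% and the mean change in model output is recorded; the abstract does not state the exact procedure, so the perturbation size, the function names, and `toy_model` (a hypothetical stand-in for a trained network) are all assumptions.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]


def mean_impact_values(
    predict: Callable[[Row], float],
    samples: Sequence[Row],
    delta: float = 0.10,  # assumed perturbation size (+/-10%)
) -> List[float]:
    """Return one mean impact value (MIV) per input variable.

    For variable j, every sample is evaluated twice: once with x_j scaled
    by (1 + delta) and once with x_j scaled by (1 - delta). The MIV is the
    mean of the resulting output differences over all samples.
    """
    n_vars = len(samples[0])
    mivs = []
    for j in range(n_vars):
        total = 0.0
        for row in samples:
            up, down = list(row), list(row)
            up[j] *= 1.0 + delta
            down[j] *= 1.0 - delta
            total += predict(up) - predict(down)
        mivs.append(total / len(samples))
    return mivs


def rank_by_miv(mivs: Sequence[float]) -> List[int]:
    """Variable indices sorted from largest to smallest absolute MIV."""
    return sorted(range(len(mivs)), key=lambda j: abs(mivs[j]), reverse=True)


# Hypothetical model: output depends strongly on x0, weakly on x1, not on x2.
def toy_model(x: Row) -> float:
    return 3.0 * x[0] + 0.1 * x[1] + 0.0 * x[2]


samples = [[1.0, 2.0, 3.0], [2.0, 1.0, 0.5], [0.5, 4.0, 2.0]]
mivs = mean_impact_values(toy_model, samples)
order = rank_by_miv(mivs)
print(order)
```

Variables with small absolute MIV would be the candidates for removal. Note that a multiplicative perturbation has little effect on values close to zero, which matters if the data have been standardized beforehand.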
Recent applications of analytical quality-by-design methodology for chromatographic analysis: A review
IF 3.7 · Zone 2 · Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 254, Article 105243. Pub Date: 2024-10-10. DOI: 10.1016/j.chemolab.2024.105243
Doan Thanh Xuan, Hue Minh Thi Nguyen, Vu Dang Hoang
Abstract: Analytical Quality-by-Design (AQbD) is a systematic methodology for method development. The pharmaceutical and biopharmaceutical industries have increasingly recognized and applied AQbD concepts, guided by the overall framework provided by ICH. AQbD is intended to ensure that an analytical procedure is fit for its intended purpose throughout its entire lifecycle, leading to a well-understood and purpose-driven method. It guides each stage of the analytical process lifecycle by establishing the Analytical Target Profile (ATP), identifying critical method parameters (CMPs), and selecting critical method attributes (CMAs). By employing screening and response-surface experimental designs, significant factors are pinpointed and optimized through statistical analysis. This methodology aids in defining the design space or Method Operable Design Region (MODR) to ensure consistent method performance. This review covers the foundational principles of AQbD for method development and presents its latest applications in the period 2019–2024, with reference to chromatographic analysis of both non-synthetic and synthetic compounds in different sample matrices. The implementation of AQbD was found to generate more robust chromatographic methods and to enhance their efficiency. Nevertheless, its adoption can be hindered by the need for a comprehensive grasp of statistical analysis and experimental design, coupled with the absence of standardized directives or regulatory requirements.
Citations: 0
Layer-wise-residual-driven approach for soft sensing in composite dynamic system based on slow and fast time-varying latent variables
IF 3.7 · Zone 2 · Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 254, Article 105245. Pub Date: 2024-10-09. DOI: 10.1016/j.chemolab.2024.105245
Zhengxuan Zhang, Xu Yang, Jian Huang, Yuri A.W. Shardt
Abstract: Driven by the need for a comprehensive understanding of composite dynamic systems in industrial processes, this paper investigates a new soft sensor for quality prediction based on the extraction of slow and fast time-varying latent variables using layer-wise residuals. First, slow feature partial least squares is extended to capture long-term dependency by introducing explicit expressions of the potential state of the process into the objective function. Then, a multilayer regression model for exploring composite dynamics driven by layer-wise residuals is developed using a serial structure that can extract slow and fast time-varying latent variables that are completely orthogonal. Finally, exponential-weighted partial least squares is proposed for extracting fast time-varying dynamic latent variables by learning the exponential decay properties of the time-series data correlation. Case studies on an industrial debutanizer and a sulfur recovery unit show that the proposed approach achieves higher prediction accuracy than traditional methods.
Citations: 0
Applicability domain of a calibration model based on neural networks and infrared spectroscopy
IF 3.7 · Zone 2 · Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 254, Article 105242. Pub Date: 2024-10-05. DOI: 10.1016/j.chemolab.2024.105242
M. Suliany Rodríguez-Barrios, Joan Ferré, M. Soledad Larrechi, Enric Ruiz
Abstract: Artificial neural networks are used as calibration models in routine analytical determinations that involve spectroscopic data. To ensure that the model will generate reliable predictions for new samples, the applicability domain must be well defined. This article describes a strategy for establishing the limits of the applicability domain when the calibration model is a feed-forward neural network. The applicability domain was defined by two limits: 1) the 0.99 quantile of the squared Mahalanobis distance calculated from the network activations of the training set and 2) the 0.99 quantile of the reconstruction error of the training spectra using either an autoencoder network or a decoder network. A new sample with a squared Mahalanobis distance and/or spectral residuals beyond these limits is said to be outside the applicability domain, and the prediction is questionable. The approach was illustrated by predicting the density of diesel fuel samples from mid-infrared spectra and the fat content in meat from near-infrared spectra. The methodology could correctly detect anomalous spectra in prediction using either the autoencoder or the decoder.
Citations: 0
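The two-limit rule described in the applicability-domain abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: synthetic random numbers stand in for the network activations and reconstruction errors, the quantile uses linear interpolation between order statistics, and all function names are hypothetical.

```python
import random
from typing import List, Sequence

Vec = Sequence[float]


def mean_vector(rows: Sequence[Vec]) -> List[float]:
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance(rows: Sequence[Vec], mu: Vec) -> List[List[float]]:
    """Sample covariance matrix (n - 1 denominator)."""
    n, p = len(rows), len(mu)
    return [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]


def solve(a: Sequence[Vec], b: Vec) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def sq_mahalanobis(x: Vec, mu: Vec, cov: Sequence[Vec]) -> float:
    d = [xi - mi for xi, mi in zip(x, mu)]
    return sum(di * zi for di, zi in zip(d, solve(cov, d)))


def quantile(values: Sequence[float], q: float) -> float:
    """Quantile by linear interpolation between order statistics."""
    v = sorted(values)
    pos = q * (len(v) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(v) - 1)
    return v[lo] + (pos - lo) * (v[hi] - v[lo])


def outside_domain(act, err, mu, cov, d2_limit, err_limit) -> bool:
    """Flag a sample if either limit is exceeded (the 'and/or' rule)."""
    return sq_mahalanobis(act, mu, cov) > d2_limit or err > err_limit


# Synthetic stand-ins for training-set activations and reconstruction errors.
rng = random.Random(0)
train_act = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
train_err = [abs(rng.gauss(0, 0.1)) for _ in range(200)]

mu = mean_vector(train_act)
cov = covariance(train_act, mu)
d2_limit = quantile([sq_mahalanobis(a, mu, cov) for a in train_act], 0.99)
err_limit = quantile(train_err, 0.99)

print(outside_domain(mu, 0.0, mu, cov, d2_limit, err_limit))
print(outside_domain([10.0, 10.0, 10.0], 0.0, mu, cov, d2_limit, err_limit))
```

In practice the activations would come from a hidden layer of the calibration network and the errors from an autoencoder or decoder reconstruction of each spectrum. Both are replaced by random draws here only to keep the sketch self-contained.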
Machine learning based modeling for estimation of drug solubility in supercritical fluid by adjusting important parameters
IF 3.7 · Zone 2 · Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 254, Article 105241. Pub Date: 2024-10-03. DOI: 10.1016/j.chemolab.2024.105241
Yaoyang Liu, Morug Salih Mahdi, Usama Kadem Radi, Ali Jihad, Ali Hamid AbdulHussein, Irshad Ahmad, Nasrin Mansuri, Mostafa Adnan Abdalrahman, Ahmed Alkhayyat, Ahmed Faisal
Abstract: Here, we employed machine learning models to predict the solubility of the drug capecitabine in supercritical carbon dioxide, used as a green solvent. The aim is to investigate the drug's suitability for processing into nanodrugs with enhanced bioavailability in the body. In the data set used, P (pressure) and T (temperature) serve as inputs, and Y, the solubility, is the only output for building the models. This study uses DT (decision tree) and MLP (multilayer perceptron) as the core models. However, conventional algorithms used individually in their raw form may not provide accurate and general results. Ensemble methods such as boosting improve model performance. In addition, single and ensemble models built on these core models have hyper-parameters whose optimization affects the final models. Meta-heuristic algorithms are popular for tuning hyper-parameters. In this research, we used a new hybrid framework that couples the basic models with the AdaBoost algorithm (as an ensemble method) and the PO and CS algorithms (as optimizers) to obtain four different models. The MLP model boosted with AdaBoost and tuned with the PO algorithm showed the best fitting accuracy among all models. This model reduces the RMSE to 1.71, the MSE to 2.92, and the MAE to 1.42.
Citations: 0
Benchmarking multiblock methods with canonical factorization
IF 3.7 · Zone 2 · Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 254, Article 105240. Pub Date: 2024-10-02. DOI: 10.1016/j.chemolab.2024.105240
Stéphanie Bougeard, Caroline Peltier, Benoit Jaillais, Jean-Claude Boulet, Mohamed Hanafi
Abstract: Data measured on the same observations and organized in blocks of variables (from different measurement sources or deduced from topics specified by the user) are common in practice. Multiblock exploratory methods are useful tools to extract information from data in a reduced and interpretable common space. However, many methods have been proposed independently, and users are often at a loss when selecting the appropriate one, especially as the methods do not always lead to the same results and their outputs do not have the same form. For this purpose, data decomposition by canonical factorization was introduced and then applied to some widely used methods: CPCA, MCOA, MFA, STATIS and CCSWA. The methods were compared on simulated (resp. real) data whose structure is controlled (resp. known). Theoretical and practical results indicate that the block structure must be carefully explored beforehand. The number of block variables and the block-variance distribution along dimensions impact the choice of block scaling. The observation structure within and between blocks impacts the choice of method. CPCA and MCOA mix common and specific information, STATIS highlights common structure only, whereas CCSWA focuses on specific information. To enable these diagnoses, the methods and the proposed comparison tools are available in R, Matlab or Galaxy.
Citations: 0
KF-PLS: Optimizing Kernel Partial Least-Squares (K-PLS) with Kernel Flows
IF 3.7 · Zone 2 · Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 254, Article 105238. Pub Date: 2024-10-01. DOI: 10.1016/j.chemolab.2024.105238
Zina-Sabrina Duma, Jouni Susiluoto, Otto Lamminpää, Tuomas Sihvonen, Satu-Pia Reinikainen, Heikki Haario
Abstract: Partial Least-Squares (PLS) regression is a widely used tool in chemometrics for performing multivariate regression. As PLS has a limited capacity for modelling non-linear relations between the predictor variables and the response, Kernel PLS (K-PLS) has been introduced for modelling non-linear predictor-response relations. Most available studies use fixed kernel parameters, which limits the performance potential of the method. Only a few studies have been conducted on optimizing the kernel parameters for K-PLS. In this article, we propose a methodology for kernel function optimization based on Kernel Flows (KF), a technique developed for Gaussian Process Regression (GPR). The results are illustrated with four case studies, which cover both numerical examples and real data used in classification and regression tasks. K-PLS optimized with KF, called KF-PLS in this study, is shown to yield good results in all illustrated scenarios, outperforming literature results and other non-linear regression methodologies. In the present study, KF-PLS has been compared to convolutional neural networks (CNN), random trees, ensemble methods, support vector machines (SVM), and GPR, and it has proved to perform very well.
Citations: 0
AIPs-DeepEnC-GA: Predicting anti-inflammatory peptides using embedded evolutionary and sequential feature integration with genetic algorithm based deep ensemble model
IF 3.7 · Zone 2 · Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 254, Article 105239. Pub Date: 2024-09-29. DOI: 10.1016/j.chemolab.2024.105239
Ali Raza, Jamal Uddin, Quan Zou, Shahid Akbar, Wajdi Alghamdi, Ruijun Liu
Abstract: Inflammation is a biological response to harmful stimuli including infections, damaged cells, tissue injuries, and toxic chemicals. It plays an essential role in facilitating tissue repair by eliminating pathogenic microorganisms. Currently, numerous therapies are applied to treat autoimmune and inflammatory diseases. However, these conventional anti-inflammatory medications are often labor-intensive, costly, and associated with adverse side effects. Recently, researchers have identified anti-inflammatory peptides (AIPs) as a cost-effective alternative for treating several inflammatory diseases, owing to their high selectivity for target cells and minimal side effects. In this study, we introduce a novel computational predictor, AIPs-DeepEnC-GA, developed to accurately predict AIP samples. The training sequences are encoded using embedded evolutionary features based on a novel n-spaced dipeptide-based position-specific scoring matrix (NsDP-PSSM) and a pseudo position-specific scoring matrix (PsePSSM). Additionally, the reduced amino acid alphabet (RAAA-11) and composite physiochemical properties (CPP) are employed to capture cluster-physiochemical properties based on structural information. A hybrid feature strategy is then applied, integrating the embedded evolutionary features with the CPP and RAAA-11 descriptors to overcome the limitations of individual encoding methods. Minimum redundancy and maximum relevance (mRMR) is used to select the optimal features. The selected features are used to train four different deep-learning models. The predictive labels generated by these models are provided to a genetic algorithm to form a deep-ensemble training model. The proposed AIPs-DeepEnC-GA model achieved a ∼15% increase in predictive accuracy, reaching 94.39%, and a 19% improvement in the area under the curve (AUC), achieving a value of 0.98 using training sequences. For independent datasets, our method obtained improved accuracies of 91.87% and 89.21%, with AUC values of 0.94 and 0.92 for Ind-I and Ind-II, respectively. The proposed AIPs-DeepEnC-GA model demonstrates an 11% improvement in predictive accuracy over existing AIP computational models using training samples. The efficacy and reliability of this model make it a promising tool for both drug development and academic research.
Citations: 0
An automated Peak Group Analysis for vibrational spectra analysis
IF 3.7 · Zone 2 · Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 254, Article 105234. Pub Date: 2024-09-23. DOI: 10.1016/j.chemolab.2024.105234
Mathias Sawall, Christoph Kubis, Benedict N. Leidecker, Lukas Prestin, Tomass Andersons, Martina Beese, Jan Hellwig, Robert Franke, Armin Börner, Klaus Neymeyr
Abstract: Peak Group Analysis (PGA) is a multivariate curve resolution technique that attempts to extract single pure component spectra from time series of spectral mixture data. It requires that the mixture spectra consist of relatively sharp peaks, as is typical in IR and Raman spectroscopy. PGA aims to construct from individual peaks the associated pure component spectra in the form of nonnegative linear combinations of the right singular vectors of the spectral data matrix.
This work presents an automated PGA (autoPGA) that starts with upstream peak detection applied to time series of spectra, combining different window-based peak detection techniques with balanced peak acceptance criteria and peak grouping to deal with repeated detections. The next step is a single-spectrum-oriented PGA analysis. This is followed by a downstream correlation analysis to identify pure component spectra that occur multiple times. AutoPGA provides a complete pure component factorization of the matrix of measured data. The algorithm is applied to FT-IR data sets on various rhodium carbonyl complexes and from an equilibrium of iridium complexes.
Citations: 0
Data pre-processing for paper-based colorimetric sensor arrays
IF 3.7 · Zone 2 · Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 254, Article 105237. Pub Date: 2024-09-21. DOI: 10.1016/j.chemolab.2024.105237
Bahram Hemmateenejad, Knut Baumann
Abstract: The responses of paper-based colorimetric sensor arrays (CSAs) are typically recorded by an imaging device. The color values of the images are subjected to chemometric data analysis in order to extract the relevant information. As with data from other analytical instruments, these data must be pre-processed before further analysis. This study is the first comprehensive and systematic investigation into the impact of data pre-processing techniques on the quality of subsequent data analysis methods applied to imaging data collected from paper-based colorimetric sensor arrays. With color difference data (calculated by subtracting the images of the sensors before exposure from those after exposure), pre-treatment of the data was not a critical factor, although it could reduce the complexity of the model. For example, the number of principal components in the principal component-linear discriminant analysis model was reduced from eight (for data that had not been pre-processed) to three (for pre-processed data) while achieving the same level of accuracy (92%). The pivotal role of data pre-processing became clear, however, when examining data sets collected immediately after exposure to the samples' vapor. An appropriate pre-processing method was shown to eliminate or significantly reduce between-sensor variations, removing the need to include data from images taken prior to exposure. For classification, the object pre-processing methods that showed particular promise were mean (or median) centering, Pareto scaling and standard normal variate. To illustrate, in the analysis of volatile organic compounds by an array of metallic nanoparticles, the cross-validation classification accuracy of the unprocessed data, which was 70%, increased to 95% when unit variance scaling and range scaling were applied to objects and variables, respectively. In the calibration phase, the majority of pre-processing methods enhanced the quality of the regression models. Using suitable pre-processing methods for both objects and variables eliminated the need for the pre-exposure images of the CSAs.
Citations: 0
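The object pre-processing methods named in the sensor-array abstract (mean centering, Pareto scaling, standard normal variate) can be sketched in a few lines. This is an illustrative sketch using textbook definitions, with the sample standard deviation (n - 1 denominator) as an assumption; the paper's exact variants and its data are not given in the abstract, and the two example rows below are made up.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Callable, List, Sequence

Vec = Sequence[float]


def mean_center(v: Vec) -> List[float]:
    m = mean(v)
    return [x - m for x in v]


def pareto_scale(v: Vec) -> List[float]:
    """Center, then divide by the square root of the standard deviation."""
    m, s = mean(v), stdev(v)  # stdev must be non-zero
    return [(x - m) / sqrt(s) for x in v]


def snv(v: Vec) -> List[float]:
    """Standard normal variate: center, then divide by the standard deviation."""
    m, s = mean(v), stdev(v)  # stdev must be non-zero
    return [(x - m) / s for x in v]


def apply_to_objects(data: Sequence[Vec], f: Callable[[Vec], List[float]]):
    """Apply f to every row (object, i.e. one measured sample)."""
    return [f(row) for row in data]


def apply_to_variables(data: Sequence[Vec], f: Callable[[Vec], List[float]]):
    """Apply f to every column (variable), then restore row-major layout."""
    cols = [f(list(col)) for col in zip(*data)]
    return [list(row) for row in zip(*cols)]


# Made-up responses: the second row equals 2 * first row + 90, i.e. the same
# pattern recorded with a different offset and gain.
data = [[10.0, 20.0, 30.0], [110.0, 130.0, 150.0]]

snv_rows = apply_to_objects(data, snv)
pareto_rows = apply_to_objects(data, pareto_scale)
centered_cols = apply_to_variables(data, mean_center)
print(snv_rows)
```

After row-wise SNV both rows become identical, which illustrates how an object-wise transform can remove offset and gain differences without a reference measurement. Whether this matches the between-sensor effects studied in the paper is an assumption.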