Chemometrics and Intelligent Laboratory Systems: Latest Articles

Comparison of colour and texture feature extraction methods to predict anthocyanins content in Sangiovese grapes
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 263, Article 105446; Pub Date: 2025-05-27; DOI: 10.1016/j.chemolab.2025.105446
Camilla Menozzi, José Manuel Prats-Montalbán, Rosalba Calvini, Alessandro Ulrici
Abstract: Colour and texture are the two main sources of information contained in RGB images of food products. Different image-level approaches are available to analyse image properties based on the extraction of colour and texture features, and the selection of the most appropriate method is a critical point, since it could significantly impact the outcomes. The present study has three main objectives. Firstly, we propose an innovative data dimensionality reduction method to extract and codify the texture features of an RGB image into a one-dimensional signal, named texturegram (TXG). Then, the TXG approach is compared with different image-level feature extraction methods, such as colourgrams (CLG), Soft Colour Texture Descriptors (SCTD) and Grey Level Co-occurrence Matrices (GLCM). These techniques were used to analyse a benchmark dataset of RGB images, already considered in a previous study, to build Partial Least Squares (PLS) models relating the image features to the anthocyanin content of red grape samples. We also investigated the possible advantages of combining the colour and texture information brought by the different image-level techniques using data fusion. PLS models were calculated considering different partitions of the RGB image dataset into training and test sets. The performances of the different models were statistically evaluated by means of Analysis of Variance (ANOVA) and Principal Component Analysis (PCA). Overall, the results suggested an interesting, even if slight, improvement in model performance when fusing CLG and TXG, and also highlighted the hybrid nature of TXG, which simultaneously explores colour and texture properties.
Citations: 0
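Of the texture methods compared in this abstract, the Grey Level Co-occurrence Matrix is the simplest to illustrate. The following is a minimal sketch, not the authors' implementation: it assumes a single-channel image already quantised to a small number of grey levels, a single pixel offset, and a non-symmetric count. The function names and the toy image are illustrative only.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix for pixel pairs separated by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                # Row index: grey level of the reference pixel;
                # column index: grey level of its neighbour.
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Weights each co-occurrence by the squared grey-level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def homogeneity(p):
    """Close to 1 when neighbouring pixels tend to share the same grey level."""
    n = len(p)
    return sum(p[i][j] / (1 + abs(i - j)) for i in range(n) for j in range(n))


# Toy 4x4 image with 4 grey levels (illustrative data, not from the study).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)
features = [contrast(p), homogeneity(p)]
```

Scalar descriptors such as these would then form one row of the predictor matrix passed to a regression model such as PLS.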
Applications of artificial intelligence and machine learning in combination with surface-enhanced Raman spectroscopy (SERS)
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 263, Article 105445; Pub Date: 2025-05-27; DOI: 10.1016/j.chemolab.2025.105445
Hashim Jabbar, Inass Abdulah Zgair, Kamran Heydaryan, Shaymaa Awad Kadhim, Saeideh Mehmandoust, Vahid Eskandari, Hossein Sahbafar
Abstract: Surface-enhanced Raman spectroscopy (SERS) offers exceptional sensitivity for identifying and detecting a wide range of compounds by greatly enhancing Raman signals from molecules on metal surfaces. The application of SERS has been further transformed by the integration of artificial intelligence (AI) and machine learning (ML), which automates spectrum interpretation, enhances identification accuracy, and optimizes experimental settings. This paper examines current developments in the synergistic use of AI and ML with SERS in a variety of disciplines, including environmental monitoring, food safety, pathogen detection, and disease diagnosis. Published studies show that these models can distinguish between analytes such as bacteria, viruses, cancer cells, and chemical substances with above 95 % accuracy, and some studies even reported 100 % accuracy in sample identification. Food safety, environmental monitoring, and clinical diagnostics might all be revolutionized by SERS-ML techniques because of their great sensitivity, specificity, and reliability. Future research should focus on extending clinical applications, enhancing substrate capabilities and detection limits, incorporating sophisticated machine learning techniques, and broadening the range of applications. To improve the robustness and practicality of these methodologies, further validation in larger cohorts and real-world contexts is also emphasized. The study demonstrates how combining AI/ML with SERS offers the potential to fundamentally change materials research, environmental monitoring, diagnostics, and other related fields.
Citations: 0
SMILES-driven machine learning for high-throughput investigation of anti-corrosion materials
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 263, Article 105441; Pub Date: 2025-05-25; DOI: 10.1016/j.chemolab.2025.105441
Muhamad Akrom, Harun Al Azies, Wise Herowati, Totok Sutojo, Supriadi Rustad, Hermawan Kresno Dipojono, Hideaki Kasai
Abstract: This investigation examines the viability of a machine learning (ML) approach that uses the simplified molecular input line entry system (SMILES) as the sole input feature for predicting the corrosion inhibition efficiency (CIE) of pyridine-quinoline compounds, replacing various quantum chemical properties (QCP). Employing the molecular access system (MACCS) fingerprint technique simplifies the processing of molecular structures, enhancing data efficiency. The ML algorithm, notably the gradient boosting (GB) model, showcases superior predictive capabilities, as evidenced by R² and RMSE values of 0.92 and 0.07, respectively. This outcome is comparable to predictions employing 20 QCP features, which yield R² and RMSE values of 0.90 and 0.08, respectively. The study substantiates SMILES as a robust single feature for accurate CIE prediction, revealing a moderate correlation between SMILES-represented structures and CIE values. This underscores the effectiveness of SMILES-based ML in assessing corrosion inhibition potential, thereby advancing predictive modeling in corrosion science. Integrating machine learning and SMILES notation presents an efficient approach for evaluating the corrosion inhibition capacity of diverse molecular structures.
Citations: 0
Optimizing dehydrated soups with advanced algorithms: A novel application of D-Optimal mixture design and NSGA-II for healthier formulations
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 263, Article 105443; Pub Date: 2025-05-24; DOI: 10.1016/j.chemolab.2025.105443
Hamza Moussa, Farid Dahmoune, Yassine Noui, Amal Mameri, Hichem Tahraoui, Salma Menasria, Souhila Abbas, Sarah Hamid, Nourelimane Benzitoune, Abdeltif Amrane
Abstract: This study explores the optimization of dehydrated soup formulations to enhance their total phenolic content (TPC) and total flavonoid content (TFC) using a D-optimal mixture design and the NSGA-II (Non-dominated Sorting Genetic Algorithm II). A D-optimal mixture design with 15 experimental runs was employed to study the effect of the selected ingredients, highlighting their significance for TPC and TFC. The NSGA-II algorithm generated 19 optimal solutions for maximizing TPC and TFC. Experimental validation of the best solution predicted a TPC of 8.15 mg GAE/g dw and a TFC of 1.30 mg QE/g dw, with experimental values of 10.29 mg GAE/g dw and 1.38 mg QE/g dw, confirming the model's accuracy. The optimized soup comprised 47.70 % broccoli, 20 % celery, 20 % onion, 10 % vegetable mix, and 2.29 % salt. It exhibited significantly higher TPC and better antioxidant activity, as measured by DPPH• and ABTS•+ assays. Molecular docking identified bioactive compounds with strong binding affinities to KEAP1 and BCL2 proteins, suggesting potential therapeutic applications for oxidative stress. A MATLAB interface was developed to facilitate practical application in the food industry, demonstrating the effective use of optimization techniques to create high-quality, enriched dehydrated soups and providing a foundation for future research in food formulation.
Citations: 0
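NSGA-II returns a set of solutions rather than a single optimum because it ranks candidates by Pareto dominance. The sketch below shows only that dominance filter for two maximised objectives, as an aid to reading the abstract; it is not the full genetic algorithm and not the authors' MATLAB code. The candidate (TPC, TFC) pairs are made-up illustrative values, not results from the study.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` in every objective (maximisation)
    and strictly better in at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (TPC, TFC) predictions for five candidate formulations.
candidates = [(8.0, 1.2), (7.5, 1.4), (7.0, 1.1), (7.8, 1.0), (6.0, 1.5)]
front = pareto_front(candidates)
```

With these illustrative values, three candidates survive: each trades a lower TPC for a higher TFC, so none is strictly better than another.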
Deep matrix factorization considering dynamic constraints to complete missing data of complex industrial processes
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 263, Article 105433; Pub Date: 2025-05-23; DOI: 10.1016/j.chemolab.2025.105433
Zhiyi Ji, Chunhua Yang, Jingxiu He, Yonggang Li, Dong Li
Abstract: In complex industrial processes, data loss is an unavoidable issue. Due to the lengthy process flow and complex reaction mechanisms, traditional data completion methods fail to deliver satisfactory results when data loss occurs. To address this challenge, this paper proposes deep matrix factorization considering dynamic constraints (DMFDC). This algorithm combines traditional matrix factorization with artificial neural networks, leveraging the strengths of neural networks to approximate nonlinear mappings in latent variable models and utilizing all available information to minimize discrepancies between raw and generated data. Additionally, DMFDC accounts for the dynamic characteristics of the complex industrial system, employing differential operations to transform irregularly changing industrial data into a more stable sequence, thereby enabling the model to better capture data evolution patterns. This approach allows DMFDC to address the issue of missing dynamic data in complex industrial processes and to predict missing values more accurately. To evaluate its effectiveness, we conducted case studies under various missing data conditions based on a digestion dataset collected from actual alumina production sites. The results indicate that DMFDC achieves higher data completion accuracy than other methods, confirming the applicability of our approach in diverse situations involving missing data.
Citations: 0
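The "differential operations" mentioned in this abstract can be pictured with plain first-order differencing, which turns a trending series into a steadier one and can be inverted exactly. This is a sketch under that assumption; the abstract does not specify the exact operator DMFDC uses, and the function names and sample values here are illustrative.

```python
def difference(series):
    """First-order differences: d[t] = x[t+1] - x[t]."""
    return [b - a for a, b in zip(series, series[1:])]


def undifference(first_value, diffs):
    """Rebuild the original series from its first value and its differences."""
    series = [first_value]
    for d in diffs:
        series.append(series[-1] + d)
    return series


# Illustrative trending signal (not data from the study).
signal = [10.0, 12.0, 15.0, 19.0]
diffs = difference(signal)
restored = undifference(signal[0], diffs)
```

A completion model could be fitted to `diffs` instead of `signal`, and its output mapped back with `undifference`.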
An unsupervised domain adaptation regression method in kernel partial least squares subspace embedded with joint statistical and manifold alignment for Fourier-transform infrared spectroscopy in agri-food analysis
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 263, Article 105442; Pub Date: 2025-05-23; DOI: 10.1016/j.chemolab.2025.105442
Peng Shan, Ruige Yang, Teng Liang, Lin Zhang, Yuliang Zhao, Zhonghai He, Silong Peng
Abstract: Within the agri-food sector, the precise measurement of essential ingredients in samples across different measurement contexts using Fourier Transform Infrared spectroscopy (FTIR) is crucial, underscoring the need for advanced calibration methods with extensive generalizability. Domain adaptation (DA) in machine learning is a pivotal area of research focused on training models to be adaptable to both source and target domains with differing data distributions. This paper examines the application of unsupervised domain adaptation (UDA) to FTIR analysis of agri-food products, utilizing unlabeled data from the target domain to address the challenge of limited reference samples. To realize complex nonlinear adaptation, the paper combines the statistical alignment of domain-invariant iterative partial least squares (DIPALS) with the nonlinear ability of kernel domain adaptive partial least squares (da-PLS). It presents a novel UDA regression method in kernel partial least squares subspace embedded with joint statistical and manifold alignment (JSMKPLS), which further integrates a manifold alignment strategy that can incorporate geometric nonlinear structure into the adaptation process. The framework simultaneously exploits the statistical and geometrical properties in reproducing kernel Hilbert space (RKHS) and extracts the domain-invariant features. Experimental results on corn, rice, γ-PGA fermentation and wheat datasets confirm the effectiveness of JSMKPLS for FTIR analysis.
Citations: 0
Confidence interval on the kinetic parameters of simple condensed phase reactions
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 263, Article 105434; Pub Date: 2025-05-22; DOI: 10.1016/j.chemolab.2025.105434
Alireza Aghili, Amir Hossein Haghighi, Amir Hossein Shabani
Abstract: Confidence intervals play a crucial role in statistical inference, as they provide a range of values within which a population parameter is likely to fall, thereby enabling researchers to quantify the uncertainty associated with their estimates. This study proposes a new approach for estimating the confidence intervals on kinetic parameters of simple condensed phase reactions using a combined kinetic analysis and multiple linear regression. The conversion function may be represented in the form of truncated Šesták-Berggren (TSB), Šesták-Berggren (SB), or discrete cosine transform (DCT) models. The confidence intervals for the pre-exponential factor, activation energy, and reaction exponents are calculated directly from multiple linear regression. For the rate constant and conversion function, however, the variance of these quantities must be estimated using the delta method. The proposed method was applied to kinetic data from a simulated reaction as well as to data from the thermal decomposition of a commercial poly(methyl methacrylate). The results revealed that the DCT model provides highly accurate estimates with extremely narrow confidence intervals for the kinetic parameters of the simulated reaction, whereas the TSB and SB models may exhibit systematic errors. The research also includes GNU Octave/MATLAB codes enabling readers to generate smooth reaction rate curves from noisy experimental data using the Fourier cosine series expansion and discrete cosine transform, approximate conversion functions with TSB, SB, and DCT models, and determine kinetic parameters and their confidence intervals for simple reactions through the new combined kinetic analysis methods.
Citations: 0
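The delta-method step for the rate constant can be sketched for the Arrhenius form k = A·exp(−E/(RT)), working with ln k = ln A − E/(RT), whose gradient with respect to (ln A, E) is (1, −1/(RT)). The code below is an illustrative Python sketch, not the Octave/MATLAB code the abstract refers to. The normal quantile z = 1.96 and all numerical inputs are assumptions chosen only to make the example run.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)


def rate_constant_ci(ln_a, e_a, var_ln_a, var_e, cov_ln_a_e, temp_k, z=1.96):
    """Approximate confidence interval for k(T) from the (co)variances of ln A and E.

    Delta method: Var(ln k) ~= g' * Sigma * g, with g = (1, -1/(R*T)).
    Returns (k, lower, upper). z = 1.96 is an assumed normal quantile.
    """
    rt = R * temp_k
    ln_k = ln_a - e_a / rt
    g_ln_a, g_e = 1.0, -1.0 / rt
    var_ln_k = (g_ln_a ** 2 * var_ln_a
                + g_e ** 2 * var_e
                + 2.0 * g_ln_a * g_e * cov_ln_a_e)
    se = math.sqrt(var_ln_k)
    # Back-transform the interval on ln k to an interval on k.
    return math.exp(ln_k), math.exp(ln_k - z * se), math.exp(ln_k + z * se)


# Hypothetical regression output: ln A = 30, E = 150 kJ/mol, strongly correlated errors.
k, lower, upper = rate_constant_ci(
    ln_a=30.0, e_a=150_000.0,
    var_ln_a=0.04, var_e=1.0e6, cov_ln_a_e=190.0,
    temp_k=600.0,
)
```

Because the estimates of ln A and E are usually strongly positively correlated, the covariance term can shrink the interval for k considerably, which is why it should not be dropped.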
Pattern modeling and fault detection based on dynamic controlled autoencoder
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 263, Article 105422; Pub Date: 2025-05-19; DOI: 10.1016/j.chemolab.2025.105422
Wei Guo, Xiaoli Luan, Fei Liu
Abstract: Industrial processes often exhibit significant nonlinear and dynamic characteristics. To effectively monitor these processes, this paper proposes a dynamic controlled autoencoder (DCAE) model for pattern extraction, which primarily consists of an autoencoder and dynamic mapping components. It is capable of simultaneously extracting the nonlinear structural relationships of process variables in static space and their nonlinear dynamics in the time domain and, in particular, of establishing the dynamic causality between control input and pattern. The dynamic controlled pattern extracted using DCAE can sufficiently represent the operating information of the nonlinear process. Then, the relationships between DCAE modeling errors and model variables are explored, leading to the construction of error statistics for monitoring industrial processes and the development of a DCAE-based fault detection scheme. Finally, a case study of an industrial boiler combustion system illustrates the effectiveness and superiority of the DCAE model in extracting the pattern of industrial processes and performing fault detection.
Citations: 0
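Error-based monitoring of the kind described in this abstract generally compares a per-sample reconstruction error against a control limit learned from fault-free data. The sketch below assumes a squared prediction error (SPE) statistic and an empirical percentile limit. Both are common choices but are assumptions here, since the abstract does not state which statistics the DCAE scheme uses. The model that produces the reconstructions is outside the sketch.

```python
def spe(x, x_hat):
    """Squared prediction error between a sample and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def control_limit(train_spe, percent=99):
    """Empirical control limit: the order statistic at the given percentile
    of SPE values collected under normal operation."""
    ordered = sorted(train_spe)
    index = min(len(ordered) - 1, percent * len(ordered) // 100)
    return ordered[index]


def detect_faults(test_spe, limit):
    """Flag every sample whose SPE exceeds the control limit."""
    return [s > limit for s in test_spe]


# Illustrative SPE values from normal operation (not data from the study).
normal_spe = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
limit = control_limit(normal_spe, percent=90)
flags = detect_faults([0.5, 12.0], limit)
```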
Kolmogorov–Arnold neural network for identification of functional groups from FTIR spectra
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 263, Article 105421; Pub Date: 2025-05-17; DOI: 10.1016/j.chemolab.2025.105421
Tomasz Urbańczyk, Jakub Bożek, Szymon Mirczak, Jarosław Koperski, Marek Krośnicki
Abstract: A new deep neural network architecture for identifying the functional groups of molecules from FTIR spectra is presented. The architecture employs the innovative Kolmogorov–Arnold layers. Instead of a single weight, each input of a neuron belonging to these layers possesses an independent learnable activation function. The article analyzes the prediction quality of a convolutional network containing Kolmogorov–Arnold layers in comparison with a classic convolutional neural network for 22 functional groups. The obtained results are compared with the results available from other studies.
Citations: 0
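The key idea in this abstract, a learnable function on each input in place of a scalar weight, can be shown with a toy neuron. This sketch swaps in a piecewise-linear interpolant on a fixed grid for the spline parametrisation normally used in Kolmogorov–Arnold networks, purely to keep the example short. The class and function names are hypothetical, and training is omitted.

```python
from bisect import bisect_right


class EdgeFunction:
    """Learnable 1-D function: linear interpolation between values on a fixed grid.
    The grid values play the role of trainable parameters."""

    def __init__(self, grid, values):
        self.grid = list(grid)
        self.values = list(values)

    def __call__(self, x):
        g, v = self.grid, self.values
        if x <= g[0]:
            return v[0]      # clamp below the grid
        if x >= g[-1]:
            return v[-1]     # clamp above the grid
        i = bisect_right(g, x) - 1
        t = (x - g[i]) / (g[i + 1] - g[i])
        return (1 - t) * v[i] + t * v[i + 1]


def kan_neuron(inputs, edge_functions):
    """Output is the sum of each input passed through its own learnable function."""
    return sum(f(x) for f, x in zip(edge_functions, inputs))


# Two inputs, each with its own edge function (illustrative parameter values).
f1 = EdgeFunction([0.0, 1.0, 2.0], [0.0, 1.0, 4.0])
f2 = EdgeFunction([0.0, 1.0, 2.0], [1.0, 0.0, 1.0])
output = kan_neuron([1.5, 0.5], [f1, f2])
```

A conventional neuron would instead compute a weighted sum `w1*x1 + w2*x2` and apply one fixed activation to the result.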
Leveraging PLS and Lasso in MARS for high-dimensional FTIR data: A hybrid proposed model for antidiabetic activity of schiff base compounds
IF 3.7, Zone 2, Chemistry
Chemometrics and Intelligent Laboratory Systems, Volume 263, Article 105418; Pub Date: 2025-05-16; DOI: 10.1016/j.chemolab.2025.105418
Sughra Sarwar, Tahir Mehmood, Muhammad Arfan
Abstract: In this study, we utilized Fourier Transform Infrared (FTIR) spectral data to create and analyze multiple regression models to predict the anti-diabetic potential of synthesized Schiff bases. Schiff bases are a wide range of compounds characterized by a double bond between the nitrogen and carbon atoms. Their versatility stems from the various strategies by which they can be coupled with multiple alkyl or aryl substituents. The models examined were MARS, PLS, SPLS, KPLS, MARS-SPLS, MARS-Kernel-PLS, and an innovative method called MARS-PLS-Lasso, which combines the traditional MARS algorithm with partial least squares and Lasso regularization. To assess the efficacy of the proposed method, we used a high-dimensional spectral data set comprising 19 samples and 1627 predictors. To capture nonlinear interactions in the data, MARS-PLS-Lasso improves the conventional MARS approach by creating adaptive basis functions for each predictor. Lasso regularization was used to select the most pertinent basis functions and ensure that only the most important predictors were kept. Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) were computed on the training and test datasets to evaluate prediction performance. The MARS-PLS-Lasso model outperformed the typical MARS (RMSE = 30.48, MAE = 23.46) and PLS (RMSE = 14.00, MAE = 11.90) models by achieving the lowest test RMSE of 13.00 and MAE of 10.55. In a simulation study, MARS-PLS-Lasso again performed the best among basis-integrated models for both low and high correlated data, with the lowest RMSE (0.4708) and MAE (0.2812) for data of dimensions (20, 50), and RMSE (0.685, 0.4806) and MAE (0.1325, 0.3819) for data of dimensions (20, 5000), respectively. These results suggest that MARS-PLS-Lasso is an effective way to model complicated relationships in high-dimensional data and improve predictive accuracy.
Citations: 0
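Two ingredients of this abstract are easy to make concrete: the hinge basis functions from which MARS builds its model, and the RMSE and MAE metrics used to compare the models. The sketch below shows only these pieces, with illustrative numbers. It does not reproduce the MARS-PLS-Lasso procedure, whose details are not given in the abstract.

```python
import math


def hinge_pair(x, knot):
    """The mirrored pair of MARS basis functions: max(0, x - knot) and max(0, knot - x)."""
    return max(0.0, x - knot), max(0.0, knot - x)


def rmse(y_true, y_pred):
    """Root mean square error; penalises large errors more heavily than MAE."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


# Illustrative values only: one prediction is off by 4, the rest are exact.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 8.0]
scores = (rmse(y_true, y_pred), mae(y_true, y_pred))
basis = hinge_pair(3.0, knot=2.0)
```

In this toy case the single large error gives an RMSE twice the MAE, which is why reporting both, as the study does, says something about how the errors are distributed.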