Journal of Chemometrics: Latest Articles (Page 3)

A Method for Measuring Similarity or Distance of Molecular and Arbitrary Graphs Based on a Collection of Topological Indices

IF 2.1, Zone 4 (Chemistry)

Journal of Chemometrics, Pub Date: 2025-07-15, DOI: 10.1002/cem.70047

Mert Sinan Oz

Volume 39, Issue 7. Journal Article.

Abstract: The comparison of graphs using various types of quantitative structural similarity or distance measures has an important place in many scientific disciplines. Two of these are cheminformatics and chemical graph theory, in which the structural similarity or distance between molecular graphs is analyzed by calculating the Jaccard/Tanimoto index based on molecular fingerprints. A novel method is proposed to measure the structural similarity or distance for molecular and arbitrary graphs. This method calculates the Jaccard/Tanimoto index based on a collection of topological indices embedded in the entries of a vector. We statistically compare the proposed method with the method for calculating the Jaccard/Tanimoto indices based on five different molecular fingerprints on alkane and cycloalkane isomers. Furthermore, to explore how the method works on non-molecular graphs, we statistically analyze it on the set of all connected graphs with seven vertices. The Jaccard/Tanimoto index values produced by the proposed method cover the value domain. In addition, the method provides a discrete, clustered similarity distribution, which makes differences clear and comparison convenient. Two outstanding features of the proposed method are its applicability to arbitrary graphs and the polynomial computational complexity of its algorithm in the number of graphs and in the number of vertices and edges of the graphs.
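To make the idea of a Tanimoto index over a vector of topological indices concrete, here is a minimal sketch. It is not the author's implementation: the choice of indices (Wiener and first Zagreb), the continuous form of the Tanimoto coefficient, and all function names are assumptions made for illustration, since the abstract does not specify them.

```python
from collections import deque

def wiener_index(adj):
    """Sum of shortest-path distances over all unordered vertex pairs."""
    total = 0
    for source in range(len(adj)):
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total // 2  # each pair was counted from both endpoints

def first_zagreb_index(adj):
    """Sum of squared vertex degrees."""
    return sum(len(neighbours) ** 2 for neighbours in adj)

def index_vector(adj):
    # Assumed collection of indices; the paper's actual collection may differ.
    return [wiener_index(adj), first_zagreb_index(adj)]

def tanimoto(a, b):
    """Continuous Tanimoto coefficient: a.b / (|a|^2 + |b|^2 - a.b)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sum(x * x for x in a) + sum(y * y for y in b) - dot)

# Hydrogen-suppressed graphs as adjacency lists (connected graphs only).
n_butane = [[1], [0, 2], [1, 3], [2]]      # path on four carbons
isobutane = [[1, 2, 3], [0], [0], [0]]     # star on four carbons

similarity = tanimoto(index_vector(n_butane), index_vector(isobutane))
```

Because both graphs are reduced to short numeric vectors, the comparison itself costs only a few arithmetic operations per pair of graphs; the breadth-first searches dominate the cost of building each vector.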

Citations: 0

MultANOVA Followed by Post Hoc Analyses for Designed High-Dimensional Data: A Comprehensive Framework That Outperforms ASCA, rMANOVA, and VASCA

IF 2.1, Zone 4 (Chemistry)

Journal of Chemometrics, Pub Date: 2025-07-14, DOI: 10.1002/cem.70039

Benjamin Mahieu, Véronique Cariou

Volume 39, Issue 7. Journal Article. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.70039

Abstract: Analytical platforms generate high-dimensional data, where the number of variables usually exceeds the number of observations. Such data are frequently derived from an experimental design, where samples have been collected to identify potential variation in the factors or interactions of interest. To circumvent issues related to large data sizes when evaluating factor and interaction effects, ANOVA simultaneous component analysis (ASCA), regularized multivariate analysis of variance (rMANOVA), and variable selection ASCA (VASCA) have been proposed previously. However, they require computationally intensive methods to test the effects of factors and interactions. In the present paper, multiple ANOVAs (MultANOVA) is proposed as a simple yet effective alternative to the above methods. MultANOVA has the advantage of being direct and fast, as it does not rely on intensive calculation methods, while incorporating a variable selection strategy. This method entails the execution of multiple ANOVAs, one per variable, with multiple test corrections. Subsequent post hoc analyses are also introduced. These encompass multiple least-squares difference tests (MultLSD) for the pairwise comparison of multivariate least-squares means and diagonal canonical discriminant analysis (DCDA) with approximate confidence ellipses to visualize significant effects. MultANOVA is compared to the aforementioned methods based on simulations, which demonstrate that it holds the nominal alpha risk as opposed to rMANOVA and VASCA, while being more powerful than ASCA and VASCA. Even though MultANOVA is proven less powerful than VASCA for variable selection, it has been demonstrated to hold the nominal risk, whereas VASCA does not. Finally, the MultANOVA framework is illustrated based on metagenomics, metabolomics, and spectroscopic data.
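The "one ANOVA per variable, then correct for multiple testing" idea can be sketched in a few lines. This is an illustrative outline, not the authors' code: the one-way design, the Benjamini-Hochberg correction, and the function names are assumptions (the abstract does not say which correction is used), and turning each F statistic into a p-value needs an F-distribution tail function that is deliberately left out here.

```python
def one_way_f(groups):
    """One-way ANOVA F statistic for a single variable.

    `groups` is a list of lists, one list of observations per factor level.
    """
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# One variable measured at three factor levels (made-up numbers).
f_stat = one_way_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])

# Hypothetical per-variable p-values, one per ANOVA, corrected jointly.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

Variables whose adjusted p-value falls below the chosen level would be the ones retained, which is how a per-variable test doubles as a variable selection step.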

Citations: 0

The Classification Limit of Detection: Estimating Sample-Level Classification Uncertainty in Spectroscopy Using Monte Carlo Error Propagation of Spectral Noise

IF 2.1, Zone 4 (Chemistry)

Journal of Chemometrics, Pub Date: 2025-07-12, DOI: 10.1002/cem.70048

Helder V. Carneiro, Caelin P. Celani, Karl S. Booksh

Volume 39, Issue 7. Journal Article.

Abstract: This study presents a novel Monte Carlo–based methodology for estimating classification uncertainty in chemometric models by propagating spectral measurement noise. Unlike traditional approaches that treat classification as deterministic, this method simulates realistic noise structures, both independent and correlated, captured from multiple spectrum measurements to quantify sample-specific uncertainty. The technique is applicable to both linear and non-linear models, including partial least squares discriminant analysis (PLS-DA) and various support vector machine (SVM) kernels. The methodology was validated using three datasets: synthetic 2D simulations for controlled model geometry, X-ray fluorescence (XRF) spectra from colored glass rods, and laser-induced breakdown spectroscopy (LIBS) data from Dalbergia wood species. Results revealed that uncertainty increases with spectral similarity and perpendicular alignment between noise structures and decision boundaries. In real-world applications, classification metrics alone proved insufficient to assess model reliability. The inclusion of uncertainty intervals enabled identification of ambiguous predictions even in cases of perfect classification accuracy. This work advances chemometric analysis by linking measurement uncertainty to classification outcomes, offering a robust framework for decision-making in high-stakes analytical contexts.
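A toy version of Monte Carlo noise propagation through a classifier is sketched below. It is only an illustration under stated assumptions: the classifier is a fixed two-channel linear rule with made-up weights, the noise is independent Gaussian per channel (the paper also treats correlated noise, which would need a covariance model), and `class_probability` is a hypothetical name.

```python
import random

def class_probability(spectrum, noise_sd, weights, bias, n_draws=2000, seed=0):
    """Fraction of noisy replicates that a linear rule assigns to class 1."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = 0
    for _ in range(n_draws):
        # Perturb every channel with independent Gaussian noise (assumption).
        noisy = [x + rng.gauss(0.0, s) for x, s in zip(spectrum, noise_sd)]
        score = sum(w * x for w, x in zip(weights, noisy)) + bias
        hits += score > 0
    return hits / n_draws

weights, bias = [1.0, -1.0], 0.0   # hypothetical decision boundary x1 = x2
noise_sd = [0.1, 0.1]

clear = class_probability([5.0, 1.0], noise_sd, weights, bias)
borderline = class_probability([1.0, 1.0], noise_sd, weights, bias)
```

Both samples may receive the same hard label in a single measurement, yet the first is assigned to class 1 in essentially every replicate while the second flips about half the time. That per-sample spread is the kind of information a plain accuracy figure hides.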

Citations: 0

A Dynamic Iterative Data Cleaning Strategy Based on Model Feedback to Enhance the Prediction Accuracy of Nanocellulose Emulsions

IF 2.1, Zone 4 (Chemistry)

Journal of Chemometrics, Pub Date: 2025-07-12, DOI: 10.1002/cem.70046

Long Wang, Zi'ang Xia, Yao Zhang, Xiaoyu Liu, Chaojie Li, Xue Li, Jiahao Dai, Mingshun Bi, Jingxue Yang, Heng Zhang

Volume 39, Issue 7. Journal Article.

Abstract: The effectiveness of artificial neural networks, which are key technologies in artificial intelligence, greatly depends on the quality of the input data. Data cleaning, a crucial component of data preprocessing, plays a vital role in enhancing the accuracy, robustness, and generalization capabilities of neural network models. In this study, a Feedback-Driven Iterative Cleaning (FDIC) framework, guided by model performance, was developed and applied to the study of droplet size prediction models for nanocellulose-stabilized Pickering emulsion systems. After randomly removing between 1% and 40% of the data, an artificial neural network model was established using CNC particle size (X1), CNC concentration (X2), and the oil–water volume ratio of CNC to oil-phase monomer (X3) as input variables, with emulsion droplet size (Y) as the quantitative index. The model's accuracy was evaluated after data removal using the coefficient of determination (R²), mean squared error (MSE), and mean absolute scaled error (MASE). The main finding was that targeted removal of a small portion of the data significantly improved the predictive power of the model. Specifically, removing 5% of the dataset resulted in optimal performance, with R² improving from 0.5307 without cleaning to 0.7258, with an MSE of 183.4917 and a MASE of 0.4060. This result suggested a significant and quantifiable improvement in the accuracy of the model through the iterative cleaning process. The study revealed a nonlinear relationship between the number of iterations and the model's generalization ability. This finding offers a novel methodological tool for data governance in the smart era and demonstrates significant value in dynamic environments.
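For readers unfamiliar with the reported metrics, the two with unambiguous definitions can be computed as follows. This is a generic sketch with made-up numbers, not the authors' evaluation code; MASE is omitted because its scaling baseline is not stated in the abstract.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared residual."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical droplet sizes and model predictions (illustrative values only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

mse = mean_squared_error(observed, predicted)
r2 = r_squared(observed, predicted)
```

A feedback-driven cleaning loop would recompute such metrics after each candidate removal and keep the removal only if they improve; the exact acceptance rule used in the paper is not given in the abstract.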

Citations: 0

Nondestructive Identification of Paper Based on Relative Formation Time Using Three-Dimensional Fluorescence Spectroscopy Combined With Supervised Learning

IF 2.1, Zone 4 (Chemistry)

Journal of Chemometrics, Pub Date: 2025-07-11, DOI: 10.1002/cem.70043

Xiaohong Chen, Yuhuan He, Lan Cui, Hongda Li, Xiaojing Wu

Volume 39, Issue 7. Journal Article.

Abstract: In order to achieve nondestructive analysis and identification of the relative formation time of paper evidence and to solve the difficulties in document authenticity identification in the field of forensic science, this study selected three-dimensional fluorescence spectroscopy data of paper evidence of the same brand and model collected in the same storage environment within the last decade (2012–2023). After preprocessing steps such as scattering elimination, noise smoothing, and principal component analysis (PCA), machine learning algorithms such as K-nearest neighbor (KNN) and linear discriminant analysis (LDA) were employed to classify and predict specific feature bands. The accuracies of KNN and LDA were 94.5% and 98.9%, respectively. Furthermore, relative formation time prediction was conducted by LDA for paper samples in the sample library, achieving an accuracy rate of 98.0%. Finally, the established model was successfully applied to analyze an actual case involving suspected “forged official documents.” It accurately determined the relative formation time of the forged paper, and the analysis results were consistent with the suspect's confession.

Citations: 0

XAI-2DCOS: Enhancing Interpretability in Spectral Deep Learning Models Through 2D Correlation Spectroscopy

IF 2.1, Zone 4 (Chemistry)

Journal of Chemometrics, Pub Date: 2025-07-11, DOI: 10.1002/cem.70045

Jhonatan Contreras, Thomas Bocklitz

Volume 39, Issue 7. Journal Article. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.70045

Abstract: Deep learning (DL) has significantly advanced Raman spectra analysis, achieving high accuracy and efficiency. However, the complexity and opacity of DL models limit their application in areas where understanding and transparency are essential. To address this, we present XAI-2DCOS, an innovative eXplainable Artificial Intelligence (XAI) framework that employs 2D correlation spectroscopy (2DCOS). Traditionally, 2DCOS reveals the sequence of molecular changes under varying conditions. We repurpose it to enhance the interpretability of DL models by linking changes in spectral features to model outputs, identifying critical wavenumbers and how their variations affect model accuracy. We applied XAI-2DCOS to a DL model trained on a dataset of oil Raman spectra, demonstrating its ability to identify critical spectral features that align with domain knowledge. To improve robustness, we integrated a conditional generative adversarial network (CGAN) for data augmentation. CGAN generates synthetic data, ensuring the presence of spectra across the entire probability range. A normalized relevance score quantifies the contribution of each wavenumber to the model's prediction. A predictive probability map delineates decision boundaries within the PCA space. Synchronous 2DCOS maps are used to guide spectral adjustments that improve prediction confidence for specific class predictions. These adjustments can affect multiple output classes with differential scaling of output activations, suggesting that crossing a threshold can shift the model decision. Our results show that XAI-2DCOS improves the interpretability and reliability of DL models applied to Raman spectra. Furthermore, CGAN data augmentation extends the applicability of XAI-2DCOS to smaller datasets.
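The synchronous map mentioned in the abstract has a standard definition in generalized 2D correlation spectroscopy: mean-center the spectra along the perturbation axis, then take the covariance between every pair of wavenumbers. The sketch below shows only that standard calculation on made-up intensities; how XAI-2DCOS orders the spectra or uses the map to steer a deep learning model is not reproduced here, and the function name is hypothetical.

```python
def synchronous_2dcos(spectra):
    """Synchronous 2D correlation map.

    `spectra` holds m spectra (rows) recorded along a perturbation,
    each with n intensity values (columns). Returns an n x n matrix.
    """
    m, n = len(spectra), len(spectra[0])
    mean = [sum(s[k] for s in spectra) / m for k in range(n)]
    # Dynamic spectra: deviation of each spectrum from the mean spectrum.
    dynamic = [[s[k] - mean[k] for k in range(n)] for s in spectra]
    return [
        [sum(d[i] * d[j] for d in dynamic) / (m - 1) for j in range(n)]
        for i in range(n)
    ]

# Three spectra, three wavenumbers: band 0 grows, band 1 shrinks, band 2 is flat.
spectra = [
    [1.0, 4.0, 2.0],
    [2.0, 3.0, 2.0],
    [3.0, 2.0, 2.0],
]
sync = synchronous_2dcos(spectra)
```

Positive off-diagonal entries mark bands that change in the same direction, negative entries mark bands that change in opposite directions, and a flat band contributes zeros.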

Citations: 0

Editorial: Honoring Prof. Age K. Smilde

IF 2.1, Zone 4 (Chemistry)

Journal of Chemometrics, Pub Date: 2025-07-10, DOI: 10.1002/cem.70052

Rasmus Bro

Volume 39, Issue 7. Journal Article. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.70052

Abstract: It is both a privilege and an emotional moment for me to write this editorial for the special issue of the Journal of Chemometrics honoring Prof. Age K. Smilde, who recently retired. For me, and for countless others in our field, Prof. Smilde (also more informally known as Age) has been more than a scholar; he has been a mentor, a collaborator, and an inspiration whose contributions have left a huge mark on the world of chemometrics.

Looking back, it feels almost surreal to think of my early days in academia, 30 years ago, when I was navigating the complex world of multi-way tensor analysis. At the time, Age seemed to me to be the quintessential ‘all-knowing’ professor. His mastery of the field, combined with a willingness to mentor and nurture young scientists, made a profound difference in my career. I remember a conference where he explained the complexity of tensor rank. I quickly grasped the problem and slightly arrogantly said: I will fix it. I tried. I was very fast and 100% wrong. I never managed to make even the slightest progress!

He played a pivotal role in helping me craft some of my earliest papers, including one of the first approaches to tensor regression. Our discussions on the properties of multi-way arrays and their applications remain etched in my memory, not just as lessons in science, but as moments of shared curiosity.

Age's career is nothing short of extraordinary. From his foundational work at the University of Groningen to his tenure at the University of Amsterdam, where he led the group later known as Biosystems Data Analysis, Age has consistently been at the forefront of methodological advancements in not just chemometrics. His work on multi-way analysis, data integration, and systems biology has truly shaped the respective fields. It is no surprise that he has been honored with numerous awards, such as the prestigious Herman Wold Gold Medal and the Kowalski Award, reflecting his pioneering contributions and global recognition.

What sets Age apart is his ability to foster collaboration and build bridges within the scientific community. He introduced me to some of the most significant researchers not only in chemometrics but also in psychometrics, widening my horizons and opening doors that would otherwise have remained closed. His efforts to create platforms for collaboration, such as co-founding TRICAP and contributing to international chemometric meetings, have enriched our discipline.

Reflecting on the arc of our careers, I cannot help but smile at the realization that the ‘old’ professor who once seemed so far ahead of me is, in fact, only a few years my senior. Time has a way of leveling us, and today I count Age as not only a colleague but also a dear friend and peer. His wisdom, humility, and warmth continue to inspire, and his legacy will undoubtedly endure through the countless students, collaborators, and researchers he has influenced.

This special issue is a testam…

Citations: 0

Accurate and Rational Collision Cross Section Prediction Using Voxel-Projected Area and Deep Learning

IF 2.1, Zone 4 (Chemistry)

Journal of Chemometrics, Pub Date: 2025-07-08, DOI: 10.1002/cem.70040

Jiongyu Wang, Yuxuan Liao, Ting Xie, Ruixi Chen, Jiahui Lai, Zhimin Zhang, Hongmei Lu

Volume 39, Issue 7. Journal Article.

Abstract: Ion mobility spectrometry–mass spectrometry (IMS-MS) enables rapid acquisition of collision cross section (CCS), a critical physicochemical property for analyte characterization. Despite CCS being theoretically defined as the rotationally averaged projected area of 3D atomic spheres, existing models have underutilized this geometric insight. Here, we present a projected area–based CCS prediction method (PACCS). It integrates voxel-projected area approximation, graph neural network (GNN)–extracted features, and m/z to achieve accurate and rational CCS prediction. A voxel-based algorithm efficiently calculates molecular projected areas by leveraging Fibonacci grid sampling and discretizing 3D conformers into voxel grids. PACCS demonstrates exceptional performance, achieving a median relative error (MedRE) of 1.03% and a coefficient of determination (R²) of 0.994 on the test set. Comparison on an external test set against AllCCS2, GraphCCS, SigmaCCS, CCSbase, and DeepCCS highlights the superiority of PACCS, with 80.1% of predictions exhibiting < 3% error. Notably, PACCS exhibits broad applicability across diverse molecular types, including environmental contaminants (R² = 0.954–0.979) and structurally complex phycotoxins (R² = 0.961), highlighting the superiority of PACCS in robustness and versatility. Computational efficiency is enhanced via parallelization, enabling large-scale CCS database generation (e.g., 5.9 million entries for ChEMBL within 10 h). Ablation studies confirm the pivotal role of voxel-projected areas (Pearson correlation coefficients > 0.988), while stability analyses reveal minimal sensitivity to conformational variability (standard deviation of R² is 0.00003). PACCS provides an open-source, scalable solution for expanding CCS databases, advancing compound identification in metabolomics and environmental analysis.
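The geometric quantity behind this approach, a rotationally averaged projected area of atomic spheres, can be approximated with a small sketch. This is not the PACCS algorithm: it rasterizes each projection onto a 2D pixel grid instead of voxelizing the conformer in 3D, the radii and coordinates are made up, and every function name is hypothetical. Only the use of a Fibonacci lattice for viewing directions follows the abstract.

```python
import math

def fibonacci_sphere(n):
    """Return n roughly evenly spaced unit vectors (Fibonacci lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        ring = math.sqrt(1.0 - z * z)
        theta = golden_angle * i
        points.append((ring * math.cos(theta), ring * math.sin(theta), z))
    return points

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def projected_area(atoms, direction, step=0.05):
    """Shadow area of spheres (x, y, z, radius) viewed along a unit vector."""
    # Build an orthonormal basis (u, v) of the plane perpendicular to `direction`.
    helper = (1.0, 0.0, 0.0) if abs(direction[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = cross(direction, helper)
    norm = math.sqrt(sum(c * c for c in u))
    u = tuple(c / norm for c in u)
    v = cross(direction, u)
    covered = set()  # pixels whose centre lies inside at least one disc
    for x, y, z, radius in atoms:
        cu = x * u[0] + y * u[1] + z * u[2]
        cv = x * v[0] + y * v[1] + z * v[2]
        for i in range(math.floor((cu - radius) / step), math.ceil((cu + radius) / step) + 1):
            for j in range(math.floor((cv - radius) / step), math.ceil((cv + radius) / step) + 1):
                pu, pv = (i + 0.5) * step, (j + 0.5) * step
                if (pu - cu) ** 2 + (pv - cv) ** 2 <= radius ** 2:
                    covered.add((i, j))
    return len(covered) * step * step

def rotationally_averaged_area(atoms, n_directions=50):
    directions = fibonacci_sphere(n_directions)
    return sum(projected_area(atoms, d) for d in directions) / n_directions

single = rotationally_averaged_area([(0.0, 0.0, 0.0, 1.0)])
dimer = rotationally_averaged_area([(0.0, 0.0, -1.5, 1.0), (0.0, 0.0, 1.5, 1.0)])
```

A single unit sphere should give an area close to π from every direction, which makes it a convenient sanity check. For the two-sphere case the average falls between one and two disc areas, because the shadows overlap only for viewing directions near the bond axis.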

Citations: 0

Frequency-Domain Alignment of Heterogeneous, Multidimensional Separations Data Through Complex Orthogonal Procrustes Analysis

IF 2.1, Zone 4 (Chemistry)

Journal of Chemometrics, Pub Date: 2025-07-07, DOI: 10.1002/cem.70042

Michael Sorochan Armstrong

Volume 39, Issue 7. Journal Article. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.70042

Abstract: Multidimensional separations data have the capacity to reveal detailed information about complex biological samples. However, data analysis has been an ongoing challenge in the area because the peaks that represent chemical factors may drift over the course of several analytical runs along the first- and second-dimension retention times. This makes higher level analyses of the data difficult, because a 1–1 comparison of samples is seldom possible without sophisticated preprocessing routines. This work offers a very simple solution to the alignment problem through an orthogonal Procrustes analysis of the frequency-domain representation of the data, in which, for each coefficient, relative drift and amplitude are represented as a complex number. Its performance on synthetically generated data presenting nonlinear retention distortions is evaluated, in addition to its applicability to quantitative problems using experimental calibration and untargeted metabolomics data. This analysis is extremely simple and can be recreated using just a few lines of code, relying only on fast algorithms for matrix multiplication and Fourier transforms.
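Why retention drift becomes a complex number in the frequency domain follows from the Fourier shift theorem: a circular shift of a signal leaves every coefficient's magnitude unchanged and only rotates its phase. The sketch below verifies this on a toy one-dimensional peak with a naive discrete Fourier transform. It illustrates the underlying property only; the Procrustes rotation itself, which the paper estimates across samples, is not implemented here, and the data are made up.

```python
import cmath
import math

def dft(signal):
    """Naive O(n^2) discrete Fourier transform (fine for a toy example)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

peak = [0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 0.0, 0.0]
shift = 2                                   # drift of two sampling points
drifted = peak[-shift:] + peak[:-shift]     # same peak, eluting later

reference_coeffs = dft(peak)
drifted_coeffs = dft(drifted)

# Shift theorem: coefficient k is multiplied by exp(-2*pi*i*k*shift/n).
n = len(peak)
predicted = [
    c * cmath.exp(-2j * math.pi * k * shift / n)
    for k, c in enumerate(reference_coeffs)
]
```

Since a shift is a pure phase rotation per coefficient, aligning two runs amounts to finding the rotation that maps one set of complex coefficients onto the other, which is the kind of problem an orthogonal (unitary) Procrustes fit solves.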

Citations: 0

Self-Optimizing Radial Basis Function Support Vector Classifier (SO-RBFSVC)

IF 2.1, Zone 4 (Chemistry)

Journal of Chemometrics, Pub Date: 2025-05-26, DOI: 10.1002/cem.70038

Qudus Ayodeji Thanni, Peter de Boves Harrington

Volume 39, Issue 6. Journal Article. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.70038

Abstract: Support vector classifiers (SVCs) typically use radial basis function (RBF) kernels to map data into higher dimensional spaces that may improve the linear separation of otherwise nonseparable classes. We present a novel self-optimizing radial basis function support vector classifier (SO-RBFSVC) that integrates response surface methodology (RSM), two-dimensional cubic spline interpolation, and bootstrapped Latin partitions (BLPs) for automated hyperparameter tuning. The SO-RBFSVC simultaneously optimizes the RBF kernel width (σ) and cost parameter (C) using an interpolated response surface obtained from generalized prediction accuracies. The SO-RBFSVC was compared to other self-optimizing classifiers (super SVC [sSVC] and super partial least squares discriminant analysis [sPLS-DA]). Four datasets were evaluated: (i) hemp and marijuana discrimination using proton nuclear magnetic resonance spectra, (ii) barley growth location using near-infrared spectra, (iii) glass-type identification based on elemental composition, and (iv) wine cultivar classification from physicochemical properties. External validation results showed that SO-RBFSVC performed comparably to the other models, achieving error rates of 0.4 ± 0.5% for hemp/marijuana, 7 ± 1% for glass, and 6 ± 1% for wine, while outperforming the linear models with 10 ± 1% error for the barley NIR data. For the first time, generalized sensitivity analysis (GSA) was applied to quantify model linearity. GSA revealed high nonlinearity in the barley dataset, justifying a nonlinear model. The SO-RBFSVC provides robust, automated classifier tuning for low- and high-dimensional datasets, offering ease of use.
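To show what the kernel width σ controls, the sketch below evaluates an RBF kernel for a narrow and a wide setting. It covers only the kernel itself under an assumed parameterization, exp(-||x - y||² / (2σ²)); the response-surface search over σ and C, the spline interpolation, and the bootstrapped Latin partitions described in the abstract are not reproduced, and the sample points are made up.

```python
import math

def rbf_kernel(x, y, sigma):
    """RBF similarity between two points; equals 1 when x == y."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def gram_matrix(samples, sigma):
    """Pairwise kernel values, the matrix an SVC would be trained on."""
    return [[rbf_kernel(a, b, sigma) for b in samples] for a in samples]

samples = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
narrow = gram_matrix(samples, sigma=0.5)   # neighbours look dissimilar
wide = gram_matrix(samples, sigma=5.0)     # everything looks similar
```

With a narrow kernel the Gram matrix approaches the identity and the classifier can fit very local structure; with a wide kernel all entries approach 1 and the model behaves almost linearly. Tuning σ together with C is therefore a search over how flexible the decision boundary is allowed to be.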

Citations: 0