Xiaoyu Liu, Xiaokang Liu, Jiawei Wang, Daidi Zang, Yang Yang, Qinhua Chen, De-an Guo
Title: Machine learning and chemometric methods for high-throughput authentication of 53 Root and Rhizome Chinese Herbal using ATR-FTIR fingerprints
DOI: 10.1016/j.jchromb.2025.124630
Journal: Journal of Chromatography B, volume 1260, Article 124630
Publication date: 2025-05-03
Publication type: Journal Article
Journal impact factor: 2.8 (JCR Q2, Biochemical Research Methods)
URL: https://www.sciencedirect.com/science/article/pii/S1570023225001849
Citations: 0
Abstract
To address the identification challenges caused by morphological similarities among root and rhizome Chinese herbal medicines (RRCH), this study developed a discrimination system that integrates attenuated total reflectance Fourier transform infrared spectroscopy (ATR-FTIR) with multimodal machine learning. Fifty-three kinds of RRCH collected from China were analyzed by ATR-FTIR to acquire spectral fingerprints. An analytical framework was established that combines chemometric partial least squares discriminant analysis (PLS-DA) with optimized machine learning models: t-distributed stochastic neighbor embedding (t-SNE), optimized decision trees, optimized discriminant analysis, naive Bayes, optimized support vector machines (SVM), optimized k-nearest neighbors (KNN), SVM kernels, and optimized ensemble learning. Multivariate analysis revealed distinct spatial distribution patterns of chemical characteristics among the 53 RRCH species. The t-SNE projections showed clear cluster separation in two-dimensional feature space, confirming strong correlations between spectral fingerprints and phytochemical compositions. The SVM model outperformed the others, achieving 100% classification accuracy on both the training and validation sets, with a markedly shorter identification time than PLS-DA. This hybrid ATR-FTIR and machine learning system enables high-throughput authentication of RRCH and establishes a scalable technical framework for herbal quality standardization. The methodology also provides insight into chemical marker discovery by mapping relationships between vibrational spectra and features, advancing the intelligent discrimination of botanically similar medicinal materials.
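The fingerprint classification idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline and uses none of their data: it generates synthetic spectra, trains a plain k-nearest-neighbor classifier (one of the model families the abstract lists), and reports accuracy on a held-out validation set, using only the Python standard library. The class names, band positions, noise level, and `k` are hypothetical values chosen for illustration.

```python
import math
import random
from collections import Counter


def synthetic_spectrum(band_centers, rng, n_points=120, noise=0.02):
    """Build a toy absorbance spectrum: a sum of Gaussian bands plus noise."""
    spectrum = []
    for i in range(n_points):
        value = sum(math.exp(-((i - c) ** 2) / (2 * 4.0 ** 2)) for c in band_centers)
        spectrum.append(value + rng.gauss(0.0, noise))
    return spectrum


def make_dataset(class_bands, per_class, rng):
    """Return (spectrum, label) pairs with `per_class` spectra for each class."""
    return [
        (synthetic_spectrum(bands, rng), label)
        for label, bands in class_bands.items()
        for _ in range(per_class)
    ]


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training spectra."""
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test spectra whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


rng = random.Random(0)
# Hypothetical classes with made-up band positions (index units, not cm^-1).
class_bands = {
    "herb_A": [20, 60, 95],
    "herb_B": [25, 70, 95],
    "herb_C": [20, 50, 110],
}
train = make_dataset(class_bands, per_class=8, rng=rng)
test = make_dataset(class_bands, per_class=4, rng=rng)
print(f"validation accuracy: {accuracy(train, test):.2f}")
```

Because the toy classes are deliberately well separated relative to the noise, every validation spectrum is assigned to its own class here. That outcome reflects the synthetic setup only and says nothing about performance on real ATR-FTIR data, where preprocessing and model tuning would matter.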
About the journal:
The Journal of Chromatography B publishes papers on developments in separation science relevant to biology and biomedical research including both fundamental advances and applications. Analytical techniques which may be considered include the various facets of chromatography, electrophoresis and related methods, affinity and immunoaffinity-based methodologies, hyphenated and other multi-dimensional techniques, and microanalytical approaches. The journal also considers articles reporting developments in sample preparation, detection techniques including mass spectrometry, and data handling and analysis.
Developments related to preparative separations for the isolation and purification of components of biological systems may be published, including chromatographic and electrophoretic methods, affinity separations, field flow fractionation and other preparative approaches.
Applications to the analysis of biological systems and samples will be considered when the analytical science contains a significant element of novelty, e.g. a new approach to the separation of a compound, novel combination of analytical techniques, or significantly improved analytical performance.