Integrated Metabolomics-KPCA-Machine Learning framework: a solution for geographical traceability of Chinese Jujube
Xiaoli Wang, Xiaolei Ma, Yuxin Liu, Wenhan Tao, Yuting Zuo, Yueqin Zhu, Feng Hua, Chanming Liu, Wei Huang
Food Chemistry: X, vol. 31, Article 103069 (published 2025-09-23; journal article). Impact factor 8.2; JCR Q1 (Chemistry, Applied).
DOI: 10.1016/j.fochx.2025.103069
https://www.sciencedirect.com/science/article/pii/S2590157525009162
Abstract
Widespread product adulteration makes it difficult to trace the geographical origin of Chinese jujube (CJ), a crop of global economic importance with nutritional and medicinal properties. This study introduces a Metabolomics-Kernel Principal Component Analysis (KPCA)-Machine Learning (ML) framework for identifying the origin of CJ from six production regions in China (Xinjiang, Gansu, Shaanxi, Henan, Shandong, and Hebei). Untargeted metabolomics by LC-MS/MS identified 312 metabolites, and multivariate analysis revealed 37 key discriminant variables (VIP > 1). KPCA compressed these features into 28 principal components, retaining 90.59% of the information. Compared with the traditional method, K-means clustering after KPCA dimensionality reduction greatly improved sample differentiation: with the original data, samples from different origins overlapped and had fuzzy boundaries, whereas after dimensionality reduction the samples from the six origins formed clear, compact clusters, allowing accurate classification. This study pioneers a “Metabolomics-KPCA-ML” paradigm, offering a solution for the traceability of geographical indication agricultural products.
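The dimensionality-reduction and clustering steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins for a samples-by-37 matrix of VIP-selected metabolites, the RBF kernel and scikit-learn are assumed choices (the abstract does not name a kernel or a library), and the helper names `make_synthetic` and `kpca_kmeans` are hypothetical. The "retained information" is approximated here as the share of kernel-space variance captured by the leading components.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import KernelPCA
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler


def make_synthetic(n_regions=6, n_per_region=20, n_features=37, seed=0):
    """Synthetic stand-in data: one Gaussian blob per production region."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(scale=3.0, size=(n_regions, n_features))
    X = np.vstack(
        [c + rng.normal(size=(n_per_region, n_features)) for c in centers]
    )
    y = np.repeat(np.arange(n_regions), n_per_region)
    return X, y


def kpca_kmeans(X, n_components=28, n_clusters=6, seed=0):
    """Standardize, project with kernel PCA, then cluster with K-means."""
    X_std = StandardScaler().fit_transform(X)

    # Fit KPCA with all non-zero components so the variance share of the
    # leading ones can be measured. The RBF kernel is an assumption.
    Z_full = KernelPCA(kernel="rbf").fit_transform(X_std)

    # Components come sorted by decreasing eigenvalue; the variance of each
    # projected column is proportional to its eigenvalue.
    var = Z_full.var(axis=0)
    retained = float(np.cumsum(var)[n_components - 1] / var.sum())

    Z = Z_full[:, :n_components]
    labels = KMeans(
        n_clusters=n_clusters, n_init=10, random_state=seed
    ).fit_predict(Z)
    return Z, labels, retained


if __name__ == "__main__":
    X, y = make_synthetic()
    Z, labels, retained = kpca_kmeans(X)
    print(f"components kept: {Z.shape[1]}, variance share: {retained:.2%}")
    # Agreement between clusters and the known synthetic region labels.
    print(f"adjusted Rand index: {adjusted_rand_score(y, labels):.3f}")
```

On real data, the input matrix would hold the measured intensities of the 37 discriminant metabolites, and the kernel and its parameters would need tuning; the numbers printed for the synthetic data say nothing about the results reported in the paper.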
About the journal:
Food Chemistry: X, one of three Open Access companion journals to Food Chemistry, follows the same aims, scope, and peer-review process. It focuses on papers advancing food and biochemistry or analytical methods, prioritizing research novelty. Manuscript evaluation considers novelty, scientific rigor, field advancement, and reader interest. Excluded are studies on food molecular sciences or disease cure/prevention. Topics include food component chemistry, bioactives, processing effects, additives, contaminants, and analytical methods. The journal welcomes Analytical Papers addressing food microbiology, sensory aspects, and more, emphasizing new methods with robust validation and applicability to diverse foods or regions.