Quality Outlier Detection for Tobacco Based on Robust Sparse PCA: Advantages and Limitations

J. Huo, Changtong Lu, Yongfeng Yang, Hong-Mei Guo, Chenggang Li, Qian Li, Xuebin Zhao, Huaiqi Li
Published in: 2022 IEEE 13th International Conference on Software Engineering and Service Science (ICSESS)
Publication date: 2022-10-21
DOI: 10.1109/ICSESS54813.2022.9930311 (https://doi.org/10.1109/ICSESS54813.2022.9930311)
Citations: 0

Abstract

Quality control is important for the tobacco industry, and tobacco leaf is the raw material for cigarette products. For a given brand's products, without known standard samples to serve as a center, it is difficult to detect outliers from unknown groups with classical PCA. Although classical PCA has been widely used in near-infrared spectroscopy (NIRS) for tobacco, its accuracy cannot satisfy the industrial requirements for correctly classifying products and identifying outliers. Therefore, robust sparse PCA (RSPCA) is used here to process tobacco leaf NIR data. RSPCA has advantages over both robust PCA (RPCA) and classical PCA (CPCA): it can suppress the effect of outliers through sparse loadings, and it provides a robust dimension projection. RSPCA thus yields higher accuracy for tobacco leaf source classification and outlier detection than classical PCA. With Eigenvalue Decomposition Discriminant Analysis (EDDA), a supervised classification method based on Gaussian components, tobacco leaf sources of different quality levels are well classified according to the robust score distance (SD) and orthogonal distance (OD) of RSPCA. Furthermore, classification based on principal components (PCs) and classification based on SD-OD are compared across the three types of PCA, which shows that the RSPCA SD-OD based classification performs best.
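For readers unfamiliar with the two distances named in the abstract, the following is a minimal sketch of how a score distance (SD) and an orthogonal distance (OD) are commonly computed from a PCA model. It is not the authors' RSPCA procedure: it uses classical, non-robust PCA via NumPy's SVD, and the function name `sd_od`, the synthetic data, and the choice of two components are assumptions made purely for illustration. A robust sparse variant would replace the mean, loadings, and eigenvalues below with robust, sparse estimates.

```python
import numpy as np

def sd_od(X, k):
    """Score distance and orthogonal distance of each row of X
    with respect to a classical (non-robust) PCA with k components."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)                      # classical center (not robust)
    Xc = X - mu
    # Rows of Vt are the principal directions; s holds the singular values
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:k].T                             # loadings, shape (p, k)
    eigvals = s[:k] ** 2 / (X.shape[0] - 1)  # variances along the k components
    T = Xc @ P                               # scores, shape (n, k)
    # SD: Mahalanobis-type distance measured inside the PCA subspace
    sd = np.sqrt(np.sum(T ** 2 / eigvals, axis=1))
    # OD: Euclidean distance from each sample to the PCA subspace
    od = np.linalg.norm(Xc - T @ P.T, axis=1)
    return sd, od

# Synthetic stand-in for spectra: 100 samples lying close to a 2-D plane in 3-D
rng = np.random.default_rng(0)
X = np.column_stack([
    3.0 * rng.normal(size=100),
    1.5 * rng.normal(size=100),
    0.05 * rng.normal(size=100),
])
X[0] = [0.0, 0.0, 5.0]                       # one sample placed far off the plane

sd, od = sd_od(X, k=2)
print("sample with largest OD:", int(od.argmax()))
```

In this picture, a sample with a large OD lies far from the fitted subspace, while a sample with a large SD lies inside the subspace but far from its center. Plotting SD against OD gives the kind of two-dimensional feature space that the abstract describes feeding to a classifier.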