Comparison between Supervised and Unsupervised Feature Selection Methods

L. Haar, K. Anding, K. Trambitckii, G. Notni
{"title":"有监督与无监督特征选择方法的比较","authors":"L. Haar, K. Anding, K. Trambitckii, G. Notni","doi":"10.5220/0007385305820589","DOIUrl":null,"url":null,"abstract":"The reduction of the feature set by selecting relevant features for the classification process is an important step within the image processing chain, but sometimes too little attention is paid to it. Such a reduction has many advantages. It can remove irrelevant and redundant data, improve recognition performance, reduce storage capacity requirements, computational time of calculations and also the complexity of the model. Within this paper supervised and unsupervised feature selection methods are compared with respect to the achievable recognition accuracy. Supervised Methods include information of the given classes in the selection, whereas unsupervised ones can be used for tasks without known class labels. Feature clustering is an unsupervised method. For this type of feature reduction, mainly hierarchical methods, but also k-means are used. Instead of this two clustering methods, the Expectation Maximization (EM) algorithm was used in this paper. The aim is to investigate whether this type of clustering algorithm can provide a proper feature vector using feature clustering. There is no feature reduction technique that provides equally best results for all datasets and classifiers. However, for all datasets, it was possible to reduce the feature set to a specific number of useful features without losses and often even with improvements in recognition performance.","PeriodicalId":410036,"journal":{"name":"International Conference on Pattern Recognition Applications and Methods","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Comparison between Supervised and Unsupervised Feature Selection Methods\",\"authors\":\"L. Haar, K. Anding, K. Trambitckii, G. Notni\",\"doi\":\"10.5220/0007385305820589\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The reduction of the feature set by selecting relevant features for the classification process is an important step within the image processing chain, but sometimes too little attention is paid to it. Such a reduction has many advantages. It can remove irrelevant and redundant data, improve recognition performance, reduce storage capacity requirements, computational time of calculations and also the complexity of the model. Within this paper supervised and unsupervised feature selection methods are compared with respect to the achievable recognition accuracy. Supervised Methods include information of the given classes in the selection, whereas unsupervised ones can be used for tasks without known class labels. Feature clustering is an unsupervised method. For this type of feature reduction, mainly hierarchical methods, but also k-means are used. Instead of this two clustering methods, the Expectation Maximization (EM) algorithm was used in this paper. The aim is to investigate whether this type of clustering algorithm can provide a proper feature vector using feature clustering. There is no feature reduction technique that provides equally best results for all datasets and classifiers. 
However, for all datasets, it was possible to reduce the feature set to a specific number of useful features without losses and often even with improvements in recognition performance.\",\"PeriodicalId\":410036,\"journal\":{\"name\":\"International Conference on Pattern Recognition Applications and Methods\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-02-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Conference on Pattern Recognition Applications and Methods\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5220/0007385305820589\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Pattern Recognition Applications and Methods","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5220/0007385305820589","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2

Abstract

Reducing the feature set by selecting relevant features for the classification process is an important step within the image processing chain, but it sometimes receives too little attention. Such a reduction has many advantages: it can remove irrelevant and redundant data, improve recognition performance, reduce storage requirements and computation time, and lower the complexity of the model. Within this paper, supervised and unsupervised feature selection methods are compared with respect to the achievable recognition accuracy. Supervised methods include information about the given classes in the selection, whereas unsupervised ones can be used for tasks without known class labels. Feature clustering is an unsupervised method. For this type of feature reduction, mainly hierarchical methods, but also k-means, are used. Instead of these two clustering methods, the Expectation Maximization (EM) algorithm was used in this paper. The aim is to investigate whether this type of clustering algorithm can provide a proper feature vector by means of feature clustering. No feature reduction technique provides equally good results for all datasets and classifiers. However, for all datasets it was possible to reduce the feature set to a specific number of useful features without losses, and often even with improvements in recognition performance.
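
The abstract describes feature clustering only at a high level. As a rough illustration, the sketch below clusters the columns of a feature matrix with scikit-learn's GaussianMixture (an EM implementation) and keeps one representative feature per cluster. The dataset (load_wine), the cluster count, and the representative-selection rule are all illustrative assumptions, not the authors' setup.

```python
# Minimal sketch of feature clustering with an EM-fitted Gaussian mixture.
# The paper publishes no code, so the details here are assumptions: the
# scikit-learn GaussianMixture stands in for the EM algorithm, features are
# clustered as points (one point per feature column), and the representative
# of each cluster is the feature closest to the cluster mean.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)       # samples x features (178 x 13)
Z = StandardScaler().fit_transform(X)   # put all features on a common scale

# Cluster the *features*: each row of Z.T describes one feature across samples.
n_clusters = 5                          # hypothetical size of the reduced set
gmm = GaussianMixture(n_components=n_clusters,
                      covariance_type="diag", random_state=0)
labels = gmm.fit_predict(Z.T)

# Keep one representative per cluster: the member nearest its cluster mean.
selected = []
for k in range(n_clusters):
    members = np.flatnonzero(labels == k)
    if members.size == 0:               # EM can leave a component empty
        continue
    dists = np.linalg.norm(Z.T[members] - gmm.means_[k], axis=1)
    selected.append(int(members[np.argmin(dists)]))

print("reduced feature indices:", sorted(selected))
X_reduced = X[:, sorted(selected)]      # feature set shrunk from 13 columns
```

When class labels are available, a supervised filter such as sklearn.feature_selection.mutual_info_classif could rank the same columns instead, which mirrors the supervised/unsupervised comparison the paper makes.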