Threshold Feature Selection PCA
Authors: Felipe de Melo Battisti, T. B. A. de Carvalho
Published in: Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022), 2022-11-28
DOI: 10.5753/kdmile.2022.227718 (https://doi.org/10.5753/kdmile.2022.227718)
Citations: 0
Abstract
Classification algorithms struggle to learn when the data contains non-discriminant features. Dimensionality reduction techniques such as PCA are commonly applied to address this. However, PCA is unsupervised and therefore ignores the class information in the data. This paper proposes the Threshold Feature Selector (TFS), a new supervised dimensionality reduction method that uses class thresholds to select the more relevant features. We also present Threshold PCA (TPCA), which combines our supervised technique with standard PCA. In our experiments, TFS achieved higher accuracy than the original data in 90% of the datasets. The second proposed technique, TPCA, outperformed standard PCA in accuracy gain in 70% of the datasets.
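The abstract does not say how TFS computes or applies its class thresholds, so the following is only a rough illustration of the general idea of threshold-based supervised feature selection, not the authors' algorithm. It assumes binary labels (0/1), scores each feature by the best accuracy a single cut-off value can reach on that feature alone, and keeps the features whose score meets a cut-off. The names `stump_accuracy`, `select_features`, and `min_accuracy` are hypothetical.

```python
from typing import List, Sequence


def stump_accuracy(values: Sequence[float], labels: Sequence[int]) -> float:
    """Best accuracy of a one-threshold rule on a single feature.

    Assumption (not from the paper): labels are binary, 0 or 1.
    """
    n = len(values)
    distinct = sorted(set(values))
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    # Baseline: always predict the majority class.
    positives = sum(labels)
    best = max(positives, n - positives) / n
    for t in candidates:
        # Rule: predict class 1 when the value is above the threshold.
        correct = sum((v > t) == bool(y) for v, y in zip(values, labels))
        # The flipped rule (class 1 at or below the threshold) gets n - correct.
        best = max(best, correct / n, (n - correct) / n)
    return best


def select_features(
    X: Sequence[Sequence[float]], y: Sequence[int], min_accuracy: float = 0.8
) -> List[int]:
    """Return indices of features whose single-threshold accuracy is high enough."""
    n_features = len(X[0])
    scores = [stump_accuracy([row[j] for row in X], y) for j in range(n_features)]
    return [j for j, s in enumerate(scores) if s >= min_accuracy]


# Toy data: feature 0 separates the classes cleanly, feature 1 does not.
X = [[1.0, 5.0], [2.0, 1.0], [8.0, 4.0], [9.0, 2.0]]
y = [0, 0, 1, 1]
selected = select_features(X, y)  # [0] on this toy data
```

One possible way to combine such a selector with PCA, again as an assumption rather than the paper's stated procedure, would be to run standard PCA only on the selected columns.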