Analysis of sparse PCA using high dimensional data
F. R. On, R. Jailani, S. Hassan, N. Tahir
2016 IEEE 12th International Colloquium on Signal Processing & Its Applications (CSPA), 2016-03-04
DOI: https://doi.org/10.1109/CSPA.2016.7515857
Citation count: 6
Abstract
In this study, Sparse Principal Component Analysis (Sparse PCA) is used for feature extraction and compared with conventional PCA on six high-dimensional datasets from the UCI Machine Learning Repository. The results show that both PCA and Sparse PCA are suitable feature extraction techniques for high-dimensional data, since the classification accuracy obtained with the extracted features is higher than that obtained with the original data as classifier input. However, the number of principal components (PCs) to retain could not be determined consistently, which is a drawback of PCA despite its higher accuracy. Sparse PCA, in contrast, retains the original number of PCs, with sparse loadings that are mostly zero, but it does not produce promising results on all of the datasets. Sparse PCA therefore needs to be applied to a suitable high-dimensional dataset to achieve its full accuracy and efficiency.
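
The contrast between dense PCA loadings and sparse loadings that are mostly zero can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say which Sparse PCA algorithm or datasets were used, so the sketch assumes a simple soft-thresholded power iteration for the first loading vector and a hypothetical six-feature toy dataset. The names `make_data`, `covariance`, `first_loading`, `nonzeros`, and the threshold parameter `lam` are all illustrative assumptions.

```python
import math
import random


def make_data(n=200, seed=0):
    """Hypothetical toy data: features 0 and 1 share one latent signal,
    features 2..5 are weak independent noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        row = [z + 0.1 * rng.gauss(0.0, 1.0), z + 0.1 * rng.gauss(0.0, 1.0)]
        row += [0.3 * rng.gauss(0.0, 1.0) for _ in range(4)]
        rows.append(row)
    return rows


def covariance(X):
    """Sample covariance matrix of the rows of X (features are centered first)."""
    n, d = len(X), len(X[0])
    means = [sum(r[j] for r in X) / n for j in range(d)]
    Xc = [[r[j] - means[j] for j in range(d)] for r in X]
    return [[sum(r[i] * r[j] for r in Xc) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_loading(C, lam=0.0, iters=200):
    """Leading unit-norm loading vector by power iteration on C.

    lam = 0 gives the ordinary first PC. For 0 < lam < 1 (an assumed,
    illustrative sparsity knob), each iterate is soft-thresholded at
    lam * max|w|, which sets small loadings to exactly zero.
    """
    d = len(C)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
        t = lam * max(abs(x) for x in w)
        w = [math.copysign(max(abs(x) - t, 0.0), x) for x in w]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def nonzeros(v):
    """Number of loadings that are not exactly zero."""
    return sum(1 for x in v if x != 0.0)


C = covariance(make_data())
dense = first_loading(C)             # conventional PCA: every feature gets a loading
sparse = first_loading(C, lam=0.2)   # thresholded variant: weak loadings become zero
print("PCA loading:   ", [round(x, 3) for x in dense])
print("sparse loading:", [round(x, 3) for x in sparse])
print("nonzero loadings:", nonzeros(dense), "vs", nonzeros(sparse))
```

In this toy setting the sparse loading keeps only the two correlated features, which is the behaviour the abstract describes as "loadings that are mostly zero". Projecting centered data onto either loading vector would give the extracted feature passed to a classifier; whether the sparse version helps accuracy depends on the dataset, as the abstract notes.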