High dimensional exploration: A comparison of PCA, distance concentration, and classification performance in two fMRI datasets
J. Etzel, T. Braver
2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM), 2014-12-01
DOI: 10.1109/CIDM.2014.7008662 (https://doi.org/10.1109/CIDM.2014.7008662)
Citations: 0
Abstract
fMRI (functional magnetic resonance imaging) studies frequently create high dimensional datasets, with far more features (voxels) than examples. It is known that such datasets frequently have properties that make analysis challenging, such as concentration of distances. Here, we calculated the probability of distance concentration and proportion of variance explained by PCA in two fMRI datasets, comparing these measures with each other, as well as with the number of voxels and classification accuracy. There are clear differences between the datasets, with one showing levels of distance concentration comparable to those reported in microarray data [1, 2]. While it remains to be determined how typical these results are, they suggest that problematic levels of distance concentration in fMRI datasets may not be a rare occurrence.
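As a rough illustration of the two kinds of measures compared here, the sketch below computes the proportion of variance explained by each principal component and a simple distance-concentration proxy on synthetic data. It is an assumption-laden sketch, not the authors' procedure: the abstract does not specify how the probability of distance concentration was estimated, so the relative variance of pairwise Euclidean distances is swapped in as a stand-in, and the function names, the Gaussian toy data, and the sizes (40 examples, 10 to 10000 features) are all hypothetical.

```python
import numpy as np


def pca_variance_explained(X):
    """Proportion of total variance carried by each principal component.

    X has shape (n_examples, n_features); rows are examples, columns voxels.
    """
    Xc = X - X.mean(axis=0)                    # center each feature
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, descending
    var = s ** 2                               # proportional to PC variances
    return var / var.sum()


def relative_variance_of_distances(X):
    """Var(d) / mean(d)^2 over all pairwise Euclidean distances.

    Stand-in proxy (an assumption, not the paper's measure): values near
    zero mean the distances are nearly indistinguishable, i.e. concentrated.
    """
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)   # squared distances
    upper = np.triu_indices(X.shape[0], k=1)           # each pair once
    d = np.sqrt(np.clip(d2[upper], 0.0, None))         # guard tiny negatives
    return d.var() / d.mean() ** 2


# Toy data: far more features than examples, as in the fMRI setting.
rng = np.random.default_rng(0)
n_examples = 40
results = {}
for n_features in (10, 100, 1000, 10000):
    X = rng.standard_normal((n_examples, n_features))
    ratios = pca_variance_explained(X)
    results[n_features] = {
        "first_pc": ratios[0],
        "ratios": ratios,
        "rel_var": relative_variance_of_distances(X),
    }
    print(n_features, results[n_features]["first_pc"],
          results[n_features]["rel_var"])
```

On independent Gaussian features the proxy shrinks as the number of features grows, which is the textbook form of distance concentration. Real fMRI voxels are strongly correlated, so how concentrated a given dataset is has to be measured rather than assumed, which is the comparison the abstract describes.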