Reduced dimensionality, information rich visual representations for scene classification
Author: Kaveri A. Thakoor
Venue: 2015 IEEE Signal Processing and Signal Processing Education Workshop (SP/SPE), pp. 43-48
Published: 2015-08-01
DOI: https://doi.org/10.1109/DSP-SPE.2015.7369525
We present a reduced-dimensionality, information-rich (RDIR) visual representation that distills the most distinguishing elements of an image, enabling scene classification by humans and computers under reduced-dimensionality conditions. The representation uses the Gist model [1] to convey scene information in low-bandwidth conditions, and it yields better classification performance for both humans and computers than the downsampling method currently used by the Retinal Prosthesis System [2], which restores partial vision for people without sight. We show that 6-pixel, 3-bit images are sufficient for humans to successfully classify 4 classes within the Natural Scene Dataset [3]. Human and computer classification accuracy on RDIR scenes is consistently higher than on downsampled (DS), i.e. spatially averaged, scenes. Although DS scenes may seem more intuitive to interpret because they preserve spatial layout, we show that dimensionality reduction via Principal Components Analysis (PCA) following Gist processing makes 6-dimensional RDIR images distinguishable. We conduct a short trade-off study of human learning vs. SVM classification and conclude by applying the RDIR technique to the classification of 6 locations on the University of Southern California (USC) campus.
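To make the downsampled (DS) baseline concrete, the following is a minimal sketch of spatial averaging down to 6 pixels followed by 3-bit quantization. It is not the author's implementation: the 2 x 3 output grid, the 0-255 intensity range, the uniform quantizer, and the function names `block_average` and `quantize` are all assumptions made for illustration, since the abstract does not specify them. Under the same assumptions, an RDIR-style image could be produced by passing 6 PCA coefficients of Gist features (suitably rescaled) through the same `quantize` step instead of block averages.

```python
def block_average(image, out_rows, out_cols):
    """Downsample a 2-D grayscale image (list of rows) by averaging equal blocks."""
    in_rows, in_cols = len(image), len(image[0])
    if in_rows % out_rows or in_cols % out_cols:
        raise ValueError("image size must be a multiple of the output grid")
    bh, bw = in_rows // out_rows, in_cols // out_cols  # block height and width
    out = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Collect every pixel that falls inside block (r, c).
            block = [image[y][x]
                     for y in range(r * bh, (r + 1) * bh)
                     for x in range(c * bw, (c + 1) * bw)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def quantize(values, bits=3, lo=0.0, hi=255.0):
    """Map each value in [lo, hi] to one of 2**bits integer levels (assumed uniform)."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    # Clamp so that values at or beyond the range limits stay in 0..levels-1.
    return [min(levels - 1, max(0, int((v - lo) / step))) for v in values]


# Hypothetical 4 x 6 test image: a horizontal intensity ramp.
image = [[40 * x for x in range(6)] for _ in range(4)]
ds = block_average(image, 2, 3)               # 2 x 3 grid = 6 pixels (assumed layout)
pixels = [v for row in ds for v in row]       # flatten to a 6-element vector
codes = quantize(pixels, bits=3)              # each code fits in 3 bits (0..7)
```

The quantizer is deliberately separate from the averaging step so the same 3-bit budget can be applied to any 6-dimensional vector, which is what makes a like-for-like comparison between DS and RDIR images possible.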