Reduced dimensionality, information rich visual representations for scene classification

2015 IEEE Signal Processing and Signal Processing Education Workshop (SP/SPE)
Pub Date: 2015-08-01
DOI: 10.1109/DSP-SPE.2015.7369525

Kaveri A. Thakoor

Pages: 43-48

Citations: 1

Abstract


We present a reduced dimensionality, information rich (RDIR) visual representation for scene information that distills the most distinguishing elements in an image, enabling scene classification by humans and computers under reduced dimensionality conditions. The representation utilizes the Gist model [1] to convey scene information in low bandwidth conditions, exhibiting enhanced classification performance for humans and computers compared to the current downsampling method used by the Retinal Prosthesis System [2], which restores partial vision for people without sight. We show that as few as 6-pixel, 3-bit images are sufficient for successful classification by humans of 4 classes within the Natural Scene Dataset [3]. Human and computer classification accuracy on RDIR scenes is consistently higher than that on downsampled (DS) (spatially averaged) scenes. While DS scenes may seem more intuitive to interpret since spatial layout is preserved in them, we show that the dimensionality reduction via Principal Components Analysis (PCA) following Gist processing enables distinguishability for 6-dimensional RDIR images. We conduct a short trade-off study for human learning vs. SVM classification and conclude with application of the RDIR technique to classification of 6 locations on the University of Southern California (USC) campus.
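To make the bandwidth budget in the abstract concrete, the following is a minimal sketch of the downsampled (DS) baseline only: spatial averaging down to 6 pixels followed by 3-bit quantization. It is not the RDIR method itself, which per the abstract applies Gist processing followed by PCA to obtain its 6 dimensions. The 2 x 3 pixel layout, the 0-255 grayscale range, and the function names `downsample` and `quantize` are assumptions made for illustration and do not come from the paper.

```python
def downsample(image, out_rows=2, out_cols=3):
    """Spatially average a grayscale image (a list of rows) into an
    out_rows x out_cols grid. The 2 x 3 default is an assumed layout
    for a 6-pixel image."""
    rows, cols = len(image), len(image[0])
    if rows % out_rows or cols % out_cols:
        raise ValueError("image size must be divisible by the output grid")
    block_h, block_w = rows // out_rows, cols // out_cols
    out = []
    for r in range(out_rows):
        out_row = []
        for c in range(out_cols):
            # Collect every pixel that falls inside this block.
            block = [image[y][x]
                     for y in range(r * block_h, (r + 1) * block_h)
                     for x in range(c * block_w, (c + 1) * block_w)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def quantize(values, bits=3, lo=0.0, hi=255.0):
    """Map each value in [lo, hi] to one of 2**bits integer levels
    (8 levels for 3 bits). The 0-255 input range is an assumption."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    # Clamp so that the maximum input lands in the top level.
    return [[min(levels - 1, int((v - lo) / step)) for v in row]
            for row in values]


if __name__ == "__main__":
    # Hypothetical 4 x 6 grayscale image used only to exercise the code.
    image = [[0, 0, 100, 100, 255, 255],
             [0, 0, 100, 100, 255, 255],
             [64, 64, 128, 128, 32, 32],
             [64, 64, 128, 128, 32, 32]]
    print(quantize(downsample(image)))  # [[0, 3, 7], [2, 4, 1]]
```

Under the same assumptions, an RDIR-style pipeline would replace the block averaging with a Gist descriptor projected onto its first 6 principal components, and then quantize those 6 values in the same way.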

