Reduced dimensionality, information rich visual representations for scene classification

Kaveri A. Thakoor
Published in: 2015 IEEE Signal Processing and Signal Processing Education Workshop (SP/SPE), pp. 43-48
Publication date: 2015-08-01
DOI: 10.1109/DSP-SPE.2015.7369525 (https://doi.org/10.1109/DSP-SPE.2015.7369525)
Citations: 1

Abstract

We present a reduced dimensionality, information rich (RDIR) visual representation for scene information that distills the most distinguishing elements in an image, enabling scene classification by humans and computers under reduced dimensionality conditions. The representation uses the Gist model [1] to convey scene information in low-bandwidth conditions, and it yields better classification performance for humans and computers than the downsampling method currently used by the Retinal Prosthesis System [2], which restores partial vision for people without sight. We show that images of as few as 6 pixels at 3 bits are sufficient for humans to successfully classify 4 classes within the Natural Scene Dataset [3]. Human and computer classification accuracy on RDIR scenes is consistently higher than on downsampled (DS), that is, spatially averaged, scenes. While DS scenes may seem more intuitive to interpret because they preserve spatial layout, we show that dimensionality reduction via Principal Components Analysis (PCA) following Gist processing enables distinguishability for 6-dimensional RDIR images. We conduct a short trade-off study of human learning vs. SVM classification and conclude by applying the RDIR technique to the classification of 6 locations on the University of Southern California (USC) campus.
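The contrast the abstract draws between the DS baseline and the PCA-based reduction can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the author's implementation: the Gist descriptors are replaced by random stand-in vectors (the descriptor length of 512 is hypothetical), the 6 pixels are arranged as a 2 x 3 grid, and quantization uses a single global min/max range. None of these details come from the abstract.

```python
import numpy as np

def quantize(x, bits=3):
    """Map values to integer levels 0 .. 2**bits - 1 using the array's own min/max."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        return np.zeros(x.shape, dtype=int)
    levels = 2 ** bits - 1
    return np.rint((x - lo) / (hi - lo) * levels).astype(int)

def downsample(image, rows=2, cols=3, bits=3):
    """DS baseline: average the image over a rows x cols grid, then quantize."""
    h, w = image.shape
    image = image[: h - h % rows, : w - w % cols]  # crop so blocks divide evenly
    blocks = image.reshape(rows, image.shape[0] // rows, cols, image.shape[1] // cols)
    return quantize(blocks.mean(axis=(1, 3)), bits)

def pca_project(features, n_components=6):
    """Project feature vectors onto their top principal components via SVD."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
image = rng.random((64, 96))      # stand-in grayscale scene (assumption)
features = rng.random((40, 512))  # stand-in for Gist descriptors (hypothetical size)

ds = downsample(image)                          # 2 x 3 grid = 6 pixels, 3 bits each
rdir = quantize(pca_project(features), bits=3)  # 40 scenes x 6 values, 3 bits each
print(ds.shape, rdir.shape)
```

Both outputs carry the same budget of 6 values at 3 bits per scene. The difference is what the values encode: the DS image keeps spatial layout, while the PCA projection keeps the directions of greatest variance in the feature space. A classifier such as an SVM could then be trained on either representation for comparison.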