Gaussian RAM: Lightweight Image Classification via Stochastic Retina-Inspired Glimpse and Reinforcement Learning

D. Shim, H. Kim
Published in: 2020 20th International Conference on Control, Automation and Systems (ICCAS), pp. 155-160
Publication date: 2020-10-13
DOI: 10.23919/ICCAS50221.2020.9268201
Citations: 3

Abstract

Previous studies on image classification have focused mainly on network performance rather than on real-time operation or model compression. We propose the Gaussian Deep Recurrent visual Attention Model (GDRAM), a lightweight deep neural network based on reinforcement learning for large-scale image classification, which outperforms a conventional CNN (Convolutional Neural Network) that takes the entire image as input. Inspired by the biological visual recognition process, our model mimics the stochastic location of the retina by drawing glimpse locations from a Gaussian distribution. We evaluate the model on the Large Cluttered MNIST, Large CIFAR-10, and Large CIFAR-100 datasets, in which images are resized to 128 × 128 pixels. The PyTorch implementation of Gaussian RAM and its pretrained model are available at https://github.com/dsshim0125/gaussian-ram
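The stochastic glimpse idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation (that is in the linked repository); it only assumes that a glimpse center is drawn from an isotropic Gaussian around a predicted mean, that a small patch is cropped from the 128 × 128 image at that center, and that the log-probability of the sampled location is what a REINFORCE-style update would weight by the reward. The function names, the normalized [-1, 1] coordinate convention, the standard deviation of 0.2, and the 16-pixel patch size are all hypothetical choices made for illustration.

```python
import math
import random

IMAGE_SIZE = 128  # the abstract states images are resized to 128 x 128


def sample_glimpse_center(mean, std, rng):
    """Draw a 2-D location from an isotropic Gaussian N(mean, std^2 I)."""
    return tuple(rng.gauss(m, std) for m in mean)


def gaussian_log_prob(location, mean, std):
    """Log-density of `location` under the isotropic Gaussian N(mean, std^2 I)."""
    return sum(
        -0.5 * ((x - m) / std) ** 2 - math.log(std) - 0.5 * math.log(2 * math.pi)
        for x, m in zip(location, mean)
    )


def extract_glimpse(image, center, patch_size):
    """Crop a patch_size x patch_size window around a normalized center.

    `center` uses normalized coordinates (assumed convention: [-1, 1] spans the
    image). Out-of-range values are clamped, and the window is shifted so that
    it always stays inside the image.
    """
    height, width = len(image), len(image[0])
    x = max(-1.0, min(1.0, center[0]))
    y = max(-1.0, min(1.0, center[1]))
    # Map normalized coordinates to pixel indices.
    cx = int((x + 1.0) / 2.0 * (width - 1))
    cy = int((y + 1.0) / 2.0 * (height - 1))
    left = min(max(cx - patch_size // 2, 0), width - patch_size)
    top = min(max(cy - patch_size // 2, 0), height - patch_size)
    return [row[left:left + patch_size] for row in image[top:top + patch_size]]


rng = random.Random(0)
# Synthetic grayscale image stored as nested lists, standing in for a dataset sample.
image = [[(r * IMAGE_SIZE + c) % 256 for c in range(IMAGE_SIZE)]
         for r in range(IMAGE_SIZE)]

mean = (0.0, 0.0)  # hypothetical output of a location network
std = 0.2          # hypothetical fixed standard deviation
center = sample_glimpse_center(mean, std, rng)
patch = extract_glimpse(image, center, patch_size=16)
log_prob = gaussian_log_prob(center, mean, std)  # term a REINFORCE update would scale by reward
print(len(patch), len(patch[0]))
```

Only the 16 × 16 patch, rather than the full 128 × 128 image, would be passed to the classifier at each step, which is the sense in which a glimpse-based model can be lighter than a CNN that consumes the whole image.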