基于体积表示的三维物体识别卷积神经网络

Xiaofang Xu, A. Dehghani, D. Corrigan, Sam Caulfield, D. Moloney
{"title":"基于体积表示的三维物体识别卷积神经网络","authors":"Xiaofang Xu, A. Dehghani, D. Corrigan, Sam Caulfield, D. Moloney","doi":"10.1109/SPLIM.2016.7528403","DOIUrl":null,"url":null,"abstract":"Following the success of Convolutional Neural Networks (CNNs) on object recognition using 2D images, they are extended in this paper to process 3D data. Nearly most of current systems require huge amount of computation for dealing with large amount of data. In this paper, an efficient 3D volumetric object representation, Volumetric Accelerator (VOLA), is presented which requires much less memory than the normal volumetric representations. On this basis, a few 3D digit datasets using 2D MNIST and 2D digit fonts with different rotations along the x, y, and z axis are introduced. Finally, we introduce a combination of multiple CNN models based on the famous LeNet model. The trained CNN models based on the generated dataset have achieved the average accuracy of 90.30% and 81.85% for 3D-MNIST and 3D-Fonts datasets, respectively. Experimental results show that VOLA-based CNNs perform 1.5x faster than the original LeNet.","PeriodicalId":297318,"journal":{"name":"2016 First International Workshop on Sensing, Processing and Learning for Intelligent Machines (SPLINE)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"Convolutional Neural Network for 3D object recognition using volumetric representation\",\"authors\":\"Xiaofang Xu, A. Dehghani, D. Corrigan, Sam Caulfield, D. Moloney\",\"doi\":\"10.1109/SPLIM.2016.7528403\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Following the success of Convolutional Neural Networks (CNNs) on object recognition using 2D images, they are extended in this paper to process 3D data. Nearly most of current systems require huge amount of computation for dealing with large amount of data. In this paper, an efficient 3D volumetric object representation, Volumetric Accelerator (VOLA), is presented which requires much less memory than the normal volumetric representations. On this basis, a few 3D digit datasets using 2D MNIST and 2D digit fonts with different rotations along the x, y, and z axis are introduced. Finally, we introduce a combination of multiple CNN models based on the famous LeNet model. The trained CNN models based on the generated dataset have achieved the average accuracy of 90.30% and 81.85% for 3D-MNIST and 3D-Fonts datasets, respectively. Experimental results show that VOLA-based CNNs perform 1.5x faster than the original LeNet.\",\"PeriodicalId\":297318,\"journal\":{\"name\":\"2016 First International Workshop on Sensing, Processing and Learning for Intelligent Machines (SPLINE)\",\"volume\":\"23 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-07-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 First International Workshop on Sensing, Processing and Learning for Intelligent Machines (SPLINE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SPLIM.2016.7528403\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 First International Workshop on Sensing, Processing and Learning for Intelligent Machines (SPLINE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SPLIM.2016.7528403","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9

摘要

继卷积神经网络(cnn)在使用二维图像的目标识别上取得成功之后,本文将其扩展到处理三维数据。目前几乎大多数系统都需要大量的计算来处理大量的数据。本文提出了一种高效的三维体积物体表示方法——体积加速器(VOLA),它比常规的体积物体表示方法需要更少的内存。在此基础上,介绍了几个使用2D MNIST和2D数字字体沿x、y、z轴不同旋转的三维数字数据集。最后,我们在著名的LeNet模型的基础上引入了多个CNN模型的组合。基于生成的数据集训练的CNN模型在3D-MNIST和3D-Fonts数据集上的平均准确率分别达到了90.30%和81.85%。实验结果表明,基于vola的cnn的运行速度比原来的LeNet快1.5倍。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Convolutional Neural Network for 3D object recognition using volumetric representation
Following the success of Convolutional Neural Networks (CNNs) on object recognition using 2D images, they are extended in this paper to process 3D data. Nearly most of current systems require huge amount of computation for dealing with large amount of data. In this paper, an efficient 3D volumetric object representation, Volumetric Accelerator (VOLA), is presented which requires much less memory than the normal volumetric representations. On this basis, a few 3D digit datasets using 2D MNIST and 2D digit fonts with different rotations along the x, y, and z axis are introduced. Finally, we introduce a combination of multiple CNN models based on the famous LeNet model. The trained CNN models based on the generated dataset have achieved the average accuracy of 90.30% and 81.85% for 3D-MNIST and 3D-Fonts datasets, respectively. Experimental results show that VOLA-based CNNs perform 1.5x faster than the original LeNet.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信