Localized Deep Extreme Learning Machines for Efficient RGB-D Object Recognition

H. F. Zaki, F. Shafait, A. Mian
DOI: 10.1109/DICTA.2015.7371280
Published in: 2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA), November 2015.
Citations: 9

Abstract

Existing RGB-D object recognition methods either use channel-specific handcrafted features or learn features with deep networks. The former lack representation ability, while the latter require large amounts of training data and learning time. In real-time robotics applications involving RGB-D sensors, we have the luxury of neither. In this paper, we propose Localized Deep Extreme Learning Machines (LDELM) that efficiently learn features from RGB-D data. By using localized patches, not only is the problem of data sparsity solved, but the learned features are also robust to occlusions and viewpoint variations. LDELM learns deep localized features in an unsupervised way from random patches of the training data. Each image is then fed forward, patch-wise, through the LDELM to form a cuboid of features. The cuboid is divided into cells and pooled to obtain the final compact image representation, which is then used to train an ELM classifier. Experiments on the benchmark Washington RGB-D and 2D3D datasets show that the proposed algorithm is not only significantly faster to train but also outperforms state-of-the-art methods in terms of accuracy and classification time.
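The speed claim rests on the ELM classifier at the end of the pipeline: hidden-layer weights are random and fixed, so training reduces to a single closed-form least-squares solve for the output weights. The sketch below illustrates that basic ELM mechanism on toy data; the function names (`train_elm`, `predict_elm`) and hyperparameters are hypothetical, and this is not the authors' implementation of LDELM.

```python
import numpy as np

def train_elm(X, y, n_hidden=64, seed=0):
    """Train a basic Extreme Learning Machine classifier.

    Hidden weights W, b are drawn randomly and never updated; only the
    output weights beta are solved in closed form via the pseudoinverse,
    which is what makes ELM training fast.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)            # random hidden-layer activations
    T = np.eye(y.max() + 1)[y]        # one-hot class targets
    beta = np.linalg.pinv(H) @ T      # least-squares output weights
    return W, b, beta

def predict_elm(X, W, b, beta):
    H = np.tanh(X @ W + b)
    return np.argmax(H @ beta, axis=1)

# Toy usage: two well-separated clusters standing in for feature vectors.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 0.3, (50, 4)),
               rng.normal(2.0, 0.3, (50, 4))])
y = np.array([0] * 50 + [1] * 50)
W, b, beta = train_elm(X, y)
acc = (predict_elm(X, W, b, beta) == y).mean()
```

In the paper's pipeline, `X` would instead hold the pooled cuboid-of-features representation of each image, but the closed-form training step is the same.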