Multiview random forest of local experts combining RGB and LIDAR data for pedestrian detection

Alejandro González, Gabriel Villalonga, Jiaolong Xu, David Vázquez, J. Amores, Antonio M. López
DOI: 10.1109/IVS.2015.7225711
Published in: 2015 IEEE Intelligent Vehicles Symposium (IV), 2015-08-27
Citations: 81

Abstract

Despite recent significant advances, pedestrian detection continues to be an extremely challenging problem in real scenarios. In order to develop a detector that successfully operates under these conditions, it becomes critical to leverage multiple cues, multiple imaging modalities, and a strong multi-view classifier that accounts for different pedestrian views and poses. In this paper we provide an extensive evaluation that gives insight into how each of these aspects (multi-cue, multi-modality, and a strong multi-view classifier) affects performance both individually and when integrated together. In the multi-modality component we explore the fusion of RGB and depth maps obtained by high-definition LIDAR, a modality that has only recently started to receive attention. As our analysis reveals, although all the aforementioned aspects significantly help in improving the performance, the fusion of visible-spectrum and depth information boosts the accuracy by a much larger margin. The resulting detector not only ranks among the top performers in the challenging KITTI benchmark, but is also built upon very simple blocks that are easy to implement and computationally efficient. These simple blocks can be easily replaced with more sophisticated ones recently proposed, such as convolutional neural networks for feature representation, to further improve the accuracy.
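A minimal sketch of the fusion idea described above: descriptors computed on an RGB crop and on a LIDAR-derived depth map are concatenated (early fusion) and fed to a random-forest classifier. This is only illustrative, not the authors' implementation; the synthetic features, dimensions, and `scikit-learn` forest stand in for the paper's actual descriptors (e.g. HOG/LBP on RGB and depth) and its multi-view random forest of local experts.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
n = 200                      # samples per class
rgb_dim, depth_dim = 32, 16  # placeholder descriptor sizes

# Synthetic "background" (class 0) vs "pedestrian" (class 1) descriptors,
# drawn from two shifted Gaussians for each modality.
X_rgb = np.vstack([rng.normal(0.0, 1.0, (n, rgb_dim)),
                   rng.normal(1.0, 1.0, (n, rgb_dim))])
X_depth = np.vstack([rng.normal(0.0, 1.0, (n, depth_dim)),
                     rng.normal(1.0, 1.0, (n, depth_dim))])
y = np.concatenate([np.zeros(n), np.ones(n)])

# Early fusion: concatenate the RGB and depth descriptors per window.
X = np.hstack([X_rgb, X_depth])

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)
print(X.shape, forest.score(X, y))
```

In the paper the forest additionally partitions the problem by pedestrian view/pose, with each tree acting as a local expert; the single generic forest here omits that multi-view structure.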