Perception Engine Using a Multi-Sensor Head to Enable High-level Humanoid Robot Behaviors

Bhavyansh Mishra, Duncan Calvert, Brendon Ortolano, M. Asselmeier, Luke Fina, Stephen McCrory, H. Sevil, Robert J. Griffin
{"title":"Perception Engine Using a Multi-Sensor Head to Enable High-level Humanoid Robot Behaviors","authors":"Bhavyansh Mishra, Duncan Calvert, Brendon Ortolano, M. Asselmeier, Luke Fina, Stephen McCrory, H. Sevil, Robert J. Griffin","doi":"10.1109/icra46639.2022.9812178","DOIUrl":null,"url":null,"abstract":"For achieving significant levels of autonomy, legged robot behaviors require perceptual awareness of both the terrain for traversal, as well as structures and objects in their surroundings for planning, obstacle avoidance, and high-level decision making. In this work, we present a perception engine for legged robots that extracts the necessary information for developing semantic, contextual, and metric awareness of their surroundings. Our custom sensor configuration consists of (1) an active depth sensor, (2) two monocular cameras looking sideways, (3) a passive stereo sensor observing the terrain, (4) a forward facing active depth camera, and (5) a rotating 3D LIDAR with a large vertical field-of-view (FOV). The mutual overlap in the sensors' FOVs allows us to redundantly detect and track objects of both dynamic and static types. We fuse class masks generated by a semantic segmentation model with LIDAR and depth data to accurately identify and track individual instances of dynamically moving objects. In parallel, active depth and passive stereo streams of the terrain are also fused to map the terrain using the on-board GPU. We evaluate the engine using two different humanoid behaviors, (1) look-and-step and (2) track-and-follow, on the Boston Dynamics Atlas.","PeriodicalId":341244,"journal":{"name":"2022 International Conference on Robotics and Automation (ICRA)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Conference on Robotics and Automation (ICRA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/icra46639.2022.9812178","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

To achieve significant levels of autonomy, legged robot behaviors require perceptual awareness of both the terrain they traverse and the structures and objects in their surroundings for planning, obstacle avoidance, and high-level decision making. In this work, we present a perception engine for legged robots that extracts the information needed to develop semantic, contextual, and metric awareness of their surroundings. Our custom sensor configuration consists of (1) an active depth sensor, (2) two sideways-looking monocular cameras, (3) a passive stereo sensor observing the terrain, (4) a forward-facing active depth camera, and (5) a rotating 3D LIDAR with a large vertical field-of-view (FOV). The mutual overlap among the sensors' FOVs allows us to redundantly detect and track both dynamic and static objects. We fuse class masks generated by a semantic segmentation model with LIDAR and depth data to accurately identify and track individual instances of dynamically moving objects. In parallel, active depth and passive stereo streams are fused on the on-board GPU to map the terrain. We evaluate the engine with two humanoid behaviors, (1) look-and-step and (2) track-and-follow, on the Boston Dynamics Atlas.
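To make the mask/point-cloud fusion step concrete, below is a minimal sketch of how LIDAR points can be labeled against a per-pixel semantic class mask by projecting them into a camera image, as the abstract describes. This is not the authors' implementation: the function name, the pinhole model with intrinsics `K`, and the LIDAR-to-camera extrinsic `T_cam_lidar` are illustrative assumptions.

```python
# Hypothetical sketch of semantic mask / LIDAR fusion (not the paper's code):
# project LIDAR points into the camera image and label each point by sampling
# the per-pixel class mask produced by a semantic segmentation model.
import numpy as np

def label_lidar_points(points_lidar: np.ndarray,   # (N, 3) XYZ in the LIDAR frame
                       class_mask: np.ndarray,     # (H, W) per-pixel class IDs
                       K: np.ndarray,              # (3, 3) camera intrinsics (assumed pinhole)
                       T_cam_lidar: np.ndarray     # (4, 4) LIDAR -> camera transform
                       ) -> np.ndarray:
    """Return an (N,) array of class IDs; -1 for points outside the camera FOV."""
    n = points_lidar.shape[0]
    # Transform points into the camera frame using homogeneous coordinates.
    pts_h = np.hstack([points_lidar, np.ones((n, 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]

    labels = np.full(n, -1, dtype=np.int32)
    in_front = pts_cam[:, 2] > 0.0          # keep only points ahead of the camera
    uvw = (K @ pts_cam[in_front].T).T       # pinhole projection
    uv = (uvw[:, :2] / uvw[:, 2:3]).astype(int)

    # Keep only projections that land inside the image bounds.
    h, w = class_mask.shape
    valid = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    idx = np.flatnonzero(in_front)[valid]
    labels[idx] = class_mask[uv[valid, 1], uv[valid, 0]]
    return labels
```

Per-point labels like these could then be grouped into per-object instances and associated over time, which is the role the abstract attributes to the mask/LIDAR fusion stage; the grouping and tracking machinery itself is not specified here.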