{"title":"An RGB/Infra-Red camera fusion approach for Multi-Person Pose Estimation in low light environments","authors":"Viviana Crescitelli, Atsutake Kosuge, T. Oshima","doi":"10.1109/SAS48726.2020.9220059","DOIUrl":null,"url":null,"abstract":"Enabling collaborative robots to predict the human pose is a challenging, but important issue to address. Most of the development of human pose estimation (HPE) adopt RGB images as input to estimate anatomical keypoints with Deep Convolutional Neural Networks (DNNs). However, those approaches neglect the challenge of detecting features reliably during night-time or in difficult lighting conditions, leading to safety issues. In response to this limitation, we present in this paper an RGB/Infra-Red camera fusion approach, based on the open-source library OpenPose, and we show how the fusion of keypoints extracted from different images can be used to improve the human pose estimation performance in sparse light environments. Specifically, OpenPose is used to extract body joints from RGB and Infra-Red images and the contribution of each frame is combined by a fusion step. We investigate the potential of a fusion framework based on Deep Neural Networks and we compare it to a linear weighted average method. The proposed approach shows promising performances, with the best result outperforming conventional methods by a factor 1.8x on a custom data set of Infra-Red and RGB images captured in poor light conditions, where it is hard to recognize people even by human inspection.","PeriodicalId":223737,"journal":{"name":"2020 IEEE Sensors Applications Symposium (SAS)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE Sensors Applications Symposium (SAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SAS48726.2020.9220059","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
Enabling collaborative robots to predict the human pose is a challenging but important problem. Most human pose estimation (HPE) systems adopt RGB images as input and estimate anatomical keypoints with deep convolutional neural networks. However, these approaches neglect the challenge of detecting features reliably at night or in difficult lighting conditions, leading to safety issues. In response to this limitation, we present an RGB/Infra-Red camera fusion approach based on the open-source library OpenPose, and we show how fusing keypoints extracted from the two image modalities improves human pose estimation performance in low-light environments. Specifically, OpenPose extracts body joints from RGB and Infra-Red images, and the contribution of each frame is combined in a fusion step. We investigate the potential of a fusion framework based on deep neural networks and compare it to a linear weighted-average method. The proposed approach shows promising performance, with the best result outperforming conventional methods by a factor of 1.8x on a custom data set of Infra-Red and RGB images captured in poor lighting conditions, where people are hard to recognize even by human inspection.
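As a concrete illustration of the linear weighted-average baseline mentioned in the abstract, the sketch below fuses per-joint keypoints detected independently in the RGB and Infra-Red frames of the same scene. It assumes the usual OpenPose output format of (x, y, confidence) per body joint; the function name fuse_keypoints and the modality weights w_rgb/w_ir are hypothetical, and the paper's exact fusion rule may differ.

```python
import numpy as np

def fuse_keypoints(kp_rgb, kp_ir, w_rgb=0.5, w_ir=0.5):
    """Confidence-weighted linear fusion of per-joint keypoints.

    kp_rgb, kp_ir: arrays of shape (J, 3) holding (x, y, confidence)
    for J body joints, as produced by OpenPose for the RGB and
    Infra-Red frames of the same scene.
    w_rgb, w_ir: global modality weights (illustrative tuning knobs,
    not values from the paper).
    """
    fused = np.zeros_like(kp_rgb)
    # Per-joint weight = global modality weight * detection confidence,
    # so a joint missed in one modality (confidence ~ 0) is taken
    # almost entirely from the other.
    a = w_rgb * kp_rgb[:, 2]
    b = w_ir * kp_ir[:, 2]
    denom = np.maximum(a + b, 1e-6)  # avoid division by zero when both miss
    fused[:, 0] = (a * kp_rgb[:, 0] + b * kp_ir[:, 0]) / denom
    fused[:, 1] = (a * kp_rgb[:, 1] + b * kp_ir[:, 1]) / denom
    # Report the stronger of the two confidences for the fused joint.
    fused[:, 2] = np.maximum(kp_rgb[:, 2], kp_ir[:, 2])
    return fused
```

In poor lighting, the RGB confidences collapse toward zero, so the weighted average naturally leans on the Infra-Red detections; the DNN-based fusion the paper investigates would learn this trade-off instead of fixing it by hand.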