Testing human-hand segmentation on in-distribution and out-of-distribution data in human–robot interactions using a deep ensemble model

IF 3.1 · Zone 3, Computer Science · Q2, AUTOMATION & CONTROL SYSTEMS

Mechatronics Pub Date : 2025-06-21 DOI:10.1016/j.mechatronics.2025.103365

Reza Jalayer, Yuxin Chen, Masoud Jalayer, Carlotta Orsenigo, Masayoshi Tomizuka

Mechatronics, Volume 110, Article 103365. Journal Article. https://www.sciencedirect.com/science/article/pii/S0957415825000741


Abstract

Reliable detection and segmentation of human hands are critical for enhancing safety and enabling advanced interactions in human–robot collaboration. Current research predominantly evaluates hand segmentation on in-distribution (ID) data, which resemble the data on which the deep learning (DL) models were trained. This approach fails to address the out-of-distribution (OOD) scenarios that often arise in real-world human–robot interactions. In this work, we make three key contributions. First, we assess the generalization of DL models for hand segmentation under both ID and OOD scenarios, using a newly collected industrial dataset that captures a wide range of real-world conditions, including simple and cluttered backgrounds with industrial tools, varying numbers of hands (0 to 4), gloves, rare gestures, and motion blur. Second, we consider both egocentric and static viewpoints. We evaluated models trained on four datasets, namely EgoHands and Ego2Hands (egocentric, mobile camera) and HADR and HAGS (static, fixed viewpoint), by testing them with both egocentric (head-mounted) and static cameras, which enables robustness evaluation from multiple points of view. Third, we introduce an uncertainty analysis pipeline based on the predictive entropy of the predicted hand pixels. By applying thresholds established in the validation phase, this pipeline flags unreliable segmentation outputs, so that untrustworthy predictions can be identified and filtered automatically, significantly improving segmentation reliability in OOD scenarios. For segmentation, we used a deep ensemble model with UNet and RefineNet as base learners. Our experiments demonstrate that models trained on industrial datasets (HADR, HAGS) outperform those trained on non-industrial datasets, both in segmentation accuracy and in their ability to flag unreliable outputs via uncertainty estimation. These findings underscore the necessity of domain-specific training data and show that our uncertainty analysis pipeline can provide a practical safety layer for real-world deployment.
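The uncertainty pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0.5 hand-probability cutoff, the averaging of entropy over hand pixels, the percentile rule for choosing the threshold, and the zero score for images with no predicted hand are all assumptions made for illustration. The ensemble members stand in for the UNet and RefineNet base learners, each supplying a per-pixel hand probability map.

```python
import numpy as np

def ensemble_hand_probability(member_probs):
    """Average per-pixel hand probabilities over ensemble members.

    member_probs: array-like of shape (M, H, W) with values in [0, 1],
    one probability map per base learner (e.g. UNet, RefineNet).
    """
    return np.mean(np.asarray(member_probs, dtype=float), axis=0)

def binary_entropy(p, eps=1e-12):
    """Per-pixel predictive entropy (in nats) of a hand/background prediction."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def hand_pixel_uncertainty(member_probs, hand_cutoff=0.5):
    """Mean predictive entropy over the pixels predicted as hand.

    Assumption: an image with no predicted hand pixels gets a score of 0.0.
    """
    mean_p = ensemble_hand_probability(member_probs)
    hand_mask = mean_p >= hand_cutoff
    if not hand_mask.any():
        return 0.0
    return float(binary_entropy(mean_p)[hand_mask].mean())

def fit_entropy_threshold(validation_scores, percentile=95.0):
    """Assumed rule: take a high percentile of the validation-set scores."""
    return float(np.percentile(validation_scores, percentile))

def is_unreliable(member_probs, entropy_threshold):
    """Flag a segmentation whose hand-pixel uncertainty exceeds the threshold."""
    return hand_pixel_uncertainty(member_probs) > entropy_threshold
```

In use, the threshold would be fitted once on validation images and then applied to each new frame; a frame on which the base learners disagree about the hand region yields a higher entropy score and is flagged, while a frame on which they agree confidently passes.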


Source journal: Mechatronics (Engineering & Technology; Engineering: Electrical & Electronic)

CiteScore: 5.90

Self-citation rate: 9.10%

Review time: 109 days

About the journal: Mechatronics is the synergistic combination of precision mechanical engineering, electronic control and systems thinking in the design of products and manufacturing processes. It relates to the design of systems, devices and products aimed at achieving an optimal balance between basic mechanical structure and its overall control. The purpose of this journal is to provide rapid publication of topical papers featuring practical developments in mechatronics. It will cover a wide range of application areas including consumer product design, instrumentation, manufacturing methods, computer integration and process and device control, and will attract a readership from across the industrial and academic research spectrum. Particular importance will be attached to aspects of innovation in mechatronics design philosophy which illustrate the benefits obtainable by an a priori integration of functionality with embedded microprocessor control. A major item will be the design of machines, devices and systems possessing a degree of computer-based intelligence. The journal seeks to publish research progress in this field with an emphasis on the applied rather than the theoretical. It will also serve the dual role of bringing greater recognition to this important area of engineering.