Learning 3D human–object interaction graphs from transferable context knowledge for construction monitoring

IF 8.2, Zone 1 (Computer Science), JCR Q1 (Computer Science, Interdisciplinary Applications)

Computers in Industry, Pub Date: 2024-09-10, DOI: 10.1016/j.compind.2024.104171

Liuyue Xie, Shreyas Misra, Nischal Suresh, Justin Soza-Soto, Tomotake Furuhata, Kenji Shimada


Citations: 0

Abstract


Learning 3D human–object interaction graphs from transferable context knowledge for construction monitoring

We propose a novel framework for detecting 3D human–object interactions (HOI) on construction sites and a toolkit for generating construction-related human–object interaction graphs. Computer vision methods have been adopted for construction site safety surveillance in recent years. Current methods rely on videos and images, and verify safety against common-sense knowledge without considering the 3D spatial relationships among the detected instances. We propose a new method that incorporates spatial understanding by inferring interactions directly from 3D point cloud data. The proposed model is trained on a 3D construction site dataset generated with our purpose-built simulation toolkit. The model achieves 54.11% mean intersection over union (mIoU) and 72.98% mean average precision (mAP) for worker–object interaction relationship recognition. The model is also validated on PiGraphs, a benchmark dataset with 3D human–object interaction types, and compared against existing 3D interaction detection frameworks. It outperforms the state-of-the-art model, increasing the interaction detection mAP by 17.01%. Besides the 3D interaction model, we also simulate interactions from industrial surveillance footage using MoCap and physical constraints; the simulated interactions will be released to foster future studies in the domain.
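The abstract reports results as mIoU and mAP but does not say how the paper computes them. As a rough illustration only, the sketch below uses the standard textbook definitions: per-class intersection over union averaged across classes, and average precision computed from a score-ranked list of candidates. The function names, the per-element label encoding, and the assumption that every ground-truth positive appears among the scored candidates are all hypothetical choices made for this example, not details taken from the paper.

```python
from typing import List, Sequence, Tuple


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over classes that occur in the prediction or the ground truth.

    pred and target hold one class label per element (hypothetical encoding).
    """
    ious: List[float] = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:  # skip classes absent from both pred and target
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def average_precision(scores: Sequence[float], is_positive: Sequence[bool]) -> float:
    """AP for one class: mean of the precision at each true-positive rank.

    Assumes every ground-truth positive appears among the scored candidates.
    """
    ranked = sorted(zip(scores, is_positive), key=lambda pair: -pair[0])
    hits = 0
    precisions: List[float] = []
    for rank, (_, positive) in enumerate(ranked, start=1):
        if positive:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(
    per_class: Sequence[Tuple[Sequence[float], Sequence[bool]]]
) -> float:
    """mAP: unweighted mean of the per-class AP values."""
    aps = [average_precision(s, y) for s, y in per_class]
    return sum(aps) / len(aps) if aps else 0.0
```

For instance, `mean_iou([0, 0, 1, 1], [0, 1, 1, 1], 2)` averages an IoU of 1/2 for class 0 and 2/3 for class 1. Benchmarks differ in details such as interpolated precision and how missed ground-truth instances are counted, so figures produced this way are not directly comparable to the reported 54.11% and 72.98%.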


Source journal

Computers in Industry (Engineering and Technology; Computer Science, Interdisciplinary Applications)

CiteScore: 18.90
Self-citation rate: 8.00%
Articles published: 152
Review time: 22 days

About the journal: The objective of Computers in Industry is to present original, high-quality, application-oriented research papers that:
• Illuminate emerging trends and possibilities in the utilization of Information and Communication Technology in industry;
• Establish connections or integrations across various technology domains within the expansive realm of computer applications for industry;
• Foster connections or integrations across diverse application areas of ICT in industry.