Semi-automated dataset creation for semantic and instance segmentation of industrial point clouds.

IF 8.2; Zone 1 (Computer Science); Q1 (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS)
August Asheim Birkeland, Marius Udnæs
Journal: Computers in Industry
Publication type: Journal Article
Publication date: 2023-12-21
DOI: 10.1016/j.compind.2023.104064
URL: https://www.sciencedirect.com/science/article/pii/S0166361523002142
Citations: 0

Abstract

The current practice for creating as-built geometric Digital Twins (gDTs) of industrial facilities is both labour-intensive and error-prone. In aged industries it typically involves manually crafting a CAD or BIM model from a point cloud collected using terrestrial laser scanners. Recent advances within deep learning (DL) offer the possibility to automate semantic and instance segmentation of point clouds, contributing to a more efficient modelling process. DL networks, however, are data-intensive, requiring large domain-specific datasets. Producing labelled point cloud datasets involves considerable manual labour, and in the industrial domain no open-source instance segmentation dataset exists. We propose a semi-automatic workflow leveraging object descriptions contained in existing gDTs to efficiently create semantic- and instance-labelled point cloud datasets. To prove the efficiency of our workflow, we apply it to two separate areas of a gas processing plant covering a total of 40 000 m². We record the effort needed to process one of the areas, labelling a total of 260 million points in 70 h. When benchmarking on a state-of-the-art 3D instance segmentation network, the additional data from the 70-hour effort raises mIoU from 24.4% to 44.4%, AP from 19.7% to 52.5% and RC from 45.9% to 76.7% respectively.
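
The abstract reports mIoU without stating how it is computed. As a minimal sketch, assuming the common definition (per-class intersection over union of point labels, averaged over the classes that actually occur), the metric could be evaluated as below. The function name `mean_iou` and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def mean_iou(true_labels, pred_labels, num_classes):
    """Mean intersection-over-union across classes (illustrative helper).

    Classes that appear in neither the ground truth nor the prediction
    are skipped so they do not distort the average.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")

    true_count = Counter(true_labels)   # points per class in ground truth
    pred_count = Counter(pred_labels)   # points per class in prediction
    intersection = Counter(
        t for t, p in zip(true_labels, pred_labels) if t == p
    )                                   # correctly labelled points per class

    ious = []
    for c in range(num_classes):
        union = true_count[c] + pred_count[c] - intersection[c]
        if union == 0:
            continue  # class absent from both label sets
        ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six points, three hypothetical semantic classes.
true_labels = [0, 0, 1, 1, 2, 2]
pred_labels = [0, 1, 1, 1, 2, 0]
score = mean_iou(true_labels, pred_labels, num_classes=3)
print(f"mIoU = {score:.3f}")
```

In this toy case the per-class IoUs are 1/3, 2/3 and 1/2. Evaluation code for real benchmarks may differ, for example in how unlabelled points or absent classes are handled.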

Source journal
Computers in Industry (Engineering and Technology; Computer Science: Interdisciplinary Applications)
CiteScore: 18.90
Self-citation rate: 8.00%
Articles published: 152
Review time: 22 days
Journal description: The objective of Computers in Industry is to present original, high-quality, application-oriented research papers that:
• Illuminate emerging trends and possibilities in the utilization of Information and Communication Technology in industry;
• Establish connections or integrations across various technology domains within the expansive realm of computer applications for industry;
• Foster connections or integrations across diverse application areas of ICT in industry.