Semi-automated dataset creation for semantic and instance segmentation of industrial point clouds.

IF 8.2, CAS Zone 1 (Computer Science), JCR Q1 (Computer Science, Interdisciplinary Applications)

Computers in Industry, Pub Date: 2023-12-21, DOI: 10.1016/j.compind.2023.104064

August Asheim Birkeland, Marius Udnæs

Volume 155, Article 104064. Full text: https://www.sciencedirect.com/science/article/pii/S0166361523002142

Citations: 0

Abstract

The current practice for creating as-built geometric Digital Twins (gDTs) of industrial facilities is both labour-intensive and error-prone. In aged industries it typically involves manually crafting a CAD or BIM model from a point cloud collected using terrestrial laser scanners. Recent advances within deep learning (DL) offer the possibility to automate semantic and instance segmentation of point clouds, contributing to a more efficient modelling process. DL networks, however, are data-intensive, requiring large domain-specific datasets. Producing labelled point cloud datasets involves considerable manual labour, and in the industrial domain no open-source instance segmentation dataset exists. We propose a semi-automatic workflow leveraging object descriptions contained in existing gDTs to efficiently create semantic- and instance-labelled point cloud datasets. To prove the efficiency of our workflow, we apply it to two separate areas of a gas processing plant covering a total of $40\,000\ \mathrm{m}^{2}$. We record the effort needed to process one of the areas, labelling a total of 260 million points in 70 h. When benchmarking on a state-of-the-art 3D instance segmentation network, the additional data from the 70-hour effort raises mIoU from 24.4% to 44.4%, AP from 19.7% to 52.5% and RC from 45.9% to 76.7%.
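The abstract does not say how object descriptions in a gDT are mapped onto scanned points, so the following is only a minimal sketch of the general idea, not the authors' workflow. It assumes each gDT object can be reduced to an axis-aligned bounding box with a semantic class and an instance id; `TwinObject`, `transfer_labels`, the `margin` parameter and the sample coordinates are all hypothetical names and values introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Label = Tuple[str, int]  # (semantic class, instance id)

UNLABELLED: Label = ("unlabelled", -1)


@dataclass(frozen=True)
class TwinObject:
    """Hypothetical stand-in for one object description in a gDT."""
    instance_id: int
    semantic_class: str
    box_min: Point  # lower corner of the axis-aligned bounding box
    box_max: Point  # upper corner of the axis-aligned bounding box

    def contains(self, p: Point, margin: float = 0.0) -> bool:
        # A point is inside when every coordinate lies within the box,
        # optionally enlarged by `margin` to tolerate registration error.
        return all(
            lo - margin <= c <= hi + margin
            for c, lo, hi in zip(p, self.box_min, self.box_max)
        )


def transfer_labels(
    points: Sequence[Point],
    objects: Sequence[TwinObject],
    margin: float = 0.0,
) -> List[Label]:
    """Give each point the class and instance id of the first object
    whose box contains it; points no box claims stay unlabelled."""
    labels: List[Label] = []
    for p in points:
        hit = next((o for o in objects if o.contains(p, margin)), None)
        labels.append((hit.semantic_class, hit.instance_id) if hit else UNLABELLED)
    return labels


# Made-up example data: one pipe, one valve, three scanned points.
objects = [
    TwinObject(0, "pipe", (0.0, 0.0, 0.0), (5.0, 0.2, 0.2)),
    TwinObject(1, "valve", (5.0, -0.3, -0.3), (5.6, 0.5, 0.5)),
]
points = [(1.0, 0.1, 0.1), (5.3, 0.0, 0.0), (9.0, 9.0, 9.0)]
labels = transfer_labels(points, objects)
print(labels)
```

In this sketch, points that fall outside every box are the ones a person would still need to inspect by hand. A real pipeline would likely need tighter geometry than bounding boxes and a spatial index rather than a linear scan over objects; both are left out here to keep the example short.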

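Source journal

Computers in Industry (Engineering & Technology: Computer Science, Interdisciplinary Applications)

CiteScore: 18.90
Self-citation rate: 8.00%
Articles published: 152
Review time: 22 days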

About the journal: The objective of Computers in Industry is to present original, high-quality, application-oriented research papers that:
• Illuminate emerging trends and possibilities in the utilization of Information and Communication Technology (ICT) in industry;
• Establish connections or integrations across various technology domains within the expansive realm of computer applications for industry;
• Foster connections or integrations across diverse application areas of ICT in industry.