Mingyu Zhang, Lei Wang, Shuai Han, Shuyuan Wang, Heng Li
{"title":"使用局部稀疏变换器的深度学习框架,利用激光雷达进行三维建筑工人检测","authors":"Mingyu Zhang, Lei Wang, Shuai Han, Shuyuan Wang, Heng Li","doi":"10.1111/mice.13238","DOIUrl":null,"url":null,"abstract":"Autonomous equipment is playing an increasingly important role in construction tasks. It is essential to equip autonomous equipment with powerful 3D detection capability to avoid accidents and inefficiency. However, there is limited research within the construction field that has extended detection to 3D. To this end, this study develops a light detection and ranging (LiDAR)-based deep-learning model for the 3D detection of workers on construction sites. The proposed model adopts a voxel-based anchor-free 3D object detection paradigm. To enhance the feature extraction capability for tough detection tasks, a novel Transformer-based block is proposed, where the multi-head self-attention is applied in local grid regions. The detection model integrates the Transformer blocks with 3D sparse convolution to extract wide and local features while pruning redundant features in modified downsampling layers. To train and test the proposed model, a LiDAR point cloud dataset was created, which includes workers in construction sites with 3D box annotations. The experiment results indicate that the proposed model outperforms the baseline models with higher mean average precision and smaller regression errors. The method in the study is promising to provide worker detection with rich and accurate 3D information required by construction automation.","PeriodicalId":156,"journal":{"name":"Computer-Aided Civil and Infrastructure Engineering","volume":null,"pages":null},"PeriodicalIF":8.5000,"publicationDate":"2024-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Deep learning framework with Local Sparse Transformer for construction worker detection in 3D with LiDAR\",\"authors\":\"Mingyu Zhang, Lei Wang, Shuai Han, Shuyuan Wang, Heng Li\",\"doi\":\"10.1111/mice.13238\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Autonomous equipment is playing an increasingly important role in construction tasks. It is essential to equip autonomous equipment with powerful 3D detection capability to avoid accidents and inefficiency. However, there is limited research within the construction field that has extended detection to 3D. To this end, this study develops a light detection and ranging (LiDAR)-based deep-learning model for the 3D detection of workers on construction sites. The proposed model adopts a voxel-based anchor-free 3D object detection paradigm. To enhance the feature extraction capability for tough detection tasks, a novel Transformer-based block is proposed, where the multi-head self-attention is applied in local grid regions. The detection model integrates the Transformer blocks with 3D sparse convolution to extract wide and local features while pruning redundant features in modified downsampling layers. To train and test the proposed model, a LiDAR point cloud dataset was created, which includes workers in construction sites with 3D box annotations. The experiment results indicate that the proposed model outperforms the baseline models with higher mean average precision and smaller regression errors. The method in the study is promising to provide worker detection with rich and accurate 3D information required by construction automation.\",\"PeriodicalId\":156,\"journal\":{\"name\":\"Computer-Aided Civil and Infrastructure Engineering\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":8.5000,\"publicationDate\":\"2024-05-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computer-Aided Civil and Infrastructure Engineering\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://doi.org/10.1111/mice.13238\",\"RegionNum\":1,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer-Aided Civil and Infrastructure Engineering","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1111/mice.13238","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Deep learning framework with Local Sparse Transformer for construction worker detection in 3D with LiDAR
Autonomous equipment is playing an increasingly important role in construction tasks. It is essential to equip autonomous equipment with powerful 3D detection capability to avoid accidents and inefficiency. However, there is limited research within the construction field that has extended detection to 3D. To this end, this study develops a light detection and ranging (LiDAR)-based deep-learning model for the 3D detection of workers on construction sites. The proposed model adopts a voxel-based anchor-free 3D object detection paradigm. To enhance the feature extraction capability for tough detection tasks, a novel Transformer-based block is proposed, where the multi-head self-attention is applied in local grid regions. The detection model integrates the Transformer blocks with 3D sparse convolution to extract wide and local features while pruning redundant features in modified downsampling layers. To train and test the proposed model, a LiDAR point cloud dataset was created, which includes workers in construction sites with 3D box annotations. The experiment results indicate that the proposed model outperforms the baseline models with higher mean average precision and smaller regression errors. The method in the study is promising to provide worker detection with rich and accurate 3D information required by construction automation.
期刊介绍:
Computer-Aided Civil and Infrastructure Engineering stands as a scholarly, peer-reviewed archival journal, serving as a vital link between advancements in computer technology and civil and infrastructure engineering. The journal serves as a distinctive platform for the publication of original articles, spotlighting novel computational techniques and inventive applications of computers. Specifically, it concentrates on recent progress in computer and information technologies, fostering the development and application of emerging computing paradigms.
Encompassing a broad scope, the journal addresses bridge, construction, environmental, highway, geotechnical, structural, transportation, and water resources engineering. It extends its reach to the management of infrastructure systems, covering domains such as highways, bridges, pavements, airports, and utilities. The journal delves into areas like artificial intelligence, cognitive modeling, concurrent engineering, database management, distributed computing, evolutionary computing, fuzzy logic, genetic algorithms, geometric modeling, internet-based technologies, knowledge discovery and engineering, machine learning, mobile computing, multimedia technologies, networking, neural network computing, optimization and search, parallel processing, robotics, smart structures, software engineering, virtual reality, and visualization techniques.