Pixel-level pavement crack segmentation using UAV remote sensing images based on the ConvNeXt-UPerNet
Authors: Hatem Taha, Hossam El-Habrouk, Wael Bekheet, Sayed El-Naghi, Marwan Torki
Journal: Alexandria Engineering Journal, volume 124, pages 147-169 (Journal Article, published 2025-03-31)
DOI: 10.1016/j.aej.2025.03.072
URL: https://www.sciencedirect.com/science/article/pii/S1110016825003801
Impact factor: 6.2; JCR: Q1 (Engineering, Multidisciplinary)
Citations: 0
Abstract
Pavement cracks are a common defect in transportation infrastructure and require timely detection and repair. Segmenting cracks from images is challenging for computer vision because of complicated topologies, intensity inhomogeneity, poor contrast, and complex backgrounds. Currently, road crack detection relies on manual methods and road detection vehicles, which are inefficient, unsafe, and can cause traffic blockage. Using unmanned aerial vehicles (UAVs) for pavement crack detection could improve efficiency and bring economic benefits. However, the thin, narrow appearance of cracks in UAV remote-sensing images brings additional challenges and can hinder accurate identification of road cracks. To address these challenges, this paper first proposes a UAV pavement crack dataset called DronePavSeg. Second, it proposes ConvNeXt-UPerNet, an encoder-decoder network for pixel-level pavement crack segmentation. The model uses ConvNeXt as the encoder, for its strong feature extraction capabilities, and the UPerNet architecture as the decoder to learn the local and global semantic features of pavement cracks, improving segmentation accuracy. The ConvNeXt-Large-UPerNet model outperformed seven other state-of-the-art (SOTA) segmentation models on the DronePavSeg dataset; averaged over the seven splits, it achieved an mIoU of 79.73%, a Crack-IoU of 61.71%, and an F1-score of 76.03% with zero-pixel tolerance. Compared to the second-best model (HRNet-FCN), the proposed model showed improvements of 0.41% in mIoU, 0.74% in Crack-IoU, and 0.59% in F1-score. Qualitative examinations also confirmed its effectiveness and robustness in accurate pavement crack detection on complex pavement surfaces. Furthermore, we investigated the effect of different loss functions and various decoder architectures. Finally, the proposed model's ability to generalize was tested on two public benchmarks, the CFD and Crack500 datasets. The results demonstrated that ConvNeXt-UPerNet achieves excellent segmentation performance.
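The abstract reports mIoU, Crack-IoU, and F1-score but does not spell out how they are computed. The following is a minimal sketch of one common way to compute them for a binary crack mask, not the authors' evaluation code. It assumes that "zero-pixel tolerance" means strict pixel-by-pixel matching with no spatial margin, and that mIoU is the mean of the crack and background IoU values. The function names and the toy masks are hypothetical and exist only for illustration.

```python
from __future__ import annotations

from typing import Sequence

Mask = Sequence[Sequence[int]]  # 2-D binary mask: 1 = crack, 0 = background


def confusion_counts(pred: Mask, truth: Mask) -> tuple[int, int, int, int]:
    """Count TP, FP, FN, TN for the crack class with exact pixel matching."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def safe_div(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of raising when a class is absent."""
    return numerator / denominator if denominator else 0.0


def crack_metrics(pred: Mask, truth: Mask) -> dict[str, float]:
    """Crack IoU, two-class mean IoU (assumed definition), and crack F1."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    crack_iou = safe_div(tp, tp + fp + fn)
    background_iou = safe_div(tn, tn + fp + fn)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    return {
        "crack_iou": crack_iou,
        "miou": (crack_iou + background_iou) / 2,
        "f1": f1,
    }


# Toy 4x4 example (made-up masks, not data from the paper).
truth = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
metrics = crack_metrics(pred, truth)
print(metrics)
```

Because cracks occupy only a small fraction of the pixels, background IoU is typically high and pulls mIoU above Crack-IoU, which is consistent with the gap between the reported 79.73% mIoU and 61.71% Crack-IoU.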
About the journal:
Alexandria Engineering Journal is an international journal devoted to publishing high-quality papers in the field of engineering and applied science. It is cited in the Engineering Information Services (EIS) and the Chemical Abstracts (CA). Papers published in the journal are grouped into five sections, according to the following classification:
• Mechanical, Production, Marine and Textile Engineering
• Electrical Engineering, Computer Science and Nuclear Engineering
• Civil and Architecture Engineering
• Chemical Engineering and Applied Sciences
• Environmental Engineering