Pixel-level pavement crack segmentation using UAV remote sensing images based on the ConvNeXt-UPerNet

IF 6.2 · Zone 2 (Engineering and Technology) · JCR Q1 (ENGINEERING, MULTIDISCIPLINARY)

Alexandria Engineering Journal, Volume 124, Pages 147-169. Pub Date: 2025-03-31. DOI: 10.1016/j.aej.2025.03.072

Hatem Taha, Hossam El-Habrouk, Wael Bekheet, Sayed El-Naghi, Marwan Torki


Citations: 0

Abstract


Pixel-level pavement crack segmentation using UAV remote sensing images based on the ConvNeXt-UPerNet

Pavement cracks are a common problem affecting transportation infrastructure and require timely detection and repair. Segmenting cracks from images is challenging for computer vision because of complicated topologies, intensity inhomogeneity, poor contrast, and complex backgrounds. Currently, road crack detection relies on manual methods and road detection vehicles, which are inefficient and unsafe and can cause traffic blockage. Using unmanned aerial vehicles (UAVs) for pavement crack detection could improve efficiency and economic benefits. However, the thin, narrow appearance of cracks in UAV remote-sensing images brings additional challenges and can hinder accurate identification of road cracks. To address these challenges, this paper first proposes a UAV pavement crack dataset called DronePavSeg. Second, it proposes the ConvNeXt-UPerNet network, a pixel-level pavement crack segmentation encoder-decoder. The model uses ConvNeXt as the encoder, for its feature extraction capabilities, and the UPerNet architecture as the decoder to learn the local and global semantic features of pavement cracks, improving segmentation accuracy. The ConvNeXt-Large-UPerNet model outperformed seven other SOTA segmentation models on the DronePavSeg dataset, achieving, on average over the seven splits, an mIoU of 79.73%, a Crack-IoU of 61.71%, and an F1-score of 76.03% with zero-pixel tolerance. Compared to the second-best model (HRNet-FCN), the proposed model showed improvements of 0.41% in mIoU, 0.74% in Crack-IoU, and 0.59% in F1-score. Qualitative examinations also confirmed its effectiveness and robustness in accurately detecting pavement cracks on complex pavement surfaces. Furthermore, we investigated the effect of different loss functions and various decoder architectures. Finally, the proposed model's ability to generalize was tested on two public benchmarks, the CFD and Crack500 datasets. The results demonstrated that ConvNeXt-UPerNet possesses excellent segmentation performance.
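
The abstract reports mIoU, Crack-IoU, and F1-score with zero-pixel tolerance but does not spell out how they are computed. The sketch below shows one common way to obtain such scores from a pair of binary masks. It is an illustration under assumptions, not the authors' evaluation code: it assumes two classes (crack and background), takes mIoU as the mean of the two per-class IoUs, and reads "zero-pixel tolerance" as strict pixel-by-pixel matching. The names `confusion_counts` and `crack_metrics`, and the toy masks, are hypothetical.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # 2-D binary mask: 1 = crack, 0 = background


def confusion_counts(pred: Mask, truth: Mask) -> tuple[int, int, int, int]:
    """Count TP, FP, FN, TN for the crack class with exact pixel matching."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # crack predicted, crack present
            elif p:
                fp += 1      # crack predicted, background present
            elif t:
                fn += 1      # crack missed
            else:
                tn += 1      # background correctly predicted
    return tp, fp, fn, tn


def _ratio(num: float, den: float) -> float:
    """Divide safely; an empty denominator yields 0.0 (a convention chosen here)."""
    return num / den if den else 0.0


def crack_metrics(pred: Mask, truth: Mask) -> dict[str, float]:
    """Return Crack-IoU, two-class mIoU, and crack F1 (assumed definitions)."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    crack_iou = _ratio(tp, tp + fp + fn)
    background_iou = _ratio(tn, tn + fp + fn)
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return {
        "crack_iou": crack_iou,
        "miou": (crack_iou + background_iou) / 2,
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


# Toy 3x4 masks (made up for illustration): one missed pixel, one false alarm.
truth = [[0, 1, 0, 0],
         [0, 1, 0, 0],
         [0, 1, 1, 0]]
pred = [[0, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 1, 1, 1]]

scores = crack_metrics(pred, truth)
print(scores)
```

On the toy masks there are 3 true positives, 1 false positive, 1 false negative, and 7 true negatives, so the crack IoU is 3/5 and the F1-score is 0.75. Because cracks are thin, a one-pixel shift in a prediction is fully penalized under this strict matching, which is one reason Crack-IoU values tend to sit well below mIoU.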


Source journal

Alexandria Engineering Journal (Engineering: General Engineering)

CiteScore: 11.20

Self-citation rate: 4.40%

Articles published: 1015

Review time: 43 days

About the journal: Alexandria Engineering Journal is an international journal devoted to publishing high-quality papers in the field of engineering and applied science. Alexandria Engineering Journal is cited in the Engineering Information Services (EIS) and the Chemical Abstracts (CA). The papers published in Alexandria Engineering Journal are grouped into five sections, according to the following classification:
• Mechanical, Production, Marine and Textile Engineering
• Electrical Engineering, Computer Science and Nuclear Engineering
• Civil and Architecture Engineering
• Chemical Engineering and Applied Sciences
• Environmental Engineering