You-Hao Ni, Hao Wang, J. Mao, Zhuofan Xi, Zhen-Yi Chen
Structural Health Monitoring, vol. 14(4), published 2024-05-01 (journal article).
DOI: 10.1177/14759217241246953 (https://doi.org/10.1177/14759217241246953). Citations: 0.
Quantitative detection of typical bridge surface damages based on global attention mechanism and YOLOv7 network
Surface damage to reinforced concrete and steel bridges, such as cracks and corrosion, is usually regarded as an indicator of internal structural defects and can therefore be used to assess structural health. Quantitative segmentation of such surface damage via computer vision is important yet challenging due to the limited accuracy of traditional semantic segmentation methods. To overcome this challenge, this study proposes a modified semantic segmentation method, YOLOv7-SEG-GAM, that can distinguish multiple types of surface damage, based on you only look once version 7 (YOLOv7) and the global attention mechanism (GAM). First, the extended efficient layer aggregation network in the YOLOv7 backbone was replaced with GAM, and a segmentation head operating on the three-scale feature maps was integrated, establishing a segmentation network. Next, images depicting five types of reinforced concrete and steel bridge surface damage (concrete cracks, steel corrosion, exposed rebar, spalling, and efflorescence) were collected and meticulously labeled to create a semantic segmentation dataset for training the network. A comparative study was then conducted to analyze the effectiveness of GAM, squeeze-and-excitation (SE) networks, and the convolutional block attention module (CBAM) in enhancing network performance. Finally, a calibration device combining a laser rangefinder and a smartphone was developed to enable quantitative assessment of bridge damage at real scale. On the same dataset, the accuracy of YOLOv7-SEG-GAM was compared with that of DeepLabV3+, BiSeNet, and improved semantic segmentation networks. The results indicate that the mean pixel accuracy and mean intersection over union achieved by YOLOv7-SEG-GAM were 0.881 and 0.782, respectively, surpassing those of DeepLabV3+ and BiSeNet.
This study enables pixel-level segmentation of bridge damage and offers valuable insights for quantitative segmentation.
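The reported mean pixel accuracy (mPA) and mean intersection over union (mIoU) are standard semantic-segmentation metrics, averaged over classes. A minimal sketch of how they can be computed from flattened ground-truth and predicted class maps (the function and variable names here are illustrative, not taken from the paper):

```python
def segmentation_metrics(y_true, y_pred, num_classes):
    """Compute mean pixel accuracy (mPA) and mean IoU (mIoU) from
    flattened ground-truth and predicted per-pixel class labels."""
    # Per-class confusion counts: conf[t][p] = pixels of true class t predicted as p.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        conf[t][p] += 1

    pa, iou = [], []
    for c in range(num_classes):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                                  # class-c pixels missed
        fp = sum(conf[r][c] for r in range(num_classes)) - tp   # pixels wrongly labelled c
        if tp + fn:                       # skip classes absent from the ground truth
            pa.append(tp / (tp + fn))
        if tp + fp + fn:
            iou.append(tp / (tp + fp + fn))
    return sum(pa) / len(pa), sum(iou) / len(iou)
```

Per-class IoU is TP / (TP + FP + FN); mPA and mIoU are the unweighted means over the classes present in the evaluation set.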
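The paper's calibration device pairs a laser rangefinder with a smartphone camera; its exact optics are not described in the abstract, but under the standard pinhole-camera model the real size of a segmented damage region can be recovered from its pixel extent, the measured stand-off distance, and the camera's focal length and pixel pitch. A hedged sketch under that assumption (all names and parameter values below are illustrative, and the surface is assumed roughly fronto-parallel to the camera):

```python
def pixel_to_physical(pixel_extent, distance_mm, focal_length_mm, pixel_pitch_mm):
    """Convert an extent in pixels to a physical size in mm via the pinhole
    model: size = pixels * distance / focal_length_in_pixels."""
    focal_length_px = focal_length_mm / pixel_pitch_mm  # focal length in pixel units
    return pixel_extent * distance_mm / focal_length_px

def damage_area_mm2(mask_pixel_count, distance_mm, focal_length_mm, pixel_pitch_mm):
    """Area scales with the square of the per-pixel ground resolution."""
    mm_per_px = pixel_to_physical(1, distance_mm, focal_length_mm, pixel_pitch_mm)
    return mask_pixel_count * mm_per_px ** 2
```

For example, with a 4 mm lens, a 2 µm pixel pitch, and a rangefinder reading of 2 m, one pixel spans 1 mm on the surface, so a 100-pixel-long crack maps to roughly 100 mm.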