Automated Concrete Bridge Damage Detection Using an Efficient Vision Transformer-Enhanced Anchor-Free YOLO

Impact factor: 11.6 | Region 1 (Engineering Technology) | JCR Q1 (ENGINEERING, MULTIDISCIPLINARY)
Xiaofei Yang, Enrique del Rey Castillo, Yang Zou, Liam Wotherspoon, Jianxi Yang, Hao Li
DOI: 10.1016/j.eng.2025.02.018
Journal: Engineering, Volume 51, Pages 311-326
Publication date: 2025-08-01
Publication type: Journal Article
Source: https://www.sciencedirect.com/science/article/pii/S2095809925001523
Citations: 0

Automated Concrete Bridge Damage Detection Using an Efficient Vision Transformer-Enhanced Anchor-Free YOLO
Deep learning techniques have recently been the most popular method for automatically detecting bridge damage captured by unmanned aerial vehicles (UAVs). However, their wider application to real-world scenarios is hindered by three challenges: ① defect scale variance, motion blur, and strong illumination significantly affect the accuracy and reliability of damage detectors; ② existing commonly used anchor-based damage detectors struggle to effectively generalize to harsh real-world scenarios; and ③ convolutional neural networks (CNNs) lack the capability to model long-range dependencies across the entire image. This paper presents an efficient Vision Transformer-enhanced anchor-free YOLO (you only look once) method to address these challenges. First, a concrete bridge damage dataset was established, augmented by motion blur and varying brightness. Four key enhancements were then applied to an anchor-based YOLO method: ① four detection heads were introduced to alleviate the multi-scale damage detection issue; ② decoupled heads were employed to address the conflict between classification and bounding box regression tasks inherent in the original coupled head design; ③ an anchor-free mechanism was incorporated to reduce the computational complexity and improve generalization to real-world scenarios; and ④ a novel Vision Transformer block, C3MaxViT, was added to enable CNNs to model long-range dependencies. These enhancements were integrated into an advanced anchor-based YOLOv5l algorithm, and the proposed Vision Transformer-enhanced anchor-free YOLO method was then compared against cutting-edge damage detection methods. The experimental results demonstrated the effectiveness of the proposed method, with increases of 8.1% in mean average precision at an intersection over union threshold of 0.5 (mAP50) and 8.4% in mAP@[0.5:.05:.95]. Furthermore, extensive ablation studies revealed that the four detection heads, decoupled head design, anchor-free mechanism, and C3MaxViT contributed improvements of 2.4%, 1.2%, 2.6%, and 1.9% in mAP50, respectively.
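The two metrics quoted in the abstract both rest on intersection over union (IoU) between a predicted box and a ground-truth box: mAP50 counts a prediction as a match when IoU is at least 0.5, while mAP@[0.5:.05:.95] averages over ten thresholds from 0.50 to 0.95. The sketch below is only an illustration of that matching rule, not the authors' evaluation code. The function names, the `(x1, y1, x2, y2)` box format, and the sample boxes are assumptions made for this example.

```python
# Illustrative sketch only: box format (x1, y1, x2, y2) and all names
# here are assumptions for this example, not taken from the paper.

def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (if any).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds behind mAP@[0.5:.05:.95]: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, gt_box):
    """Thresholds at which the prediction counts as a true positive."""
    value = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if value >= t]


if __name__ == "__main__":
    ground_truth = (0, 0, 10, 10)   # hypothetical damage region
    prediction = (2, 0, 12, 10)     # hypothetical detector output
    print(iou(prediction, ground_truth))
    print(matched_thresholds(prediction, ground_truth))
```

A prediction that overlaps loosely with the ground truth can pass the 0.5 threshold yet fail the stricter ones, which is why the averaged metric is the more demanding of the two.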
Source journal
Engineering
Engineering Environmental Science-Environmental Engineering
Self-citation rate: 1.60%
Articles published: 335
Review time: 35 days
About the journal: Engineering, an international open-access journal initiated by the Chinese Academy of Engineering (CAE) in 2015, serves as a distinguished platform for disseminating cutting-edge advancements in engineering R&D, sharing major research outputs, and highlighting key achievements worldwide. The journal's objectives encompass reporting progress in engineering science, fostering discussions on hot topics, addressing areas of interest, challenges, and prospects in engineering development, while considering human and environmental well-being and ethics in engineering. It aims to inspire breakthroughs and innovations with profound economic and social significance, propelling them to advanced international standards and transforming them into a new productive force. Ultimately, this endeavor seeks to bring about positive changes globally, benefit humanity, and shape a new future.