Huixia Zhang, Xuhui Jiang, Yitong Liu, JinHua Qian, Lixue Ni
Improvement of Dam Crack Detection Algorithm for YOLOv9
IET Image Processing, vol. 19, no. 1. Published 2025-06-08. DOI: 10.1049/ipr2.70124
https://onlinelibrary.wiley.com/doi/10.1049/ipr2.70124
Abstract
Dams, as crucial water conservancy engineering facilities, play a role in safeguarding people's livelihoods and providing economic benefits. However, owing to natural factors and human activities, dams may develop cracks and other potential safety hazards during operation. Crack detection can identify these issues in a timely manner, allowing appropriate repair and reinforcement measures to be taken and thereby preventing catastrophic consequences such as dam breaches under extreme weather or geological conditions. For dam crack detection, this paper presents a method, YOLOv9-LAE, intended to address missed and false detections. Firstly, the large separable kernel attention (LSKA) module is introduced, which emphasises positional information while focusing on channel features. Secondly, the SPPFELAN in YOLOv9 is replaced by the AIFI module, as capturing the key information in the image enables the subsequent modules to detect crack information accurately. Finally, the EIoU is used to calculate the loss, accelerating training convergence and improving the accuracy of crack detection. The results indicate that YOLOv9-LAE achieves a precision of 90.7% and a recall of 75.1%, with mAP@0.5 at 81.5% and mAP@0.5:0.95 at 60.6%. Compared to YOLOv9, the precision has improved by 9.9%, the recall by 2%, mAP@0.5 by 1.5% and mAP@0.5:0.95 by 1.5%.
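The abstract names the EIoU loss but does not give its formula. As a minimal sketch, assuming the commonly published EIoU definition (1 - IoU, plus a centre-distance penalty normalised by the enclosing-box diagonal, plus width and height penalties normalised by the enclosing-box width and height), the loss for a single pair of boxes could be computed as below. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box format and the `eps` stabiliser are illustrative assumptions, not taken from the paper.

```python
def eiou_loss(pred, target, eps=1e-9):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative sketch only: the box format and eps are assumptions.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of the two boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Smallest box enclosing both inputs.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centres, normalised by the
    # squared diagonal of the enclosing box.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    centre_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, each normalised by the
    # corresponding side of the enclosing box.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + centre_term + width_term + height_term
```

Because the width and height terms penalise shape mismatch directly, the loss still provides a gradient signal when the predicted and ground-truth boxes do not overlap, which is one plausible reason for the faster convergence the abstract reports.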
About the journal:
The IET Image Processing journal encompasses research areas related to the generation, processing and communication of visual information. The focus of the journal is the coverage of the latest research results in image and video processing, including image generation and display, enhancement and restoration, segmentation, colour and texture analysis, coding and communication, implementations and architectures as well as innovative applications.
Principal topics include:
Generation and Display - Imaging sensors and acquisition systems, illumination, sampling and scanning, quantization, colour reproduction, image rendering, display and printing systems, evaluation of image quality.
Processing and Analysis - Image enhancement, restoration, segmentation, registration, multispectral, colour and texture processing, multiresolution processing and wavelets, morphological operations, stereoscopic and 3-D processing, motion detection and estimation, video and image sequence processing.
Implementations and Architectures - Image and video processing hardware and software, design and construction, architectures and software, neural, adaptive, and fuzzy processing.
Coding and Transmission - Image and video compression and coding, compression standards, noise modelling, visual information networks, streamed video.
Retrieval and Multimedia - Storage of images and video, database design, image retrieval, video annotation and editing, mixed media incorporating visual information, multimedia systems and applications, image and video watermarking, steganography.
Applications - Innovative application of image and video processing technologies to any field, including life sciences, earth sciences, astronomy, document processing and security.
Current Special Issue Call for Papers:
Evolutionary Computation for Image Processing - https://digital-library.theiet.org/files/IET_IPR_CFP_EC.pdf
AI-Powered 3D Vision - https://digital-library.theiet.org/files/IET_IPR_CFP_AIPV.pdf
Multidisciplinary advancement of Imaging Technologies: From Medical Diagnostics and Genomics to Cognitive Machine Vision, and Artificial Intelligence - https://digital-library.theiet.org/files/IET_IPR_CFP_IST.pdf
Deep Learning for 3D Reconstruction - https://digital-library.theiet.org/files/IET_IPR_CFP_DLR.pdf