Side-Scan Sonar Underwater Target Detection: Combining the Diffusion Model With an Improved YOLOv7 Model
Xin Wen, Feihu Zhang, Chensheng Cheng, Xujia Hou, Guang Pan
IEEE Journal of Oceanic Engineering, vol. 49, no. 3, pp. 976-991. Published 2024-03-20. DOI: 10.1109/JOE.2024.3379481
https://ieeexplore.ieee.org/document/10534346/
Journal impact factor: 3.8. JCR: Q1 (Engineering, Civil).
Citations: 0
Abstract
Side-scan sonar (SSS) plays a crucial role in underwater exploration, and autonomous analysis of SSS images is vital for detecting unknown targets in underwater environments. However, achieving high-precision autonomous target recognition in SSS images is challenging because the underwater environment is complex, targets show few highlighted areas, feature details are blurred, and SSS data are difficult to collect. This article addresses the problem by improving the You Only Look Once v7 (YOLOv7) model to achieve high-precision object detection in SSS images. First, we enhance and enlarge real and experimental images using the denoising-diffusion model to establish a self-made SSS image data set, as the SSS images obtained from real experiments contain pictures of the detection targets. Since SSS images have large areas without targets, this article introduces a vision transformer (ViT) for dynamic attention and global modeling, which increases the weight the model assigns to the target region. Second, the convolutional block attention module is adopted to further improve the feature expression ability and reduce floating-point operations. Finally, this article uses Scylla-Intersection over Union (SIoU) as the loss function to increase the accuracy of the model's inference. Experiments on the SSS image data set demonstrate that the improved YOLOv7 model outperforms the other methods, with mean average precision values (mAP0.5 and mAP0.5:0.95) of 78.00% and 48.11%, respectively. These results are 3.47% and 2.9% higher, respectively, than those of the YOLOv7 model. The improved YOLOv7 algorithm proposed in this article has great potential for object detection and recognition in SSS images.
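The abstract does not define its two reported metrics, mAP0.5 and mAP0.5:0.95. Assuming they follow the common detection-benchmark convention, a predicted box counts as a correct localization when its Intersection over Union (IoU) with a ground-truth box reaches a threshold: 0.5 for mAP0.5, and each of the ten thresholds 0.50, 0.55, ..., 0.95 (then averaged) for mAP0.5:0.95. The Python sketch below illustrates only that thresholding step with plain IoU. It is not the paper's Scylla-IoU loss or its evaluation code, and the boxes and the helper name `matched_thresholds` are hypothetical.

```python
from typing import List, Tuple

# A box is (x1, y1, x2, y2): top-left and bottom-right corners.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Plain Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds assumed for the mAP0.5:0.95 convention.
THRESHOLDS: List[float] = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred: Box, truth: Box) -> List[float]:
    """Thresholds at which `pred` would count as a correct localization of `truth`."""
    overlap = iou(pred, truth)
    return [t for t in THRESHOLDS if overlap >= t]


# Hypothetical boxes, not taken from the paper's data set.
truth: Box = (0.0, 0.0, 10.0, 10.0)
pred: Box = (1.0, 1.0, 10.0, 10.0)

overlap = iou(pred, truth)
matched = matched_thresholds(pred, truth)
print(f"IoU = {overlap:.2f}; counted as correct at {len(matched)} of {len(THRESHOLDS)} thresholds")
```

Under this convention a detection that passes at IoU 0.5 can still fail the stricter thresholds, which is consistent with the mAP0.5:0.95 figure in the abstract being much lower than the mAP0.5 figure.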
About the journal:
The IEEE Journal of Oceanic Engineering (ISSN 0364-9059) is the online-only quarterly publication of the IEEE Oceanic Engineering Society (IEEE OES). The scope of the Journal is the field of interest of the IEEE OES, which encompasses all aspects of science, engineering, and technology that address research, development, and operations pertaining to all bodies of water. This includes the creation of new capabilities and technologies from concept design through prototypes, testing, and operational systems to sense, explore, understand, develop, use, and responsibly manage natural resources.