A High-Accuracy YOLOv8-ResAttNet Framework for Maritime Vessel Detection Using Residual Attention
Peixue Liu, Mingze Sun, Xinyue Han, Shu Liu, Yujie Chen, Han Zhang
IET Image Processing, volume 19, issue 1. Published 2025-04-23. DOI: 10.1049/ipr2.70085 (https://onlinelibrary.wiley.com/doi/10.1049/ipr2.70085)
Citations: 0
Abstract
As maritime security requirements tighten and marine environments remain highly dynamic, satellite-based ship detection has become a key technology for national maritime surveillance, resource management, and environmental protection. However, existing methods still struggle with two persistent problems: insufficient sensitivity to small vessels, and false or missed detections in complex ocean backgrounds caused by wave reflections, cloud cover, and lighting changes. To address these limitations, this study proposes YOLOv8-ResAttNet, an enhanced model that integrates residual learning and attention mechanisms into the YOLOv8 framework. The core innovation is a custom-designed backbone network that combines multi-scale feature aggregation with an improved ICBAM attention module to localize ship targets precisely while suppressing irrelevant background noise. The architecture dynamically recalibrates feature channel weights through residual attention blocks, improving the model's ability to distinguish subtle ship features (such as hull contours and superstructures) across maritime scenarios. Extensive experiments on the high-resolution HRSID dataset demonstrate the model's advantage: YOLOv8-ResAttNet reaches a mean average precision at an IoU threshold of 0.5 (mAP50) of 95.2%, which is 4.9% higher than the original YOLOv8 and over 4% higher than state-of-the-art models such as YOLO SENet and YOLO11. These improvements highlight its robustness to scale changes and complex background interference. The results underline the effectiveness of combining residual connections with attention-driven feature refinement for maritime target detection, especially in small-target scenes.
This work advances remote sensing image analysis and provides a scalable framework for real-world applications such as illegal fishing monitoring, maritime traffic management, and disaster response. Future research directions include extending the model to multimodal satellite data fusion, optimizing computational efficiency for edge device deployment, and further bridging the gap between theoretical innovation and maritime surveillance systems.
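The idea of recalibrating channel weights inside a residual block can be illustrated with a small sketch. The abstract does not describe the internals of the improved ICBAM module, so the code below is not the authors' implementation. It assumes a generic squeeze-and-excitation-style channel gate (global average pooling, a two-layer projection, a sigmoid) wrapped in a skip connection. The function names and the hand-picked weights `w1` and `w2` are hypothetical, and plain Python lists stand in for the tensors a deep learning framework would use.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_weights(feature_map, w1, w2):
    """Generic channel gate (an assumption, not the paper's ICBAM):
    global average pool -> ReLU projection -> sigmoid projection."""
    # Squeeze: one scalar per channel, the mean over all spatial positions.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # Excitation: hidden = ReLU(w1 @ pooled), gates = sigmoid(w2 @ hidden).
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w1]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]


def residual_channel_attention(feature_map, w1, w2):
    """Return x + a * x per channel: the gate rescales each channel,
    while the skip path keeps the original signal intact."""
    gates = channel_weights(feature_map, w1, w2)
    return [
        [[v + a * v for v in row] for row in ch]
        for ch, a in zip(feature_map, gates)
    ]


# Toy input: 2 channels on a 2x2 spatial grid.
x = [
    [[1.0, 2.0], [3.0, 4.0]],  # channel 0: strong response
    [[0.5, 0.5], [0.5, 0.5]],  # channel 1: weak, flat response
]
# Hypothetical hand-picked weights; in a trained network these are learned.
w1 = [[1.0, -1.0]]     # 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]   # 1 hidden unit -> 2 channel gates

gates = channel_weights(x, w1, w2)
y = residual_channel_attention(x, w1, w2)
print("gates:", gates)
print("output channel 0:", y[0])
```

With these toy weights the gate for channel 0 is larger than the gate for channel 1, so channel 0 is amplified more. Because of the skip connection, no channel is scaled below its original value, which is one common motivation for pairing attention with residual connections.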
About the journal:
The IET Image Processing journal encompasses research areas related to the generation, processing and communication of visual information. The focus of the journal is the coverage of the latest research results in image and video processing, including image generation and display, enhancement and restoration, segmentation, colour and texture analysis, coding and communication, implementations and architectures as well as innovative applications.
Principal topics include:
Generation and Display - Imaging sensors and acquisition systems, illumination, sampling and scanning, quantization, colour reproduction, image rendering, display and printing systems, evaluation of image quality.
Processing and Analysis - Image enhancement, restoration, segmentation, registration, multispectral, colour and texture processing, multiresolution processing and wavelets, morphological operations, stereoscopic and 3-D processing, motion detection and estimation, video and image sequence processing.
Implementations and Architectures - Image and video processing hardware and software, design and construction, architectures and software, neural, adaptive, and fuzzy processing.
Coding and Transmission - Image and video compression and coding, compression standards, noise modelling, visual information networks, streamed video.
Retrieval and Multimedia - Storage of images and video, database design, image retrieval, video annotation and editing, mixed media incorporating visual information, multimedia systems and applications, image and video watermarking, steganography.
Applications - Innovative application of image and video processing technologies to any field, including life sciences, earth sciences, astronomy, document processing and security.
Current Special Issue Call for Papers:
Evolutionary Computation for Image Processing - https://digital-library.theiet.org/files/IET_IPR_CFP_EC.pdf
AI-Powered 3D Vision - https://digital-library.theiet.org/files/IET_IPR_CFP_AIPV.pdf
Multidisciplinary advancement of Imaging Technologies: From Medical Diagnostics and Genomics to Cognitive Machine Vision, and Artificial Intelligence - https://digital-library.theiet.org/files/IET_IPR_CFP_IST.pdf
Deep Learning for 3D Reconstruction - https://digital-library.theiet.org/files/IET_IPR_CFP_DLR.pdf