A High-Accuracy YOLOv8-ResAttNet Framework for Maritime Vessel Detection Using Residual Attention

Impact factor 2.0; Zone 4 (Computer Science); JCR Q3 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
Peixue Liu, Mingze Sun, Xinyue Han, Shu Liu, Yujie Chen, Han Zhang
IET Image Processing, volume 19, issue 1. Published 2025-04-23. DOI: 10.1049/ipr2.70085
https://onlinelibrary.wiley.com/doi/10.1049/ipr2.70085
Citations: 0

Abstract

Against a backdrop of rising maritime security requirements and dynamic marine environments, satellite-based ship detection has become a key technology for national maritime surveillance, resource management, and environmental protection. However, existing methods still struggle with persistent challenges, including insufficient sensitivity to small vessels and false or missed detections against complex ocean backgrounds caused by wave reflections, cloud cover, and lighting changes. To address these limitations, this study proposes YOLOv8-ResAttNet, an enhanced model that integrates residual learning and attention mechanisms into the YOLOv8 framework. The core innovation is a custom-designed backbone network that combines multi-scale feature aggregation with an improved ICBAM attention module to localize ship targets precisely while suppressing irrelevant background noise. The architecture dynamically recalibrates feature channel weights through residual attention blocks, enhancing the model's ability to distinguish subtle ship features (such as hull contours and superstructures) across maritime scenarios. Extensive experiments on the high-resolution HRSID dataset demonstrate the model's superiority: YOLOv8-ResAttNet achieves a mean average precision (mAP50) of 95.2%, which is 4.9% higher than the original YOLOv8 and over 4% higher than state-of-the-art models such as YOLO SENet and YOLO11. These improvements highlight its robustness to scale changes and complex background interference. The results underscore the effectiveness of combining residual connections with attention-driven feature refinement for maritime target detection, especially in small-target scenes.
This work not only advances the technological frontier of remote sensing image analysis but also provides a scalable framework for real-world applications such as illegal fishing monitoring, maritime traffic management, and disaster response. Future research directions include extending the model to multimodal satellite data fusion, optimizing computational efficiency for edge-device deployment, and further bridging the gap between theoretical innovation and maritime surveillance systems.
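
The abstract's idea of recalibrating channel weights through a residual attention block can be illustrated with a minimal sketch. This is not the paper's ICBAM module, whose details are not given here. It assumes the simplest possible gate: each channel's weight is the sigmoid of its global average, with no learned parameters, and the output is the input plus the gated input. The function names `channel_gates` and `residual_channel_attention` are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def channel_gates(feature_map):
    """Compute one gate per channel from its global average.

    feature_map is a list of channels; each channel is a 2D list (H x W).
    Assumption: a real attention module would pass the pooled values
    through learned layers; here the sigmoid is applied directly.
    """
    gates = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        gates.append(sigmoid(sum(values) / len(values)))
    return gates


def residual_channel_attention(feature_map):
    """Return x + gate * x per channel: identity path plus attended path."""
    gates = channel_gates(feature_map)
    return [
        [[v + g * v for v in row] for row in channel]
        for channel, g in zip(feature_map, gates)
    ]


# Two channels of size 2 x 2: the first averages to 0, the second to 2.
features = [
    [[1.0, -1.0], [1.0, -1.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
output = residual_channel_attention(features)
```

Because of the identity path, a channel with a low gate is attenuated relative to others but never zeroed out, which is the usual motivation for placing attention inside a residual connection.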


Source journal
IET Image Processing
Category: Engineering & Technology - Engineering: Electrical & Electronic
CiteScore: 5.40
Self-citation rate: 8.70%
Articles published: 282
Review time: 6 months
Journal description: The IET Image Processing journal encompasses research areas related to the generation, processing and communication of visual information. The focus of the journal is the coverage of the latest research results in image and video processing, including image generation and display, enhancement and restoration, segmentation, colour and texture analysis, coding and communication, implementations and architectures as well as innovative applications. Principal topics include:
Generation and Display - Imaging sensors and acquisition systems, illumination, sampling and scanning, quantization, colour reproduction, image rendering, display and printing systems, evaluation of image quality.
Processing and Analysis - Image enhancement, restoration, segmentation, registration, multispectral, colour and texture processing, multiresolution processing and wavelets, morphological operations, stereoscopic and 3-D processing, motion detection and estimation, video and image sequence processing.
Implementations and Architectures - Image and video processing hardware and software, design and construction, architectures and software, neural, adaptive, and fuzzy processing.
Coding and Transmission - Image and video compression and coding, compression standards, noise modelling, visual information networks, streamed video.
Retrieval and Multimedia - Storage of images and video, database design, image retrieval, video annotation and editing, mixed media incorporating visual information, multimedia systems and applications, image and video watermarking, steganography.
Applications - Innovative application of image and video processing technologies to any field, including life sciences, earth sciences, astronomy, document processing and security.
Current Special Issue Call for Papers:
Evolutionary Computation for Image Processing - https://digital-library.theiet.org/files/IET_IPR_CFP_EC.pdf
AI-Powered 3D Vision - https://digital-library.theiet.org/files/IET_IPR_CFP_AIPV.pdf
Multidisciplinary advancement of Imaging Technologies: From Medical Diagnostics and Genomics to Cognitive Machine Vision, and Artificial Intelligence - https://digital-library.theiet.org/files/IET_IPR_CFP_IST.pdf
Deep Learning for 3D Reconstruction - https://digital-library.theiet.org/files/IET_IPR_CFP_DLR.pdf