YOLO-MRS: An efficient deep learning-based maritime object detection method for unmanned surface vehicles

IF 4.3 · CAS Zone 2 (Engineering Technology) · Q1 ENGINEERING, OCEAN

Applied Ocean Research · Pub Date: 2024-09-28 · DOI: 10.1016/j.apor.2024.104240

Changdong Yu, Haoke Yin, Chenyi Rong, Jiayi Zhao, Xiao Liang, Ruijie Li, Xinrong Mo

Applied Ocean Research, Volume 153, Article 104240 (Journal Article). Full text: https://www.sciencedirect.com/science/article/pii/S0141118724003614

Citations: 0

Abstract

Deep learning-based object detection is an important means of visual perception for unmanned surface vehicles (USVs). However, current methods perform poorly on complex maritime object detection tasks, and few datasets of complex maritime objects are available for USV visual perception systems. To address these problems, we propose an improved maritime object detection method, called YOLO-MRS, based on the lightweight YOLOv8 model. Specifically, we introduce a multi-scale cross-axis attention (MCA) mechanism into the backbone network to establish long-distance dependencies between pixels and thereby capture global feature information. In addition, we add Simplified Spatial Pyramid Pooling-Fast (SimSPPF) to the backbone to enhance prediction accuracy. For computational efficiency, we also replace the ordinary convolutional layers in the backbone and neck networks with refocused convolutional (RefConv) layers to reduce model parameters. Furthermore, we construct a maritime object detection dataset, termed MODD-13, which contains over 9000 precisely annotated images. MODD-13 accounts for object categories (13 types), image diversity, sample independence, and background confusion, and can serve as a benchmark dataset for maritime object detection. Experimental results show that, compared with representative YOLO series models, YOLO-MRS improves mAP50 by 1.8%–7% and mAP50-95 by 1.1%–11.5% while balancing detection accuracy and computational efficiency, thereby achieving fast and accurate detection of maritime objects.
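The abstract reports results as mAP50 and mAP50-95 without spelling out the evaluation protocol. A minimal sketch of what these two metrics usually distinguish, assuming the common COCO-style convention: mAP50 counts a predicted box as a match when its intersection over union (IoU) with a ground-truth box is at least 0.50, while mAP50-95 averages over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The function names and box coordinates below are hypothetical and are not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, gt_box):
    """Return the IoU thresholds at which the prediction counts as a match."""
    value = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if value >= t]


# Hypothetical ground-truth box and a prediction shifted 10 pixels to the right.
gt = (0, 0, 100, 100)
pred = (10, 0, 110, 100)

print(f"IoU = {iou(pred, gt):.3f}")
print("Counts as a match at thresholds:", matched_thresholds(pred, gt))
```

A small localization error like this one still counts fully toward mAP50 but fails the strictest thresholds, which is why mAP50-95 is the more demanding of the two figures.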

Source journal

Applied Ocean Research (Earth Sciences, Engineering: Ocean)

CiteScore: 8.70

Self-citation rate: 7.00%

Articles published: 316

Review time: 59 days

About the journal: The aim of Applied Ocean Research is to encourage the submission of papers that advance the state of knowledge in a range of topics relevant to ocean engineering.