Receptive field enhancement and attention feature fusion network for underwater object detection

IF 1; Zone 4 (Computer Science); JCR Q4 (ENGINEERING, ELECTRICAL & ELECTRONIC)

Journal of Electronic Imaging Pub Date : 2024-05-01 DOI:10.1117/1.jei.33.3.033007

Huipu Xu, Zegang He, Shuo Chen

Citations: 0

Abstract

Underwater scenes suffer from unclear imaging and complex backgrounds, so mainstream object detection models perform poorly when applied to them directly. To improve the accuracy of underwater object detection, we propose an object detection model, RF-YOLO, which uses a receptive field enhancement (RFE) module in the backbone network to enlarge the receptive field and extract more effective features. We design a free-channel iterative attention feature fusion module to reconstruct the neck network and fuse feature layers of different scales, achieving cross-channel attention feature fusion. We use Scylla-intersection over union (SIoU) as the loss function of the model; its angle cost, distance cost, shape cost, and IoU cost guide the model toward the optimum during training. Because the added modules increase the number of network parameters and make it harder for the model to converge to the optimal state, we also propose a training method that better exploits the performance of the detection network. Experiments show that the proposed RF-YOLO achieves a mean average precision of 87.56% and 86.39% on the URPC2019 and URPC2020 datasets, respectively. Comparative and ablation experiments confirm that the proposed network achieves higher detection accuracy in complex underwater environments.
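The four SIoU cost terms named in the abstract can be illustrated with a small sketch. The abstract gives no formulas, so the code below assumes the commonly cited SIoU formulation (loss = 1 - IoU + (distance cost + shape cost) / 2, with the angle cost modulating the distance cost). The function name `siou_loss`, the `(x1, y1, x2, y2)` box layout, and the default `theta=4.0` are illustrative assumptions, not details taken from the paper.

```python
import math


def siou_loss(pred, target, theta=4.0, eps=1e-7):
    """SIoU-style loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed formulation (not taken from the paper):
    loss = 1 - IoU + (distance_cost + shape_cost) / 2.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # IoU cost: intersection over union of the two boxes.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    iou = inter / (pw * ph + tw * th - inter + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Offset between the two box centers.
    dx = (tx1 + tx2) / 2 - (px1 + px2) / 2
    dy = (ty1 + ty2) / 2 - (py1 + py2) / 2
    sigma = math.hypot(dx, dy) + eps

    # Angle cost: 0 when the centers are aligned with an axis,
    # 1 when the offset lies at 45 degrees.
    sin_alpha = min(abs(dy) / sigma, 1.0)
    angle_cost = 1 - 2 * math.sin(math.asin(sin_alpha) - math.pi / 4) ** 2

    # Distance cost: center offsets normalized by the enclosing box,
    # with gamma derived from the angle cost.
    gamma = 2 - angle_cost
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    distance_cost = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

    # Shape cost: relative mismatch in width and height.
    omega_w = abs(pw - tw) / (max(pw, tw) + eps)
    omega_h = abs(ph - th) / (max(ph, th) + eps)
    shape_cost = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1 - iou + (distance_cost + shape_cost) / 2
```

In this sketch, identical boxes give a loss close to zero, while non-overlapping boxes give a loss above 1 because the IoU term vanishes and the distance cost is positive. A detector would apply the same arithmetic to batched tensors rather than to Python tuples.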

Source journal

Journal of Electronic Imaging (Engineering & Technology: Imaging Science & Photographic Technology)

CiteScore

1.70

Self-citation rate

27.30%

Publication count

341

Review time

4.0 months

About the journal: The Journal of Electronic Imaging publishes peer-reviewed papers in all technology areas that make up the field of electronic imaging and are normally considered in the design, engineering, and applications of electronic imaging systems.