LRSNet: a high-efficiency lightweight model for object detection in remote sensing

IF 1.4, Zone 4 (Earth Sciences), JCR Q4 (Environmental Sciences)

Journal of Applied Remote Sensing, Pub Date: 2024-01-01, DOI: 10.1117/1.jrs.18.016502

Shiliang Zhu, Min Miao, Yutong Wang

Citations: 0

Abstract

LRSNet: a high-efficiency lightweight model for object detection in remote sensing

Unmanned aerial vehicles (UAVs) can flexibly conduct aerial remote-sensing imaging. Combined with deep learning object-detection algorithms, they perceive objects efficiently and are widely applied in practical engineering tasks, so UAV-based remote sensing object detection holds considerable research value. However, UAV remote sensing images often have complex backgrounds, and varying shooting angles and heights make target scales and features difficult to unify. Small targets are also numerous and densely distributed, and UAVs face significant hardware resource limitations. Against this background, we propose a lightweight remote sensing object detection network (LRSNet) based on YOLOv5s. In the backbone of LRSNet, the lightweight network MobileNetV3 is used to substantially reduce the model's computational complexity and parameter count. In the neck, a multiscale feature pyramid network named CM-FPN is introduced to enhance the detection of small objects. CM-FPN comprises two key components: C3EGhost, based on GhostNet and efficient channel attention modules, and the multiscale feature fusion channel attention mechanism (MFFC). C3EGhost, CM-FPN's primary feature extraction module, has lower computational complexity and fewer parameters and effectively reduces background interference. MFFC, the feature fusion node of CM-FPN, adaptively weights the fusion of shallow and deep features, acquiring more effective detail and semantic information for object detection. Evaluated on the NWPU VHR-10, DOTA V1.0, and VisDrone-2019 datasets, LRSNet achieved mean average precision of 94.0%, 71.9%, and 35.6%, respectively, with only 5.8 GFLOPs and 4.1 M parameters. This outcome affirms the efficiency of LRSNet in UAV-based remote-sensing object detection tasks.
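The abstract attributes part of the parameter savings to GhostNet-style feature extraction. As a rough illustration of why that kind of module is cheaper, the sketch below compares the weight count of a standard convolution with that of a generic Ghost-style module, in which a primary convolution produces only a fraction of the output maps and cheap depthwise filters generate the rest. This is not the paper's C3EGhost block: the function names, the `ratio` and `cheap_k` parameters, and the layer sizes are assumptions chosen for illustration only.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in: int, c_out: int, k: int,
                 ratio: int = 2, cheap_k: int = 3) -> int:
    """Weights in a generic Ghost-style module (bias ignored).

    A primary k x k convolution produces c_out // ratio "intrinsic" maps;
    each remaining output map comes from one cheap depthwise
    cheap_k x cheap_k filter applied to an intrinsic map.
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio in this sketch")
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k
    cheap = (ratio - 1) * intrinsic * cheap_k * cheap_k
    return primary + cheap


if __name__ == "__main__":
    # Hypothetical layer sizes, not taken from the paper.
    c_in, c_out, k = 128, 256, 3
    std = conv_params(c_in, c_out, k)
    ghost = ghost_params(c_in, c_out, k)
    print(f"standard conv: {std} weights")
    print(f"ghost-style:   {ghost} weights")
    print(f"reduction:     {std / ghost:.2f}x")
```

With `ratio=2`, the Ghost-style module needs roughly half the weights of the standard convolution, because the depthwise filters add only a small number of parameters on top of the halved primary convolution.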

Source journal

Journal of Applied Remote Sensing (Environmental Sciences; Imaging Science & Photographic Technology)

CiteScore: 3.40

Self-citation rate: 11.80%

Articles published: 194

Review time: 3 months

Journal description: The Journal of Applied Remote Sensing is a peer-reviewed journal that optimizes the communication of concepts, information, and progress among the remote sensing community.