DFLMF-ISTD: Infrared small object detection network based on decoupled feature learning and multi-scale feature fusion

IF 3.1 | Zone 3, Physics and Astrophysics | Q2, INSTRUMENTS & INSTRUMENTATION

Infrared Physics & Technology | Pub Date: 2025-05-09 | DOI: 10.1016/j.infrared.2025.105851

Ning Li, Daozhi Wei, Shucai Huang, Xirui Xue

Volume 149, Article 105851. Full text: https://www.sciencedirect.com/science/article/pii/S1350449525001446

Citations: 0

Abstract

Infrared small object detection is widely used for awareness of small maneuvering objects and for detecting and recognizing high-threat objects. In recent years, deep learning methods have greatly improved detection performance on infrared small objects. However, clutter in infrared small object images (low signal-to-noise ratio, SNR) and the lack of shape and texture information for the objects degrade detection performance in complex environments. To address this, the article proposes an infrared small object detection network based on decoupled feature learning and multi-scale feature fusion. First, drawing on disentangled feature learning, Reversible Column Networks (Revcol) are combined with C3 modules to form RevcolC3, which alleviates the problems of complex feature extraction and the loss of small-scale object information. Second, a new lighted attention spatial pyramid pooling (LASP) module is proposed. It convolves the features extracted from the backbone, applies two consecutive pooling operations, and introduces a large kernel separated attention (LSKA) mechanism that processes spatial and channel information separately. This enhances the model's multi-feature extraction capability while reducing computational complexity. Finally, a novel lightweight three-dimensional multi-scale feature fusion (LTDMF) module is designed to exploit the correlations among the three levels of pyramid feature maps and to extract infrared small object features effectively. This improves the network's detection ability while keeping the model size unchanged. The proposed method is evaluated for feasibility and reliability on the benchmark SIRST and IRSTD-1k datasets. The experimental results indicate that it outperforms current state-of-the-art (SOTA) infrared small object detection techniques in complex environments, at small infrared object scales, and in the absence of discernible texture and shape features.
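The abstract mentions "two consecutive pooling operations" in the LASP module but does not give the pooling type, kernel size, or stride. As a rough illustration only, the sketch below assumes stride-1 max pooling with a 5x5 window, as used in SPPF-style blocks, and shows why chaining pools is attractive: two 5x5 passes cover the same neighborhood as a single 9x9 pass. The names `max_pool_same` and `feature_map` are hypothetical and are not taken from the paper.

```python
from typing import List

Grid = List[List[float]]


def max_pool_same(x: Grid, k: int) -> Grid:
    """Stride-1 max pooling with an odd k x k window.

    Output has the same size as the input; cells outside the map are
    simply ignored at the borders (equivalent to padding with -inf).
    """
    r = k // 2
    h, w = len(x), len(x[0])
    out: Grid = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                x[a][b]
                for a in range(max(0, i - r), min(h, i + r + 1))
                for b in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


# Toy 12x12 "feature map": low-amplitude clutter plus one bright pixel
# standing in for a small infrared target (values are made up).
feature_map: Grid = [
    [((3 * i + 7 * j) % 11) / 10.0 for j in range(12)] for i in range(12)
]
feature_map[6][5] = 5.0

once = max_pool_same(feature_map, 5)    # first pooling pass (assumed 5x5)
twice = max_pool_same(once, 5)          # second consecutive pass
direct = max_pool_same(feature_map, 9)  # single 9x9 pass for comparison

# Under these assumptions the cascade matches the larger window exactly.
print(twice == direct)
```

Under these assumptions, each extra pass widens the receptive field while reusing the previous result, so the small target's response reaches positions that a single 5x5 pass would miss. Whether LASP uses exactly this configuration is not stated in the abstract.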




Source journal: Infrared Physics & Technology (Physics, Optics)

CiteScore: 5.70

Self-citation rate: 12.10%

Articles published: 400

Review time: 67 days

Journal description: The Journal covers the entire field of infrared physics and technology: theory, experiment, application, devices and instrumentation. "Infrared" is defined as covering the near, mid and far infrared (terahertz) regions from 0.75 µm (750 nm) to 1 mm (300 GHz). Submissions in the 300 GHz to 100 GHz region may be accepted at the editors' discretion if their content is relevant to shorter wavelengths. Submissions must be primarily concerned with and directly relevant to this spectral region. Its core topics can be summarized as the generation, propagation and detection of infrared radiation; the associated optics, materials and devices; and its use in all fields of science, industry, engineering and medicine. Infrared techniques occur in many different fields, notably spectroscopy and interferometry; material characterization and processing; atmospheric physics, astronomy and space research. Scientific aspects include lasers, quantum optics, quantum electronics, image processing and semiconductor physics. Some important applications are medical diagnostics and treatment, industrial inspection and environmental monitoring.