DFLMF-ISTD: Infrared small object detection network based on decoupled feature learning and multi-scale feature fusion
Authors: Ning Li, Daozhi Wei, Shucai Huang, Xirui Xue
Journal: Infrared Physics & Technology, volume 149, Article 105851 (impact factor 3.1; JCR Q2, Instruments & Instrumentation)
Published: 2025-05-09
DOI: 10.1016/j.infrared.2025.105851
URL: https://www.sciencedirect.com/science/article/pii/S1350449525001446
Citations: 0
Abstract
Infrared small object detection is widely used for awareness of small maneuvering objects and for the detection and recognition of high-threat objects. In recent years, deep learning methods have greatly improved detection performance for infrared small objects. However, clutter in infrared small object images (low signal-to-noise ratio, SNR) and the lack of shape and texture information for the objects degrade detection performance in complex environments. This article therefore proposes an infrared small object detection network based on decoupled feature learning and multi-scale feature fusion. First, drawing on disentangled feature learning, we combine Reversible Column Networks (Revcol) with C3 modules to form RevcolC3, which alleviates the difficulty of complex feature extraction and the loss of small-scale object information. Second, a new lighted attention spatial pyramid pooling (LASP) module is proposed. It convolves the features extracted by the backbone, performs two consecutive pooling operations, and introduces a large kernel separated attention (LSKA) mechanism to process spatial and channel information separately. This enhances the model's multi-feature extraction capability while reducing computational complexity. Finally, a novel lightweight three-dimensional multi-scale feature fusion (LTDMF) module is designed to exploit the correlations between three-level pyramid feature maps and to extract infrared small object features effectively. This improves the network's detection ability while keeping the model size unchanged. The proposed method is evaluated for feasibility and reliability on the benchmark SIRST and IRSTD-1k datasets. The experimental results indicate that it outperforms current state-of-the-art (SOTA) infrared small object detection techniques in complex environments, at small infrared object scales, and in the absence of discernible texture and shape features.
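The abstract does not specify how the two consecutive pooling operations in LASP are configured. As a rough illustration only, the sketch below assumes stride-1 max pooling with a 5x5 window, as used in common fast spatial pyramid pooling designs. That choice, the function names `max_pool2d` and `pyramid_pool`, and the toy 9x9 input are assumptions, not details from the paper. It shows why chaining two small pooling steps yields features at several receptive-field sizes: the second 5x5 pool sees the same region as a single 9x9 pool. The convolution and LSKA steps are not modelled here.

```python
def max_pool2d(x, k):
    """Stride-1 max pooling over a k x k window; out-of-bounds cells are ignored."""
    h, w = len(x), len(x[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            row.append(max(
                x[a][b]
                for a in range(max(0, i - r), min(h, i + r + 1))
                for b in range(max(0, j - r), min(w, j + r + 1))
            ))
        out.append(row)
    return out


def pyramid_pool(x, k=5):
    """Return the input plus two consecutively pooled maps (hypothetical layout)."""
    p1 = max_pool2d(x, k)   # effective window: k x k
    p2 = max_pool2d(p1, k)  # effective window: (2k - 1) x (2k - 1)
    return [x, p1, p2]


# Toy feature map: a single bright pixel standing in for a small infrared target.
feature = [[0.0] * 9 for _ in range(9)]
feature[4][4] = 1.0

levels = pyramid_pool(feature, k=5)
# Number of cells that "see" the target at each level.
coverage = [sum(v == 1.0 for row in level for v in row) for level in levels]
```

In a real network the three maps would typically be concatenated along the channel axis before any attention step; that part is left out because the abstract gives no detail on it.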
About the journal:
The journal covers the entire field of infrared physics and technology: theory, experiment, application, devices and instrumentation. "Infrared" is defined as covering the near, mid and far infrared (terahertz) regions from 0.75 µm (750 nm) to 1 mm (300 GHz). Submissions in the 300 GHz to 100 GHz region may be accepted at the editors' discretion if their content is relevant to shorter wavelengths. Submissions must be primarily concerned with and directly relevant to this spectral region.
Its core topics can be summarized as the generation, propagation and detection of infrared radiation; the associated optics, materials and devices; and its use in all fields of science, industry, engineering and medicine.
Infrared techniques occur in many different fields, notably spectroscopy and interferometry; material characterization and processing; atmospheric physics, astronomy and space research. Scientific aspects include lasers, quantum optics, quantum electronics, image processing and semiconductor physics. Some important applications are medical diagnostics and treatment, industrial inspection and environmental monitoring.
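The spectral limits quoted in the journal scope can be cross-checked with the vacuum relation f = c / λ. The short sketch below is only a convenience for that arithmetic. The helper names (`wavelength_to_ghz`, `ghz_to_wavelength_mm`, `in_core_scope`) are hypothetical, and the scope test simply encodes the 0.75 µm to 1 mm range stated above.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s (exact SI value)


def wavelength_to_ghz(wavelength_m):
    """Convert a vacuum wavelength in metres to a frequency in GHz."""
    return C / wavelength_m / 1e9


def ghz_to_wavelength_mm(freq_ghz):
    """Convert a frequency in GHz to a vacuum wavelength in millimetres."""
    return C / (freq_ghz * 1e9) * 1e3


def in_core_scope(wavelength_m):
    """True if the wavelength lies in the stated 0.75 um to 1 mm range."""
    return 0.75e-6 <= wavelength_m <= 1e-3


long_edge_ghz = wavelength_to_ghz(1e-3)       # about 300 GHz
short_edge_ghz = wavelength_to_ghz(0.75e-6)   # about 400 THz
extended_edge_mm = ghz_to_wavelength_mm(100)  # about 3 mm
```

Under this relation the 1 mm limit corresponds to roughly 300 GHz, matching the stated figure, and the discretionary 100 GHz limit corresponds to a wavelength of roughly 3 mm.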