Authors: Yichuan Li, Feng He, Qiran Zhang, Wei Zhang
Journal: Infrared Physics & Technology, Volume 151, Article 106058 (JCR Q2, Instruments & Instrumentation; impact factor 3.4)
Published: 2025-08-11
DOI: 10.1016/j.infrared.2025.106058
URL: https://www.sciencedirect.com/science/article/pii/S1350449525003512
Lightweight Target Omni-Directional Enhancement Network for infrared small target detection
Because small targets in infrared images occupy few pixels and have weak features, detecting them in complex backgrounds remains a highly challenging task. It is worthwhile to explore how prior knowledge can compensate for the limited information inherent in the original images and thereby help deep learning methods learn more effectively. In human visual perception, areas with greater local changes tend to attract more attention. In infrared images, there is a grayscale gradient at the boundary between small targets and the background, but background regions also exhibit grayscale variations. To make better use of grayscale gradient information as prior knowledge, it is therefore necessary to distinguish the gradients around small targets from those in complex background regions. To this end, we propose a Target Omnidirectional Enhancement Network (TODENet). The network first uses a Target Enhancement Module to exploit the inherent prior knowledge of infrared images, amplifying the grayscale gradient at the boundary between small targets and the background while suppressing gradient variations within the background. This reduces clutter interference from complex backgrounds and highlights small targets within the image. Building on this, we construct an Inter-layer Feature Fusion Module based on transposed convolution, which effectively minimizes the loss of high-frequency information of small targets during upsampling. It also makes full use of the semantic information from deep feature maps and the spatial location information from shallow feature maps. Additionally, we develop a Dilated Convolution Module that adjusts the receptive field size to filter out background clutter and then extract fine features of small targets, addressing the loss of small-target features in the deeper layers of the network.
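As a rough illustration of how dilation changes the receptive field, the sketch below implements a plain single-channel dilated convolution in NumPy. It is not the authors' Dilated Convolution Module, whose design the abstract does not specify; the function names `effective_kernel_size` and `dilated_conv2d` are hypothetical, and the example assumes stride 1 and no padding.

```python
import numpy as np


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in pixels) covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv2d(image: np.ndarray, kernel: np.ndarray, dilation: int = 1) -> np.ndarray:
    """'Valid' 2-D cross-correlation with a dilated kernel (stride 1, no padding).

    Illustrative only: single channel, no bias, no learned weights.
    """
    k_h, k_w = kernel.shape
    out_h = image.shape[0] - effective_kernel_size(k_h, dilation) + 1
    out_w = image.shape[1] - effective_kernel_size(k_w, dilation) + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("image is smaller than the dilated kernel span")
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(k_h):
        for j in range(k_w):
            # Each kernel tap reads the image at an offset scaled by the dilation.
            r, c = i * dilation, j * dilation
            out += kernel[i, j] * image[r:r + out_h, c:c + out_w]
    return out


if __name__ == "__main__":
    img = np.ones((7, 7))
    ker = np.ones((3, 3))
    for d in (1, 2):
        print(d, effective_kernel_size(3, d), dilated_conv2d(img, ker, d).shape)
```

With a 3×3 kernel, increasing the dilation from 1 to 2 widens the covered span from 3 to 5 pixels without adding weights, which is the sense in which dilation "adjusts the receptive field size".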
Extensive experiments show that TODENet achieves state-of-the-art performance on the NUAA-SIRST, NUDT-SIRST, and IRSTD-1k datasets, with target-level detection rates (Pd) of 97.710%, 99.649%, and 94.218%, respectively. The source code of our work is available at https://github.com/LYC-1021/TODE-Net.
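For readers unfamiliar with the metric, the following sketch shows one way a target-level detection rate can be computed. The abstract does not define Pd; the sketch assumes a ground-truth target counts as detected when an unused predicted centroid lies within a distance threshold of its centroid. The 3-pixel default, the greedy matching, and the function name `detection_rate` are assumptions for illustration, not the paper's evaluation code.

```python
import math


def detection_rate(gt_centroids, pred_centroids, max_dist=3.0):
    """Fraction of ground-truth targets matched by a prediction within max_dist pixels.

    Centroids are (row, col) tuples. The threshold is an assumed value.
    """
    if not gt_centroids:
        return 0.0
    unused = list(pred_centroids)
    detected = 0
    for gy, gx in gt_centroids:
        # Greedy matching: take the nearest unused prediction, if it is close enough.
        best = min(
            unused,
            key=lambda p: math.hypot(p[0] - gy, p[1] - gx),
            default=None,
        )
        if best is not None and math.hypot(best[0] - gy, best[1] - gx) < max_dist:
            detected += 1
            unused.remove(best)
    return detected / len(gt_centroids)


if __name__ == "__main__":
    gt = [(10, 10), (50, 50), (80, 20)]
    pred = [(11, 10), (49, 51), (0, 0)]
    print(f"Pd = {detection_rate(gt, pred):.3f}")
```

In this toy case two of the three ground-truth targets have a prediction within the threshold, so the rate is 2/3. The reported figures in the abstract come from the authors' own evaluation protocol, which may differ from this sketch.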
About the journal:
The Journal covers the entire field of infrared physics and technology: theory, experiment, application, devices and instrumentation. "Infrared" is defined as covering the near, mid and far infrared (terahertz) regions from 0.75 µm (750 nm) to 1 mm (300 GHz). Submissions in the 300 GHz to 100 GHz region may be accepted at the editors' discretion if their content is relevant to shorter wavelengths. Submissions must be primarily concerned with and directly relevant to this spectral region.
Its core topics can be summarized as the generation, propagation and detection of infrared radiation; the associated optics, materials and devices; and its use in all fields of science, industry, engineering and medicine.
Infrared techniques occur in many different fields, notably spectroscopy and interferometry; material characterization and processing; atmospheric physics, astronomy and space research. Scientific aspects include lasers, quantum optics, quantum electronics, image processing and semiconductor physics. Some important applications are medical diagnostics and treatment, industrial inspection and environmental monitoring.