Yangjun Pi , Lingchuan Kong , Bo Yang , Rui Chang , Huayan Pu , Mingliang Zhou , Jun Luo
Infrared Physics & Technology, vol. 151, Article 106061. Published 2025-08-11. DOI: 10.1016/j.infrared.2025.106061. https://www.sciencedirect.com/science/article/pii/S1350449525003548
IRTransUNet: Efficient transformer embedding UNet for infrared small target detection
Infrared small target detection is of critical importance in the field of security. However, the inherent weak features and low signal-to-noise ratio of such targets make it particularly difficult to detect them effectively in cluttered and complex backgrounds. To address this issue, this paper proposes IRTransUNet, which integrates local and global information to more thoroughly exploit the differences between the target and the background, thereby achieving more effective discrimination. First, we design a robust feature extractor (RFE), a lightweight and efficient module that leverages a larger contextual receptive field to extract more discriminative fine-grained features. Next, we introduce the IRconvformer module, which focuses on capturing global dependencies and modeling the relationship between the target and background. Specifically, we enhance the target boundary features within tokens using atrous spatial embedding (ASE) and replace the self-attention mechanism with multi-slice linear attention (MSLA), allowing for more efficient global modeling and focused target feature extraction. Additionally, we incorporate a convolutional gated feedforward network (CGFN) to improve the feedforward network, adjusting the information flow between neighboring pixels, thus maintaining the model’s ability to perceive local features. Finally, extensive experiments on four widely used datasets demonstrate that IRTransUNet achieves state-of-the-art performance in infrared small target detection. The code will be publicly available at https://github.com/LingchuanK/IRTransUnet.
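The abstract says that MSLA replaces self-attention with a linear attention variant, but it does not describe how MSLA is computed. As a rough illustration of the general idea only, the sketch below shows generic kernel-based linear attention, not the paper's MSLA. The feature map elu(x) + 1 and the function names `feature_map`, `linear_attention` and `quadratic_attention` are assumptions made for this example. The point is that aggregating keys and values once avoids building the N x N attention matrix, while giving the same result as the explicit form with the same kernel.

```python
import math


def feature_map(x):
    """elu(x) + 1: a positive feature map (an assumption for this sketch)."""
    return [[v + 1.0 if v > 0 else math.exp(v) for v in row] for row in x]


def linear_attention(q, k, v):
    """Kernel attention without the N x N matrix.

    q, k: lists of N tokens with d features; v: list of N tokens with d_v features.
    Cost grows linearly with the number of tokens N.
    """
    q, k = feature_map(q), feature_map(k)
    d, d_v = len(k[0]), len(v[0])
    # kv[a][b] = sum over tokens j of k[j][a] * v[j][b]  (a d x d_v summary)
    kv = [[sum(kj[a] * vj[b] for kj, vj in zip(k, v)) for b in range(d_v)]
          for a in range(d)]
    # k_sum[a] = sum over tokens j of k[j][a]  (used for normalization)
    k_sum = [sum(kj[a] for kj in k) for a in range(d)]
    out = []
    for qi in q:
        norm = sum(qi[a] * k_sum[a] for a in range(d))
        out.append([sum(qi[a] * kv[a][b] for a in range(d)) / norm
                    for b in range(d_v)])
    return out


def quadratic_attention(q, k, v):
    """Reference version: same kernel, explicit per-query weights over all keys."""
    q, k = feature_map(q), feature_map(k)
    d_v = len(v[0])
    out = []
    for qi in q:
        weights = [sum(a * b for a, b in zip(qi, kj)) for kj in k]
        total = sum(weights)
        out.append([sum(w * vj[b] for w, vj in zip(weights, v)) / total
                    for b in range(d_v)])
    return out


# Tiny usage example with 3 tokens, 2 query/key features and 3 value features.
q = [[0.1, -0.2], [0.5, 0.3], [-1.0, 0.7]]
k = [[0.4, 0.0], [-0.3, 0.9], [0.2, -0.6]]
v = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0], [3.0, 3.0, 3.0]]
fast = linear_attention(q, k, v)
slow = quadratic_attention(q, k, v)
```

Both functions return the same values up to floating-point rounding. Only the order of the multiplications differs, which is what makes the first form cheaper when the number of tokens is large.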
About the journal:
The journal covers the entire field of infrared physics and technology: theory, experiment, application, devices and instrumentation. "Infrared" is defined as covering the near, mid and far infrared (terahertz) regions from 0.75 µm (750 nm) to 1 mm (300 GHz). Submissions in the 300 GHz to 100 GHz region may be accepted at the editors' discretion if their content is relevant to shorter wavelengths. Submissions must be primarily concerned with and directly relevant to this spectral region.
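The scope statement pairs wavelengths with frequencies, and the two are related by f = c / λ. As a quick check of the quoted limits, here is a minimal sketch; the function names `wavelength_to_frequency_hz` and `frequency_to_wavelength_m` are made up for this illustration.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s (exact by SI definition)


def wavelength_to_frequency_hz(wavelength_m):
    """Frequency in Hz for a vacuum wavelength given in metres."""
    return C / wavelength_m


def frequency_to_wavelength_m(frequency_hz):
    """Vacuum wavelength in metres for a frequency given in Hz."""
    return C / frequency_hz


# Long-wavelength limit: 1 mm corresponds to roughly 300 GHz.
print(f"{wavelength_to_frequency_hz(1e-3) / 1e9:.1f} GHz")     # 299.8 GHz
# Short-wavelength limit: 0.75 um corresponds to roughly 400 THz.
print(f"{wavelength_to_frequency_hz(0.75e-6) / 1e12:.1f} THz")  # 399.7 THz
# The discretionary 100 GHz edge corresponds to a wavelength of about 3 mm.
print(f"{frequency_to_wavelength_m(100e9) * 1e3:.3f} mm")       # 2.998 mm
```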
Its core topics can be summarized as the generation, propagation and detection of infrared radiation; the associated optics, materials and devices; and its use in all fields of science, industry, engineering and medicine.
Infrared techniques occur in many different fields, notably spectroscopy and interferometry; material characterization and processing; atmospheric physics, astronomy and space research. Scientific aspects include lasers, quantum optics, quantum electronics, image processing and semiconductor physics. Some important applications are medical diagnostics and treatment, industrial inspection and environmental monitoring.