Title: SMALNet: Segment Anything Model Aided Lightweight Network for Infrared Image Segmentation
Authors: Kun Ding, Shiming Xiang, Chunhong Pan
Journal: Infrared Physics & Technology (impact factor 3.1; JCR Q2, Instruments & Instrumentation; region 3, Physics and Astrophysics)
Publication date: 2024-08-30
Publication type: Journal Article
Open access: No
DOI: 10.1016/j.infrared.2024.105540
URL: https://www.sciencedirect.com/science/article/pii/S1350449524004249
Source platform: Semanticscholar
Citations: 0
Abstract
Infrared-based visual perception is important for night vision in autonomous vehicles, unmanned aerial vehicles (UAVs), and similar platforms. Semantic segmentation based on deep learning is one of the key techniques for infrared vision-based perception systems. Currently, most advanced methods are based on Transformers, which achieve favorable segmentation accuracy. However, the high complexity of Transformers prevents them from meeting real-time inference-speed requirements in resource-constrained applications. In view of this, we suggest several lightweight designs that significantly reduce computational complexity. To maintain segmentation accuracy, we further introduce a recent large vision model, the Segment Anything Model (SAM), to supply auxiliary supervisory signals during training. Based on these designs, we propose a lightweight segmentation network termed SMALNet (Segment Anything Model Aided Lightweight Network). Compared with the existing state-of-the-art method SegFormer, it reduces FLOPs by 64% while largely maintaining accuracy on two commonly used benchmarks. The proposed SMALNet can be used in various infrared-based vision perception systems with limited hardware resources.
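The abstract does not say how the SAM-derived signals enter training, so the following is only a generic sketch of auxiliary supervision, not the authors' method. It assumes a main per-pixel cross-entropy loss plus a weighted squared-error term against a teacher signal. The names `training_loss`, `teacher_feats`, and `aux_weight`, the squared-error form, and the toy numbers are all hypothetical.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class for one pixel."""
    return -math.log(probs[label])


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def training_loss(pixel_probs, labels, student_feats, teacher_feats, aux_weight=0.5):
    """Main segmentation loss plus a weighted auxiliary term.

    teacher_feats stands in for whatever signal a frozen large model provides.
    The squared-error form and aux_weight are assumptions, not from the paper.
    """
    main = mean(cross_entropy(p, y) for p, y in zip(pixel_probs, labels))
    aux = mean((s - t) ** 2 for s, t in zip(student_feats, teacher_feats))
    return main + aux_weight * aux, main, aux


# Toy data (hypothetical): two pixels, two classes, one scalar feature per pixel.
pixel_probs = [[0.9, 0.1], [0.2, 0.8]]
labels = [0, 1]
student_feats = [0.5, 1.0]
teacher_feats = [0.0, 1.0]

total, main, aux = training_loss(pixel_probs, labels, student_feats, teacher_feats)
print(f"main={main:.4f} aux={aux:.4f} total={total:.4f}")
```

In a scheme like this the auxiliary term appears only in the training objective, which is consistent with the abstract's description of SAM supplying signals "while training models": the teacher would not need to run at inference time.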
About the journal:
The Journal covers the entire field of infrared physics and technology: theory, experiment, application, devices and instrumentation. "Infrared" is defined as covering the near, mid and far infrared (terahertz) regions from 0.75 µm (750 nm) to 1 mm (300 GHz). Submissions in the 300 GHz to 100 GHz region may be accepted at the editors' discretion if their content is relevant to shorter wavelengths. Submissions must be primarily concerned with and directly relevant to this spectral region.
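The scope statement mixes wavelength and frequency units. As a quick check, the two are related by f = c / λ. The short sketch below converts the quoted limits; the function names are illustrative only.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s (exact by SI definition)


def wavelength_to_frequency_hz(wavelength_m):
    """Frequency in Hz of radiation with the given vacuum wavelength in metres."""
    return C / wavelength_m


def frequency_to_wavelength_m(frequency_hz):
    """Vacuum wavelength in metres of radiation with the given frequency in Hz."""
    return C / frequency_hz


# Limits of the spectral region quoted in the scope statement.
for label, wavelength_m in [("0.75 um", 0.75e-6), ("1 mm", 1e-3)]:
    ghz = wavelength_to_frequency_hz(wavelength_m) / 1e9
    print(f"{label}: {ghz:.1f} GHz")

# Lower edge of the discretionary 300 GHz to 100 GHz band.
print(f"100 GHz: {frequency_to_wavelength_m(100e9) * 1e3:.2f} mm")
```

A 1 mm wavelength corresponds to roughly 300 GHz, matching the stated limit, and the discretionary band down to 100 GHz extends the range to wavelengths of about 3 mm.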
Its core topics can be summarized as the generation, propagation and detection of infrared radiation; the associated optics, materials and devices; and its use in all fields of science, industry, engineering and medicine.
Infrared techniques occur in many different fields, notably spectroscopy and interferometry; material characterization and processing; atmospheric physics, astronomy and space research. Scientific aspects include lasers, quantum optics, quantum electronics, image processing and semiconductor physics. Some important applications are medical diagnostics and treatment, industrial inspection and environmental monitoring.