Segment anything model edge prior-guided infrared and visible image fusion method
Jianhua Yuan, Zhixian Tan, Kaixiang Xu, Zihan Wang, Jianan Zhang
Infrared Physics & Technology, Volume 150, Article 106013 (Q2, Instruments & Instrumentation)
DOI: 10.1016/j.infrared.2025.106013
Published: 2025-07-22
URL: https://www.sciencedirect.com/science/article/pii/S1350449525003068
Citations: 0
Abstract
Hindered by the low robustness of visible imaging and the low resolution of infrared imaging, visible and infrared images captured in real scenarios often suffer from severe detail loss and indistinct edges. Modeling of fine-grained edges is therefore crucial for generating high-quality fused images. However, recent deep learning-based fusion algorithms are limited in their ability to exploit highly robust structural priors; in other words, their edge perception is weak, resulting in edge artifacts and low clarity. To address these issues, we propose a novel fusion framework guided by Segment Anything Model (SAM) edge priors. SAM boasts outstanding zero-shot generalization capabilities, enabling it to extract high-quality edge priors from target scenes even under non-ideal imaging conditions such as low light and dense noise. We propose an image content-edge fusion block (CEFB), which progressively injects edge information from the source images into the image content features to enhance their edge representation ability. Additionally, considering the significance of inter-modality interaction, we also introduce an image translation network that achieves mutual translation between the infrared and visible modalities. The SAM edge priors extracted from the translated images, which have unchanged content but altered style, are then embedded into the post-interaction content features through the proposed CEFB, strengthening the representation of modality-invariant edge structures. Extensive experiments on two public datasets demonstrate that our method generates fused images with more distinct edges and enhanced target information, and that it exhibits strong generalization across diverse scenarios.
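To make the core idea concrete, the sketch below illustrates edge-prior-guided fusion in its simplest form. It is not the paper's CEFB or network: plain gradient magnitude stands in for SAM edge priors, and a pixel-wise weighted average stands in for the learned fusion, with both `alpha` and `beta` being hypothetical mixing parameters. The point is only to show how an edge prior can steer fusion weights toward the modality with the stronger local structure.

```python
# Illustrative sketch only (assumed toy stand-in, not the paper's method):
# gradient magnitude acts as an "edge prior", and fusion weights are
# boosted toward whichever modality has the stronger local edge response.

def edge_map(img):
    """Per-pixel gradient magnitude via forward differences (toy edge prior)."""
    h, w = len(img), len(img[0])
    edges = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][x]
            gy = img[min(y + 1, h - 1)][x] - img[y][x]
            edges[y][x] = (gx * gx + gy * gy) ** 0.5
    return edges

def fuse(ir, vis, alpha=0.5, beta=0.5):
    """Pixel-wise fusion: a base average plus an edge-driven term that
    leans toward the modality with the stronger edge response there."""
    e_ir, e_vis = edge_map(ir), edge_map(vis)
    h, w = len(ir), len(ir[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = e_ir[y][x] + e_vis[y][x]
            w_ir = e_ir[y][x] / total if total > 0 else 0.5
            out[y][x] = (alpha * (ir[y][x] + vis[y][x]) / 2
                         + beta * (w_ir * ir[y][x] + (1 - w_ir) * vis[y][x]))
    return out
```

In the paper, the learned CEFB injects the edge prior into deep content features rather than raw pixels, but the intuition is the same: pixels near strong edges should be dominated by the modality that actually carries that structure.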
Journal description:
The Journal covers the entire field of infrared physics and technology: theory, experiment, application, devices and instrumentation. "Infrared" is defined as covering the near, mid and far infrared (terahertz) regions, from 0.75 µm (750 nm) to 1 mm (300 GHz). Submissions in the 300 GHz to 100 GHz region may be accepted at the editors' discretion if their content is relevant to shorter wavelengths. Submissions must be primarily concerned with and directly relevant to this spectral region.
Its core topics can be summarized as the generation, propagation, and detection of infrared radiation; the associated optics, materials and devices; and its use in all fields of science, industry, engineering and medicine.
Infrared techniques occur in many different fields, notably spectroscopy and interferometry; material characterization and processing; atmospheric physics, astronomy and space research. Scientific aspects include lasers, quantum optics, quantum electronics, image processing and semiconductor physics. Some important applications are medical diagnostics and treatment, industrial inspection and environmental monitoring.