Adjustable Visible and Infrared Image Fusion

IF 8.3 · CAS Region 1 (Engineering & Technology) · JCR Q1 (ENGINEERING, ELECTRICAL & ELECTRONIC)
Boxiong Wu;Jiangtao Nie;Wei Wei;Lei Zhang;Yanning Zhang
{"title":"可调可见光和红外图像融合","authors":"Boxiong Wu;Jiangtao Nie;Wei Wei;Lei Zhang;Yanning Zhang","doi":"10.1109/TCSVT.2024.3449638","DOIUrl":null,"url":null,"abstract":"The visible and infrared image fusion (VIF) method aims to utilize the complementary information between these two modalities to synthesize a new image containing richer information. Although it has been extensively studied, the synthesized image that has the best visual results is difficult to reach consensus since users have different opinions. To address this problem, we propose an adjustable VIF framework termed AdjFusion, which introduces a global controlling coefficient into VIF to enforce it can interact with users. Within AdjFusion, a semantic-aware modulation module is proposed to transform the global controlling coefficient into a semantic-aware controlling coefficient, which provides pixel-wise guidance for AdjFusion considering both interactivity and semantic information within visible and infrared images. In addition, the introduced global controlling coefficient not only can be utilized as an external interface for interaction with users but also can be easily customized by the downstream tasks (e.g., VIF-based detection and segmentation), which can help to select the best fusion result for the downstream tasks. Taking advantage of this, we further propose a lightweight adaptation module for AdjFusion to learn the global controlling coefficient to be suitable for the downstream tasks better. Experimental results demonstrate the proposed AdjFusion can 1) provide ways to dynamically synthesize images to meet the diverse demands of users; and 2) outperform the previous state-of-the-art methods on both VIF-based detection and segmentation tasks, with the constructed lightweight adaptation method. Our code will be released after accepted at \n<uri>https://github.com/BearTo2/AdjFusion</uri>\n.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"34 12","pages":"13463-13477"},"PeriodicalIF":8.3000,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Adjustable Visible and Infrared Image Fusion\",\"authors\":\"Boxiong Wu;Jiangtao Nie;Wei Wei;Lei Zhang;Yanning Zhang\",\"doi\":\"10.1109/TCSVT.2024.3449638\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The visible and infrared image fusion (VIF) method aims to utilize the complementary information between these two modalities to synthesize a new image containing richer information. Although it has been extensively studied, the synthesized image that has the best visual results is difficult to reach consensus since users have different opinions. To address this problem, we propose an adjustable VIF framework termed AdjFusion, which introduces a global controlling coefficient into VIF to enforce it can interact with users. Within AdjFusion, a semantic-aware modulation module is proposed to transform the global controlling coefficient into a semantic-aware controlling coefficient, which provides pixel-wise guidance for AdjFusion considering both interactivity and semantic information within visible and infrared images. 
In addition, the introduced global controlling coefficient not only can be utilized as an external interface for interaction with users but also can be easily customized by the downstream tasks (e.g., VIF-based detection and segmentation), which can help to select the best fusion result for the downstream tasks. Taking advantage of this, we further propose a lightweight adaptation module for AdjFusion to learn the global controlling coefficient to be suitable for the downstream tasks better. Experimental results demonstrate the proposed AdjFusion can 1) provide ways to dynamically synthesize images to meet the diverse demands of users; and 2) outperform the previous state-of-the-art methods on both VIF-based detection and segmentation tasks, with the constructed lightweight adaptation method. Our code will be released after accepted at \\n<uri>https://github.com/BearTo2/AdjFusion</uri>\\n.\",\"PeriodicalId\":13082,\"journal\":{\"name\":\"IEEE Transactions on Circuits and Systems for Video Technology\",\"volume\":\"34 12\",\"pages\":\"13463-13477\"},\"PeriodicalIF\":8.3000,\"publicationDate\":\"2024-08-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Circuits and Systems for Video Technology\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10646495/\",\"RegionNum\":1,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Circuits and Systems for Video Technology","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10646495/","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0

Abstract

Visible and infrared image fusion (VIF) aims to exploit the complementary information between the two modalities to synthesize a new image containing richer information. Although VIF has been studied extensively, there is little consensus on which synthesized image is visually best, since users have different preferences. To address this problem, we propose an adjustable VIF framework termed AdjFusion, which introduces a global controlling coefficient into VIF so that the fusion process can interact with users. Within AdjFusion, a semantic-aware modulation module transforms the global controlling coefficient into a semantic-aware controlling coefficient, which provides pixel-wise guidance for AdjFusion by considering both interactivity and the semantic information within the visible and infrared images. In addition, the global controlling coefficient not only serves as an external interface for user interaction but can also be easily customized by downstream tasks (e.g., VIF-based detection and segmentation), helping to select the best fusion result for those tasks. Taking advantage of this, we further propose a lightweight adaptation module for AdjFusion that learns a global controlling coefficient better suited to the downstream tasks. Experimental results demonstrate that AdjFusion can 1) dynamically synthesize images to meet the diverse demands of users; and 2) with the constructed lightweight adaptation method, outperform previous state-of-the-art methods on both VIF-based detection and segmentation tasks. Our code will be released upon acceptance at https://github.com/BearTo2/AdjFusion.
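To make the adjustability concrete, below is a minimal PyTorch sketch of the idea as the abstract describes it: a user-facing scalar coefficient is broadcast to a spatial plane, a modulation module conditions it on features from both modalities to produce a pixel-wise coefficient map, and that map blends the two feature streams before decoding. All module names (SemanticAwareModulation, AdjustableFusion), layer shapes, the [0, 1] coefficient range, and the convex blending rule are illustrative assumptions, not the authors' released implementation.

```python
# Hedged sketch of an adjustable visible/infrared fusion network.
# Everything below is an assumed toy design, not the paper's code.
import torch
import torch.nn as nn


class SemanticAwareModulation(nn.Module):
    """Turns a scalar controlling coefficient into a pixel-wise map,
    conditioned on features from both modalities (assumed design)."""

    def __init__(self, channels: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2 * channels + 1, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),
            nn.Sigmoid(),  # keep the per-pixel coefficient in [0, 1]
        )

    def forward(self, feat_vis, feat_ir, alpha):
        # Broadcast the global coefficient to a spatial plane.
        b, _, h, w = feat_vis.shape
        alpha_plane = alpha.view(b, 1, 1, 1).expand(b, 1, h, w)
        return self.net(torch.cat([feat_vis, feat_ir, alpha_plane], dim=1))


class AdjustableFusion(nn.Module):
    def __init__(self, channels: int = 32):
        super().__init__()
        self.enc_vis = nn.Conv2d(3, channels, 3, padding=1)  # visible encoder (toy)
        self.enc_ir = nn.Conv2d(1, channels, 3, padding=1)   # infrared encoder (toy)
        self.modulation = SemanticAwareModulation(channels)
        self.decoder = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, vis, ir, alpha):
        f_vis, f_ir = self.enc_vis(vis), self.enc_ir(ir)
        m = self.modulation(f_vis, f_ir, alpha)   # pixel-wise coefficient map
        fused = m * f_vis + (1.0 - m) * f_ir      # assumed convex blending rule
        return self.decoder(fused)


if __name__ == "__main__":
    model = AdjustableFusion()
    vis = torch.rand(1, 3, 128, 128)   # visible image
    ir = torch.rand(1, 1, 128, 128)    # infrared image
    for a in (0.2, 0.5, 0.8):          # sweep the user-facing coefficient
        out = model(vis, ir, torch.tensor([a]))
        print(a, out.shape)            # torch.Size([1, 3, 128, 128])
```

Under this sketch, the lightweight adaptation module would amount to replacing the fixed user-supplied alpha with a small learnable parameter or predictor optimized through the downstream detection or segmentation loss, so the task itself selects the fusion setting.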
Source Journal
CiteScore: 13.80
Self-citation rate: 27.40%
Annual publications: 660
Review time: 5 months
Journal Introduction: The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.