改进多光谱目标检测的早期融合策略再思考

IF 14.3 1区工程技术 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Intelligent Vehicles Pub Date : 2024-09-17 DOI:10.1109/TIV.2024.3462488

Xue Zhang;Si-Yuan Cao;Fang Wang;Runmin Zhang;Zhe Wu;Xiaohan Zhang;Xiaokai Bai;Hui-Liang Shen

{"title":"改进多光谱目标检测的早期融合策略再思考","authors":"Xue Zhang;Si-Yuan Cao;Fang Wang;Runmin Zhang;Zhe Wu;Xiaohan Zhang;Xiaokai Bai;Hui-Liang Shen","doi":"10.1109/TIV.2024.3462488","DOIUrl":null,"url":null,"abstract":"Most recent multispectral object detectors employ a two-branch structure to extract features from RGB and thermal images. While the two-branch structure achieves better performance than a single-branch structure, it overlooks inference efficiency. This conflict is increasingly aggressive, as recent works solely pursue higher performance rather than both performance and efficiency. In this paper, we address this issue by improving the performance of efficient single-branch structures. We revisit the reasons causing the performance gap between these structures. For the first time, we reveal the information interference problem in the naive early-fusion strategy adopted by previous single-branch structures. Besides, we find that the domain gap between multispectral images, and weak feature representation of the single-branch structure are also key obstacles for performance. Focusing on these three problems, we propose corresponding solutions, including a novel shape-priority early-fusion strategy, a weakly supervised learning method, and a core knowledge distillation technique. Experiments demonstrate that single-branch networks equipped with these three contributions achieve significant performance enhancements while retaining high efficiency.","PeriodicalId":36532,"journal":{"name":"IEEE Transactions on Intelligent Vehicles","volume":"10 6","pages":"3728-3742"},"PeriodicalIF":14.3000,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Rethinking Early-Fusion Strategies for Improved Multispectral Object Detection\",\"authors\":\"Xue Zhang;Si-Yuan Cao;Fang Wang;Runmin Zhang;Zhe Wu;Xiaohan Zhang;Xiaokai Bai;Hui-Liang Shen\",\"doi\":\"10.1109/TIV.2024.3462488\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Most recent multispectral object detectors employ a two-branch structure to extract features from RGB and thermal images. While the two-branch structure achieves better performance than a single-branch structure, it overlooks inference efficiency. This conflict is increasingly aggressive, as recent works solely pursue higher performance rather than both performance and efficiency. In this paper, we address this issue by improving the performance of efficient single-branch structures. We revisit the reasons causing the performance gap between these structures. For the first time, we reveal the information interference problem in the naive early-fusion strategy adopted by previous single-branch structures. Besides, we find that the domain gap between multispectral images, and weak feature representation of the single-branch structure are also key obstacles for performance. Focusing on these three problems, we propose corresponding solutions, including a novel shape-priority early-fusion strategy, a weakly supervised learning method, and a core knowledge distillation technique. Experiments demonstrate that single-branch networks equipped with these three contributions achieve significant performance enhancements while retaining high efficiency.\",\"PeriodicalId\":36532,\"journal\":{\"name\":\"IEEE Transactions on Intelligent Vehicles\",\"volume\":\"10 6\",\"pages\":\"3728-3742\"},\"PeriodicalIF\":14.3000,\"publicationDate\":\"2024-09-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Intelligent Vehicles\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10681477/\",\"RegionNum\":1,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Intelligent Vehicles","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10681477/","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

最近的多光谱目标探测器采用双分支结构从RGB和热图像中提取特征。虽然双分支结构比单分支结构具有更好的性能，但忽略了推理效率。这种冲突越来越激烈，因为最近的作品只追求更高的性能，而不是性能和效率。在本文中，我们通过提高高效的单分支结构的性能来解决这个问题。我们将重新审视导致这些结构之间性能差距的原因。首次揭示了以往单分支结构采用的朴素早期融合策略中存在的信息干扰问题。此外，我们发现多光谱图像之间的域间隙和单分支结构的弱特征表示也是影响性能的关键障碍。针对这三个问题，我们提出了相应的解决方案，包括一种新的形状优先级早期融合策略、一种弱监督学习方法和一种核心知识蒸馏技术。实验表明，配备这三种贡献的单分支网络在保持高效率的同时获得了显着的性能增强。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Rethinking Early-Fusion Strategies for Improved Multispectral Object Detection

Most recent multispectral object detectors employ a two-branch structure to extract features from RGB and thermal images. While the two-branch structure achieves better performance than a single-branch structure, it overlooks inference efficiency. This conflict is increasingly aggressive, as recent works solely pursue higher performance rather than both performance and efficiency. In this paper, we address this issue by improving the performance of efficient single-branch structures. We revisit the reasons causing the performance gap between these structures. For the first time, we reveal the information interference problem in the naive early-fusion strategy adopted by previous single-branch structures. Besides, we find that the domain gap between multispectral images, and weak feature representation of the single-branch structure are also key obstacles for performance. Focusing on these three problems, we propose corresponding solutions, including a novel shape-priority early-fusion strategy, a weakly supervised learning method, and a core knowledge distillation technique. Experiments demonstrate that single-branch networks equipped with these three contributions achieve significant performance enhancements while retaining high efficiency.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Transactions on Intelligent Vehicles Mathematics-Control and Optimization

CiteScore

12.10

自引率

13.40%

发文量

177

期刊介绍： The IEEE Transactions on Intelligent Vehicles (T-IV) is a premier platform for publishing peer-reviewed articles that present innovative research concepts, application results, significant theoretical findings, and application case studies in the field of intelligent vehicles. With a particular emphasis on automated vehicles within roadway environments, T-IV aims to raise awareness of pressing research and application challenges. Our focus is on providing critical information to the intelligent vehicle community, serving as a dissemination vehicle for IEEE ITS Society members and others interested in learning about the state-of-the-art developments and progress in research and applications related to intelligent vehicles. Join us in advancing knowledge and innovation in this dynamic field.