Escaping Modal Interactions: An Efficient DESANet for Multi-Modal Object Re-Identification

Authors: Wenjiao Dong; Xi Yang; De Cheng; Nannan Wang; Xinbo Gao
Journal: IEEE Transactions on Image Processing, vol. 34, pp. 5068-5083 (Impact Factor: 13.7)
DOI: 10.1109/TIP.2025.3592575
Published: 2025-07-30
IEEE Xplore: https://ieeexplore.ieee.org/document/11104996/
Citations: 0

Abstract

Multi-modal object Re-ID aims to leverage the complementary information provided by multiple modalities to overcome challenging conditions and achieve high-quality object matching. However, existing multi-modal methods typically rely on various modality-interaction modules for information fusion, which can reduce the efficiency of real-time monitoring systems. Additionally, practical challenges such as low-quality multi-modal data or missing modalities further complicate the application of object Re-ID. To address these issues, we propose the Complementary Data Enhancement and Modal-Aware Soft Alignment Network (DESANet), which is designed to be independent of interactive networks and adaptable to scenarios with missing modalities. This approach enables simple yet effective, and efficient, multi-modal object Re-ID. DESANet consists of three key components. First, the Dual-Color Space Data Enhancement (DCDE) module enhances multi-modal data by performing patch rotation in the RGB space and improving image quality in the HSV space. Second, the Salient Feature ReConstruction (SFRC) module addresses missing modalities by reconstructing the features of one modality from the other two. Third, the Modal-Aware Soft Alignment (MASA) module integrates multi-source data to avoid the blind fusion of features and prevents noise from reconstructed modalities from propagating. Our approach achieves state-of-the-art performance on both person and vehicle datasets. Source code is available at https://github.com/DWJ11/DESANet
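The dual-color-space enhancement described for DCDE can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the patch size, rotation probability, and gamma value are illustrative assumptions, and the per-pixel `colorsys` conversion is used only to keep the example dependency-light (a real pipeline would use a vectorized color conversion).

```python
import colorsys
import numpy as np

def patch_rotate_rgb(img, patch=8, rng=None):
    """RGB-space augmentation: rotate random square patches by 90 degrees.
    Patch size and rotation probability are illustrative, not from the paper."""
    rng = rng or np.random.default_rng(0)
    out = img.copy()
    h, w, _ = img.shape
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            if rng.random() < 0.5:  # rotate roughly half the patches
                out[y:y + patch, x:x + patch] = np.rot90(img[y:y + patch, x:x + patch])
    return out

def hsv_value_gamma(img, gamma=0.8):
    """HSV-space enhancement: a gamma curve (< 1) on the value channel
    brightens dark regions while leaving hue and saturation untouched."""
    flat = img.reshape(-1, 3) / 255.0
    enhanced = np.empty_like(flat)
    for i, (r, g, b) in enumerate(flat):
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        enhanced[i] = colorsys.hsv_to_rgb(h, s, v ** gamma)
    return np.round(enhanced.reshape(img.shape) * 255).astype(np.uint8)

# Apply both enhancements to a toy RGB image.
img = (np.arange(16 * 16 * 3) % 256).astype(np.uint8).reshape(16, 16, 3)
aug = hsv_value_gamma(patch_rotate_rgb(img))
```

In the paper's setting the same transforms would be applied consistently across the RGB, NIR, and TIR inputs so the modalities stay spatially aligned after patch rotation.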