Single-Domain Generalized Object Detection With Frequency Whitening and Contrastive Learning

IF 9.7 · Zone 1 (Computer Science) · JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Multimedia Pub Date : 2025-07-21 DOI:10.1109/TMM.2025.3590915

Xiaolong Guo;Chengxu Liu;Xueming Qian;Zhixiao Wang;Xubin Feng;Yao Xue

IEEE Transactions on Multimedia, vol. 27, pp. 6805-6818. https://ieeexplore.ieee.org/document/11086412/

Citations: 0

Abstract

Single-Domain Generalization Object Detection (Single-DGOD) refers to training a model on only one source domain such that it generalizes to any unseen domain. For instance, a detector trained on a sunny daytime dataset should also perform well in scenarios such as rainy nighttime. The main challenge is to improve the detector’s ability to learn a domain-invariant representation (DIR) while removing domain-specific information. Recent progress in Single-DGOD has demonstrated the efficacy of removing domain-specific information by adjusting feature distributions. Nonetheless, simply adjusting the global feature distribution in the Single-DGOD task is insufficient to learn the potential relationship from sunny to adverse weather, as it ignores the significant domain gaps between instances across different weather conditions. In this paper, we propose a novel object detection method for more robust single-domain generalization. It mainly consists of a frequency-aware selective whitening module (FSW) for removing redundant domain-specific information and a contrastive feature alignment module (CFA) for enhancing domain-invariant information among instances. Specifically, FSW extracts the magnitude spectrum of the feature and uses a group whitening loss to selectively eliminate redundant domain-specific information in the magnitude. To further eliminate domain differences among instances, we apply a style transfer method for data augmentation and use the augmented data in the CFA module. CFA organizes both the original and the augmented RoI features into a series of groups by category, and applies contrastive learning across them to facilitate the learning of DIR in various categories. Experiments show that our method achieves favorable performance on existing standard benchmarks.
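The abstract names the two loss ingredients but gives no formulas, so the following is only a rough sketch under stated assumptions, not the authors' implementation. It assumes that the whitening term penalizes correlation between the magnitude spectra of channels within a group (the "selective" masking is left out), and that the contrastive term behaves like a generic supervised contrastive loss in which RoI features sharing a category label are positives. All function names, `group_size`, `eps`, and `temperature` are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(channel):
    """Flattened 2-D DFT magnitudes of one feature channel (a list of rows)."""
    h, w = len(channel), len(channel[0])
    mags = []
    for u in range(h):
        for v in range(w):
            total = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2.0 * math.pi * (u * y / h + v * x / w)
                    total += channel[y][x] * cmath.exp(1j * angle)
            mags.append(abs(total))
    return mags


def group_whitening_loss(feature, group_size=2, eps=1e-5):
    """Mean |correlation| between magnitude spectra of channels in the same group.

    Assumption: the "selective" masking is omitted; every off-diagonal
    entry inside a group is penalized equally.
    """
    standardized = []
    for channel in feature:
        mags = magnitude_spectrum(channel)
        mean = sum(mags) / len(mags)
        var = sum((m - mean) ** 2 for m in mags) / len(mags)
        standardized.append([(m - mean) / math.sqrt(var + eps) for m in mags])

    total, pairs = 0.0, 0
    for start in range(0, len(standardized), group_size):
        group = standardized[start:start + group_size]
        for i in range(len(group)):
            for j in range(i + 1, len(group)):
                corr = sum(a * b for a, b in zip(group[i], group[j])) / len(group[i])
                total += abs(corr)
                pairs += 1
    return total / pairs if pairs else 0.0


def supervised_contrastive_loss(roi_features, labels, temperature=0.1):
    """Supervised contrastive loss: RoIs sharing a category label are positives."""
    unit = []
    for feat in roi_features:
        norm = math.sqrt(sum(v * v for v in feat)) or 1.0
        unit.append([v / norm for v in feat])

    losses = []
    for i, anchor in enumerate(unit):
        # Temperature-scaled cosine similarity to every other RoI.
        sims = {
            j: sum(a * b for a, b in zip(anchor, other)) / temperature
            for j, other in enumerate(unit)
            if j != i
        }
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-category partner contributes nothing
        log_denom = math.log(sum(math.exp(s) for s in sims.values()))
        losses.append(-sum(sims[j] - log_denom for j in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0


# Toy usage: two identical channels are fully correlated, so the loss is close to 1.
feature = [[[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 4.0]]]
whitening = group_whitening_loss(feature)

# Original and style-augmented RoI features pooled together, with category labels.
rois = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
contrastive = supervised_contrastive_loss(rois, [0, 0, 1, 1])
```

The naive DFT keeps the sketch dependency-free. A real detector would presumably use a framework FFT on batched tensors and combine both terms with the detection loss, with weights the abstract does not specify.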

Source journal

IEEE Transactions on Multimedia (Engineering & Technology: Telecommunications)

CiteScore: 11.70

Self-citation rate: 11.00%

Publications: 576

Review time: 5.5 months

Journal description: The IEEE Transactions on Multimedia covers diverse aspects of multimedia technology and applications, including circuits, networking, signal processing, systems, software, and systems integration. Its scope aligns with the Fields of Interest of the sponsors.