Single-Domain Generalized Object Detection With Frequency Whitening and Contrastive Learning
Authors: Xiaolong Guo; Chengxu Liu; Xueming Qian; Zhixiao Wang; Xubin Feng; Yao Xue
Journal: IEEE Transactions on Multimedia, vol. 27, pp. 6805-6818 (Journal Article), published 2025-07-21
Impact factor: 9.7; JCR: Q1 (Computer Science, Information Systems)
DOI: 10.1109/TMM.2025.3590915
URL: https://ieeexplore.ieee.org/document/11086412/
Single-Domain Generalization Object Detection (Single-DGOD) refers to training a model on only one source domain so that it generalizes to any unseen domain. For instance, a detector trained on a sunny daytime dataset should also perform well in scenarios such as rainy nighttime. The main challenge is to improve the detector’s ability to learn the domain-invariant representation (DIR) while removing domain-specific information. Recent progress in Single-DGOD has demonstrated the efficacy of removing domain-specific information by adjusting feature distributions. Nonetheless, simply adjusting the global feature distribution in the Single-DGOD task is insufficient to learn the potential relationship from sunny to adverse weather, as such adjustment ignores the significant domain gaps between instances across different weather conditions. In this paper, we propose a novel object detection method for more robust single-domain generalization. It consists mainly of a frequency-aware selective whitening module (FSW) for removing redundant domain-specific information and a contrastive feature alignment module (CFA) for enhancing domain-invariant information among instances. Specifically, FSW extracts the magnitude spectrum of the feature and uses a group whitening loss to selectively eliminate redundant domain-specific information in the magnitude. To further eliminate domain differences among instances, we apply a style transfer method for data augmentation and use the augmented data in the CFA module. CFA organizes both the original and the augmented RoI features into a series of groups by category and applies contrastive learning across them to facilitate the learning of DIR in various categories. Experiments show that our method achieves favorable performance on existing standard benchmarks.
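The abstract gives no formulas, so the following is only a rough sketch of the two ideas it names, not the authors' implementation. It assumes a 1-D signal per channel (real detectors use 2-D feature maps), a plain off-diagonal covariance penalty in place of the paper's group whitening loss, and a generic supervised contrastive loss in place of CFA. All function names (`magnitude_spectrum`, `whitening_loss`, `supervised_contrastive_loss`) and the `temperature` parameter are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Magnitude of the 1-D discrete Fourier transform of a real sequence."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


def whitening_loss(channels):
    """Mean absolute off-diagonal covariance between channels.

    `channels` is a list of equal-length sequences, one per channel.
    The value is zero when all channel pairs are uncorrelated.
    (Assumption: a simplified stand-in for a group whitening loss.)
    """
    c, n = len(channels), len(channels[0])
    centered = [[v - sum(ch) / n for v in ch] for ch in channels]
    total = 0.0
    for i in range(c):
        for j in range(c):
            if i != j:
                cov = sum(a * b for a, b in zip(centered[i], centered[j])) / n
                total += abs(cov)
    return total / (c * (c - 1))


def _normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Pull together features that share a label, push apart the rest.

    `features` mixes original and augmented RoI vectors; `labels` holds
    their categories. Anchors without any positive are skipped.
    """
    feats = [_normalize(f) for f in features]
    n = len(feats)
    losses = []
    for i in range(n):
        sims = {
            j: sum(a * b for a, b in zip(feats[i], feats[j])) / temperature
            for j in range(n) if j != i
        }
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue
        log_denom = math.log(sum(math.exp(s) for s in sims.values()))
        losses.append(-sum(sims[j] - log_denom for j in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0


# Toy usage: whiten the magnitude spectra of two channels, then align
# original and style-augmented RoI features of two categories.
spectra = [magnitude_spectrum([1.0, 2.0, 3.0, 4.0]),
           magnitude_spectrum([4.0, 3.0, 2.0, 1.0])]
fsw_like = whitening_loss(spectra)
roi_original = [[1.0, 0.0], [0.0, 1.0]]
roi_augmented = [[0.9, 0.1], [0.1, 0.9]]
cfa_like = supervised_contrastive_loss(roi_original + roi_augmented, [0, 1, 0, 1])
```

In this toy setup the original and augmented RoI features of the same category act as positive pairs, which mirrors the abstract's description of grouping features by category. The whitening term is applied to magnitudes only, leaving phase untouched.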
About the journal:
The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.