An enhanced domain generalization method for object detection based on text guided feature disentanglement

IF 2.9 · Zone 3 (Engineering & Technology) · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

Digital Signal Processing Pub Date : 2024-11-07 DOI:10.1016/j.dsp.2024.104855

Meng Wang, Yudong Liu, Haipeng Liu

Volume 156, Article 104855 · https://www.sciencedirect.com/science/article/pii/S1051200424004809

Citations: 0

An enhanced domain generalization method for object detection based on text guided feature disentanglement

The scenes in which object detection models are deployed change constantly with the alternation of day and night and with the weather. Detectors often suffer from a scarcity of training data for the domains they may encounter. Recently, this challenge, known as domain shift, has been mitigated by single domain generalization (SDG). To generalize further to multiple unseen domains, this paper proposes a detector that uses text semantic gaps to enhance scene diversity and uses feature disentangling to extract domain-invariant features from different scenes, thereby improving detection accuracy. First, random semantic augmentation (RSA) leverages the text modality to capture semantically generalized representations, augmenting the diversity of domain-related information. Second, feature disentangling (FD) branches broaden the decision boundary between domain-invariant and domain-specific features, improving the detector's ability to distinguish objects from background. Additionally, cross-modality alignment (CMA) is performed by estimating the relevance between domain-specific features and textual domain prompts. Experimental results show that the proposed detector performs strongly against existing baselines under diverse weather conditions, such as rainy, foggy, and night-rainy scenes, which also confirms its enhanced generalization ability on multiple unseen domains.
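The abstract gives no formulas, so the following is only a rough sketch of how two of the named steps might look in code. It assumes that a "text semantic gap" can be modeled as the difference between a target-domain prompt embedding and a source-domain prompt embedding, and that "relevance" can be modeled as a softmax over cosine similarities. The function names, the additive shift, the temperature value, and the toy vectors are all hypothetical and are not taken from the paper; the feature disentangling branches are not shown.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def semantic_augment(feature, source_text, target_texts, rng, max_scale=1.0):
    """Hypothetical RSA-style step (an assumption, not the paper's formula):
    shift a feature along a randomly chosen text gap, target minus source."""
    target = rng.choice(target_texts)
    scale = rng.uniform(0.0, max_scale)
    return [f + scale * (t - s) for f, s, t in zip(feature, source_text, target)]


def prompt_relevance(feature, prompt_embeddings, temperature=0.07):
    """Hypothetical CMA-style step (an assumption, not the paper's formula):
    softmax over cosine similarities between a feature and each domain prompt."""
    logits = [cosine(feature, p) / temperature for p in prompt_embeddings]
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Toy embeddings standing in for text-encoder outputs (illustrative values only).
rng = random.Random(0)
source_text = [1.0, 0.0, 0.0]                      # e.g. a "daytime clear" prompt
target_texts = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # e.g. "rainy" and "foggy" prompts
feature = [0.5, 0.2, 0.1]

augmented = semantic_augment(feature, source_text, target_texts, rng)
relevance = prompt_relevance(augmented, target_texts)
```

Here `relevance` holds one weight per domain prompt and the weights sum to 1. In a real detector the vectors would come from an image backbone and a text encoder rather than hand-written lists.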

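Source journal

Digital Signal Processing (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 5.30
Self-citation rate: 17.20%
Articles published: 435
Review time: 66 days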

About the journal: Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top-quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal. The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology,
• financial time series analysis,
• autonomous vehicles,
• quantum computing,
• neuromorphic engineering,
• human-computer interaction and intelligent user interfaces,
• environmental signal processing,
• geophysical signal processing including seismic signal processing,
• chemioinformatics and bioinformatics,
• audio, visual and performance arts,
• disaster management and prevention,
• renewable energy,