SRCD: Semantic Reasoning With Compound Domains for Single-Domain Generalized Object Detection

IF 8.9 | Zone 1: Computer Science | JCR Q1: Computer Science, Artificial Intelligence

IEEE Transactions on Neural Networks and Learning Systems | Pub Date: 2024-11-05 | DOI: 10.1109/TNNLS.2024.3480120

Zhijie Rao; Jingcai Guo; Luyao Tang; Yue Huang; Xinghao Ding; Song Guo

Vol. 36, No. 7, pp. 12497-12506 | Journal Article | https://ieeexplore.ieee.org/document/10742956/

Citations: 0

Abstract

SRCD: Semantic Reasoning With Compound Domains for Single-Domain Generalized Object Detection

This article proposes a novel framework for single-domain generalized object detection (Single-DGOD) that learns and maintains the semantic structures of self-augmented compound cross-domain samples to enhance the model's generalization ability. Unlike domain generalized object detection (DGOD), which is trained on multiple source domains, Single-DGOD must generalize to multiple target domains from only a single source domain, which is far more challenging. Existing methods mostly follow the DGOD treatment and learn domain-invariant features by decoupling or compressing the semantic space. This treatment has two potential limitations: 1) pseudo attribute-label correlations arise because single-domain data are extremely scarce, and 2) semantic structural information is usually ignored, even though we found that the affinities of instance-level semantic relations in samples are crucial to model generalization. In this article, we introduce semantic reasoning with compound domains (SRCD) for Single-DGOD. SRCD contains two main components: the texture-based self-augmentation (TBSA) module and the local-global semantic reasoning (LGSR) module. TBSA aims to eliminate, at the image level, the effects of irrelevant attributes that are associated with labels, such as light, shadow, and color, through a lightweight yet efficient self-augmentation. LGSR then models the semantic relationships among instance features to uncover and maintain the intrinsic semantic structures. Extensive experiments on multiple benchmarks demonstrate the effectiveness of the proposed SRCD. Code is available at github.com/zjrao/SRCD.
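
The two ideas named in the abstract can be illustrated with a small sketch. This is not the authors' implementation (their code is at the repository cited above), and every name below (`color_jitter`, `cosine_affinity`, `structure_gap`) is hypothetical. The sketch assumes that an image-level augmentation perturbs per-channel color statistics, and that "semantic structure" can be approximated by the pairwise cosine affinities among instance feature vectors.

```python
import math
import random


def color_jitter(image, rng, max_gain=0.3, max_shift=0.1):
    """Randomly rescale and shift each color channel (assumed stand-in for
    an image-level self-augmentation; not the paper's TBSA module).

    image: list of rows, each row a list of (r, g, b) floats in [0, 1].
    """
    gains = [1.0 + rng.uniform(-max_gain, max_gain) for _ in range(3)]
    shifts = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
    return [
        [
            # Apply the per-channel gain and shift, then clamp to [0, 1].
            tuple(min(1.0, max(0.0, g * c + s))
                  for c, g, s in zip(px, gains, shifts))
            for px in row
        ]
        for row in image
    ]


def cosine_affinity(features):
    """Pairwise cosine similarity between instance feature vectors."""
    # A zero vector gets norm 1.0 so the division below stays defined.
    norms = [math.sqrt(sum(x * x for x in f)) or 1.0 for f in features]
    unit = [[x / n for x in f] for f, n in zip(features, norms)]
    return [[sum(a * b for a, b in zip(u, v)) for v in unit] for u in unit]


def structure_gap(feats_a, feats_b):
    """Mean squared difference between two affinity matrices.

    A small value means the relations among instances are preserved
    between the two views (e.g. original and augmented).
    """
    aff_a = cosine_affinity(feats_a)
    aff_b = cosine_affinity(feats_b)
    n = len(aff_a)
    return sum(
        (aff_a[i][j] - aff_b[i][j]) ** 2 for i in range(n) for j in range(n)
    ) / (n * n)
```

In this sketch, a detector would extract instance features from both the original and the augmented image, and `structure_gap` would serve as an auxiliary penalty that keeps the instance-level relations consistent across the two views. How SRCD actually defines its local and global reasoning is described in the paper, not here.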

Source journal

IEEE Transactions on Neural Networks and Learning Systems (categories: Computer Science, Artificial Intelligence; Computer Science, Hardware & Architecture)

CiteScore: 23.80

Self-citation rate: 9.60%

Articles published: 2102

Review time: 3-8 weeks

About the journal: The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.