MLFA: Toward Realistic Test Time Adaptive Object Detection by Multi-Level Feature Alignment

IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society. Pub Date: 2024-10-09. DOI: 10.1109/TIP.2024.3473532

Yabo Liu;Jinghua Wang;Chao Huang;Yiling Wu;Yong Xu;Xiaochun Cao

Volume 33, pages 5837-5848. IEEE Xplore: https://ieeexplore.ieee.org/document/10713112/


Abstract

Object detection methods achieve remarkable performance when the training and testing data are independent and identically distributed (i.i.d.). However, the training and testing data may be collected from different domains, and the gap between those domains can significantly degrade detector performance. Test Time Adaptive Object Detection (TTA-OD) is a novel online approach that aims to adapt detectors quickly and make predictions during the testing procedure. TTA-OD is more realistic than existing unsupervised domain adaptation and source-free unsupervised domain adaptation approaches. For example, a self-driving car needs to improve its perception of new environments while it is driving, which is the setting TTA-OD targets. To address this, we propose a multi-level feature alignment (MLFA) method for TTA-OD that adapts the model online using streaming target domain data. For a more straightforward adaptation, we select informative foreground and background features from image feature maps and capture their distributions using probabilistic models. Our approach includes: i) global-level feature alignment, which aligns all informative feature distributions and thereby encourages detectors to extract domain-invariant features, and ii) cluster-level feature alignment, which matches the feature distributions of each category cluster across domains. Through this multi-level alignment, we prompt detectors to extract domain-invariant features and also align the category-specific components of image features from distinct domains. We conduct extensive experiments to verify the effectiveness of the proposed method. Our code is available at https://github.com/yaboliudotug/MLFA
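The two alignment levels described in the abstract can be illustrated with a small sketch. The abstract does not say which probabilistic model or divergence MLFA uses, so everything below is an assumption made for illustration: features are summarized by diagonal Gaussians, the discrepancy is a KL divergence from target to source statistics, and the names `gaussian_stats`, `kl_diag_gaussian`, `alignment_loss`, and `cluster_weight` are hypothetical. It is not the authors' implementation; see their repository for that.

```python
import math
from statistics import fmean, pvariance


def gaussian_stats(features):
    """Per-dimension (mean, variance) of a list of equal-length feature vectors.

    Assumption: a diagonal Gaussian stands in for the paper's
    unspecified "probabilistic model".
    """
    dims = list(zip(*features))
    mean = [fmean(d) for d in dims]
    var = [pvariance(d) + 1e-6 for d in dims]  # small floor avoids division by zero
    return mean, var


def kl_diag_gaussian(p, q):
    """KL(p || q) for diagonal Gaussians given as (mean, variance) pairs."""
    (mean_p, var_p), (mean_q, var_q) = p, q
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


def alignment_loss(source_global, source_clusters, target_feats, target_labels,
                   cluster_weight=1.0):
    """Global-level term plus one cluster-level term per category.

    source_global:   (mean, var) over all source features
    source_clusters: {category: (mean, var)} per source category
    target_labels:   category of each target feature (at test time these
                     would have to be predicted, since no labels exist)
    """
    # Global level: align the distribution of all selected features.
    loss = kl_diag_gaussian(gaussian_stats(target_feats), source_global)
    # Cluster level: align each category's features with the same category in the source.
    for category in set(target_labels):
        members = [f for f, y in zip(target_feats, target_labels) if y == category]
        if category in source_clusters and len(members) > 1:
            loss += cluster_weight * kl_diag_gaussian(
                gaussian_stats(members), source_clusters[category]
            )
    return loss
```

In an actual detector this quantity would be computed on batches of streaming target images with a differentiable framework and minimized online; the pure-Python version here only shows how the global and per-category terms combine.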

