Pixel is All You Need: Adversarial Spatio-Temporal Ensemble Active Learning for Salient Object Detection.

IEEE Transactions on Pattern Analysis and Machine Intelligence. Pub Date: 2024-10-09. DOI: 10.1109/TPAMI.2024.3476683

Zhenyu Wu, Wei Wang, Lin Wang, Yacong Li, Fengmao Lv, Qing Xia, Chenglizhao Chen, Aimin Hao, Shuo Li


Abstract

Although weakly-supervised techniques can reduce the labeling effort, it is unclear whether a saliency model trained with weakly-supervised data (e.g., point annotation) can match the performance of its fully-supervised version. This paper attempts to answer this unexplored question by proving a hypothesis: there exists a point-labeled dataset on which a trained saliency model can achieve performance equivalent to that of the same model trained on the densely annotated dataset. To prove this conjecture, we propose a novel yet effective adversarial spatio-temporal ensemble active learning approach. Our contributions are fourfold: 1) Our proposed adversarial attack triggering uncertainty overcomes the overconfidence of existing active learning methods and accurately locates uncertain pixels. 2) Our proposed spatio-temporal ensemble strategy not only achieves outstanding performance but also significantly reduces the model's computational cost. 3) Our proposed relationship-aware diversity sampling overcomes oversampling while boosting model performance. 4) We provide theoretical proof for the existence of such a point-labeled dataset. Experimental results show that our approach can find such a point-labeled dataset, on which a trained saliency model obtains 98%-99% of the performance of its fully-supervised version with only ten annotated points per image. The code is available at https://github.com/wuzhenyubuaa/ASTE-AL.
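To make the idea of adversarially triggered uncertainty followed by diversity-aware point selection more concrete, the following is a minimal sketch and not the authors' implementation (for that, see the repository linked above). Everything in it is an assumption made for illustration: `toy_saliency` is a hypothetical per-pixel logistic stand-in for a saliency network, the perturbation is a single FGSM-style signed-gradient step, uncertainty is measured as the change in prediction under that step, and diversity is approximated by a simple minimum-distance rule rather than the paper's relationship-aware sampling.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def toy_saliency(image, w=8.0, b=0.5):
    """Hypothetical stand-in model: per-pixel logistic on intensity."""
    return sigmoid(w * (image - b))

def adversarial_uncertainty(image, eps=0.05, w=8.0, b=0.5):
    """Score each pixel by how much its prediction moves under a small
    signed-gradient (FGSM-style) perturbation of the input."""
    p = toy_saliency(image, w, b)
    pseudo = (p > 0.5).astype(float)      # model's own hard prediction
    grad = (p - pseudo) * w               # d(BCE)/d(image) for the toy model
    adv = np.clip(image + eps * np.sign(grad), 0.0, 1.0)
    return np.abs(toy_saliency(adv, w, b) - p)

def select_points(score, k=10, min_dist=3):
    """Greedily pick the k highest-scoring pixels that are at least
    min_dist apart (an assumed, simplistic diversity rule)."""
    n_cols = score.shape[1]
    order = np.argsort(score, axis=None)[::-1]   # flat indices, best first
    chosen = []
    for flat in order:
        r, c = divmod(int(flat), n_cols)
        if all((r - pr) ** 2 + (c - pc) ** 2 >= min_dist ** 2
               for pr, pc in chosen):
            chosen.append((r, c))
            if len(chosen) == k:
                break
    return chosen

rng = np.random.default_rng(0)
image = rng.random((32, 32))              # synthetic image, values in [0, 1)
score = adversarial_uncertainty(image)
points = select_points(score, k=10, min_dist=3)
print(points)                             # ten (row, col) pixels to annotate
```

In this toy setting, pixels whose intensity lies near the decision boundary change most under the perturbation and are therefore proposed for annotation first, while the distance rule keeps the ten selected points from clustering in one region.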

