{"title":"用于无监督突出物体检测的对比学习框架","authors":"Huankang Guan;Jiaying Lin;Rynson W. H. Lau","doi":"10.1109/TIP.2025.3558674","DOIUrl":null,"url":null,"abstract":"Existing unsupervised salient object detection (USOD) methods usually rely on low-level saliency priors, such as center and background priors, to detect salient objects, resulting in insufficient high-level semantic understanding. These low-level priors can be fragile and lead to failure when the natural images do not satisfy the prior assumptions, e.g., these methods may fail to detect those off-center salient objects causing fragmented objects in the segmentation. To address these problems, we propose to eliminate the dependency on flimsy low-level priors, and extract high-level saliency from natural images through a contrastive learning framework. To this end, we propose a Contrastive Saliency Network (CSNet), which is a prior-free and label-free saliency detector, with two novel modules: 1) a Contrastive Saliency Extraction (CSE) module to extract high-level saliency cues, by mimicking the human attention mechanism within an instance discriminative task through a contrastive learning framework, and 2) a Feature Re-Coordinate (FRC) module to recover spatial details, by calibrating high-level features with low-level features in an unsupervised fashion. In addition, we introduce a novel local appearance triplet (LAT) loss to assist the training process by encouraging similar saliency scores for regions with homogeneous appearances. Extensive experiments show that our approach is effective and outperforms state-of-the-art methods on popular SOD benchmarks.","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"34 ","pages":"2487-2498"},"PeriodicalIF":13.7000,"publicationDate":"2025-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Contrastive-Learning Framework for Unsupervised Salient Object Detection\",\"authors\":\"Huankang Guan;Jiaying Lin;Rynson W. H. Lau\",\"doi\":\"10.1109/TIP.2025.3558674\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Existing unsupervised salient object detection (USOD) methods usually rely on low-level saliency priors, such as center and background priors, to detect salient objects, resulting in insufficient high-level semantic understanding. These low-level priors can be fragile and lead to failure when the natural images do not satisfy the prior assumptions, e.g., these methods may fail to detect those off-center salient objects causing fragmented objects in the segmentation. To address these problems, we propose to eliminate the dependency on flimsy low-level priors, and extract high-level saliency from natural images through a contrastive learning framework. To this end, we propose a Contrastive Saliency Network (CSNet), which is a prior-free and label-free saliency detector, with two novel modules: 1) a Contrastive Saliency Extraction (CSE) module to extract high-level saliency cues, by mimicking the human attention mechanism within an instance discriminative task through a contrastive learning framework, and 2) a Feature Re-Coordinate (FRC) module to recover spatial details, by calibrating high-level features with low-level features in an unsupervised fashion. 
In addition, we introduce a novel local appearance triplet (LAT) loss to assist the training process by encouraging similar saliency scores for regions with homogeneous appearances. Extensive experiments show that our approach is effective and outperforms state-of-the-art methods on popular SOD benchmarks.\",\"PeriodicalId\":94032,\"journal\":{\"name\":\"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society\",\"volume\":\"34 \",\"pages\":\"2487-2498\"},\"PeriodicalIF\":13.7000,\"publicationDate\":\"2025-04-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10964591/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10964591/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Existing unsupervised salient object detection (USOD) methods usually rely on low-level saliency priors, such as center and background priors, to detect salient objects, and therefore lack high-level semantic understanding. These low-level priors can be fragile and lead to failures when natural images do not satisfy the prior assumptions; for example, such methods may miss off-center salient objects or produce fragmented segmentations. To address these problems, we propose to eliminate the dependency on flimsy low-level priors and to extract high-level saliency from natural images through a contrastive learning framework. To this end, we propose a Contrastive Saliency Network (CSNet), a prior-free and label-free saliency detector with two novel modules: 1) a Contrastive Saliency Extraction (CSE) module that extracts high-level saliency cues by mimicking the human attention mechanism within an instance discrimination task under a contrastive learning framework, and 2) a Feature Re-Coordinate (FRC) module that recovers spatial details by calibrating high-level features with low-level features in an unsupervised fashion. In addition, we introduce a novel local appearance triplet (LAT) loss to assist training by encouraging similar saliency scores for regions with homogeneous appearances. Extensive experiments show that our approach is effective and outperforms state-of-the-art methods on popular SOD benchmarks.
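The abstract describes the local appearance triplet (LAT) loss only at a high level. The sketch below is one possible, hedged interpretation in PyTorch, not the authors' implementation: anchor pixels are sampled, a nearby pixel with similar color is treated as the positive and a dissimilar one as the negative, and a triplet margin pushes the predicted saliency of the anchor toward the positive's score. The function name, the sampling scheme, the use of raw RGB color as the appearance cue, and the margin value are all assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def local_appearance_triplet_loss(saliency, image, margin=0.5, radius=5, n_samples=1024):
    """Illustrative sketch of a local appearance triplet loss (assumption, not the paper's code).

    saliency: (B, 1, H, W) predicted saliency map in [0, 1]
    image:    (B, 3, H, W) RGB image used as a low-level appearance cue
    """
    B, _, H, W = saliency.shape
    device = saliency.device

    # Randomly sample anchor coordinates away from the image border.
    ys = torch.randint(radius, H - radius, (B, n_samples), device=device)
    xs = torch.randint(radius, W - radius, (B, n_samples), device=device)

    # Two random offsets inside a local window give two candidate neighbours.
    dy1 = torch.randint(-radius, radius + 1, (B, n_samples), device=device)
    dx1 = torch.randint(-radius, radius + 1, (B, n_samples), device=device)
    dy2 = torch.randint(-radius, radius + 1, (B, n_samples), device=device)
    dx2 = torch.randint(-radius, radius + 1, (B, n_samples), device=device)

    def gather(t, y, x):
        # t: (B, C, H, W) -> (B, n_samples, C) values sampled at pixel (y, x)
        idx = (y * W + x).unsqueeze(1).expand(-1, t.shape[1], -1)
        return t.flatten(2).gather(2, idx).permute(0, 2, 1)

    col_a = gather(image, ys, xs)
    col_1 = gather(image, ys + dy1, xs + dx1)
    col_2 = gather(image, ys + dy2, xs + dx2)
    sal_a = gather(saliency, ys, xs).squeeze(-1)
    sal_1 = gather(saliency, ys + dy1, xs + dx1).squeeze(-1)
    sal_2 = gather(saliency, ys + dy2, xs + dx2).squeeze(-1)

    # The neighbour whose colour is closer to the anchor is taken as the positive.
    d1 = (col_a - col_1).norm(dim=-1)
    d2 = (col_a - col_2).norm(dim=-1)
    pos_is_1 = (d1 < d2).float()
    sal_pos = pos_is_1 * sal_1 + (1 - pos_is_1) * sal_2
    sal_neg = pos_is_1 * sal_2 + (1 - pos_is_1) * sal_1

    # Triplet margin on saliency scores: homogeneous appearance -> similar saliency.
    return F.relu((sal_a - sal_pos).abs() - (sal_a - sal_neg).abs() + margin).mean()
```

Under these assumptions the loss is fully differentiable with respect to the predicted saliency map and requires no labels, so it could be added directly to an unsupervised training objective; the choice of window radius and margin would need tuning.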