学习动态原型以消除视觉图案杂质

IF 11.6 2区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal of Computer Vision Pub Date : 2023-12-08 DOI:10.1007/s11263-023-01956-x

Kongming Liang, Zijin Yin, Min Min, Yan Liu, Zhanyu Ma, Jun Guo

{"title":"学习动态原型以消除视觉图案杂质","authors":"Kongming Liang, Zijin Yin, Min Min, Yan Liu, Zhanyu Ma, Jun Guo","doi":"10.1007/s11263-023-01956-x","DOIUrl":null,"url":null,"abstract":"<p>Deep learning has achieved great success in academic benchmarks but fails to work effectively in the real world due to the potential dataset bias. The current learning methods are prone to inheriting or even amplifying the bias present in a training dataset and under-represent specific demographic groups. More recently, some dataset debiasing methods have been developed to address the above challenges based on the awareness of protected or sensitive attribute labels. However, the number of protected or sensitive attributes may be considerably large, making it laborious and costly to acquire sufficient manual annotation. To this end, we propose a prototype-based network to dynamically balance the learning of different subgroups for a given dataset. First, an object pattern embedding mechanism is presented to make the network focus on the foreground region. Then we design a prototype learning method to discover and extract the visual patterns from the training data in an unsupervised way. The number of prototypes is dynamic depending on the pattern structure of the feature space. We evaluate the proposed prototype-based network on three widely used polyp segmentation datasets with abundant qualitative and quantitative experiments. Experimental results show that our proposed method outperforms the CNN-based and transformer-based state-of-the-art methods in terms of both effectiveness and fairness metrics. Moreover, extensive ablation studies are conducted to show the effectiveness of each proposed component and various parameter values. Lastly, we analyze how the number of prototypes grows during the training process and visualize the associated subgroups for each learned prototype. The code and data will be released at https://github.com/zijinY/dynamic-prototype-debiasing.</p>","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"48 1","pages":""},"PeriodicalIF":11.6000,"publicationDate":"2023-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Learning Dynamic Prototypes for Visual Pattern Debiasing\",\"authors\":\"Kongming Liang, Zijin Yin, Min Min, Yan Liu, Zhanyu Ma, Jun Guo\",\"doi\":\"10.1007/s11263-023-01956-x\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Deep learning has achieved great success in academic benchmarks but fails to work effectively in the real world due to the potential dataset bias. The current learning methods are prone to inheriting or even amplifying the bias present in a training dataset and under-represent specific demographic groups. More recently, some dataset debiasing methods have been developed to address the above challenges based on the awareness of protected or sensitive attribute labels. However, the number of protected or sensitive attributes may be considerably large, making it laborious and costly to acquire sufficient manual annotation. To this end, we propose a prototype-based network to dynamically balance the learning of different subgroups for a given dataset. First, an object pattern embedding mechanism is presented to make the network focus on the foreground region. Then we design a prototype learning method to discover and extract the visual patterns from the training data in an unsupervised way. The number of prototypes is dynamic depending on the pattern structure of the feature space. We evaluate the proposed prototype-based network on three widely used polyp segmentation datasets with abundant qualitative and quantitative experiments. Experimental results show that our proposed method outperforms the CNN-based and transformer-based state-of-the-art methods in terms of both effectiveness and fairness metrics. Moreover, extensive ablation studies are conducted to show the effectiveness of each proposed component and various parameter values. Lastly, we analyze how the number of prototypes grows during the training process and visualize the associated subgroups for each learned prototype. The code and data will be released at https://github.com/zijinY/dynamic-prototype-debiasing.</p>\",\"PeriodicalId\":13752,\"journal\":{\"name\":\"International Journal of Computer Vision\",\"volume\":\"48 1\",\"pages\":\"\"},\"PeriodicalIF\":11.6000,\"publicationDate\":\"2023-12-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Computer Vision\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s11263-023-01956-x\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Computer Vision","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11263-023-01956-x","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

深度学习在学术基准方面取得了巨大成功，但由于潜在的数据集偏差，在现实世界中却无法有效发挥作用。当前的学习方法很容易继承甚至放大训练数据集中存在的偏差，对特定人群的代表性不足。最近，基于对受保护或敏感属性标签的认识，人们开发了一些数据集去偏方法来应对上述挑战。然而，受保护或敏感属性的数量可能相当大，因此要获得足够的人工标注既费力又昂贵。为此，我们提出了一种基于原型的网络，以动态平衡给定数据集不同子群的学习。首先，我们提出了一种对象模式嵌入机制，使网络专注于前景区域。然后，我们设计了一种原型学习方法，以无监督的方式从训练数据中发现和提取视觉模式。原型的数量是动态的，取决于特征空间的模式结构。我们在三个广泛使用的息肉分割数据集上对所提出的基于原型的网络进行了评估，并进行了丰富的定性和定量实验。实验结果表明，我们提出的方法在有效性和公平性指标上都优于基于 CNN 和基于变换器的最先进方法。此外，我们还进行了大量的消融研究，以显示每个建议组件和不同参数值的有效性。最后，我们分析了原型数量在训练过程中的增长情况，并对每个学习原型的相关子群进行了可视化。代码和数据将在 https://github.com/zijinY/dynamic-prototype-debiasing 上发布。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

Learning Dynamic Prototypes for Visual Pattern Debiasing

查看原文本刊更多论文

Learning Dynamic Prototypes for Visual Pattern Debiasing

Deep learning has achieved great success in academic benchmarks but fails to work effectively in the real world due to the potential dataset bias. The current learning methods are prone to inheriting or even amplifying the bias present in a training dataset and under-represent specific demographic groups. More recently, some dataset debiasing methods have been developed to address the above challenges based on the awareness of protected or sensitive attribute labels. However, the number of protected or sensitive attributes may be considerably large, making it laborious and costly to acquire sufficient manual annotation. To this end, we propose a prototype-based network to dynamically balance the learning of different subgroups for a given dataset. First, an object pattern embedding mechanism is presented to make the network focus on the foreground region. Then we design a prototype learning method to discover and extract the visual patterns from the training data in an unsupervised way. The number of prototypes is dynamic depending on the pattern structure of the feature space. We evaluate the proposed prototype-based network on three widely used polyp segmentation datasets with abundant qualitative and quantitative experiments. Experimental results show that our proposed method outperforms the CNN-based and transformer-based state-of-the-art methods in terms of both effectiveness and fairness metrics. Moreover, extensive ablation studies are conducted to show the effectiveness of each proposed component and various parameter values. Lastly, we analyze how the number of prototypes grows during the training process and visualize the associated subgroups for each learned prototype. The code and data will be released at https://github.com/zijinY/dynamic-prototype-debiasing.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

International Journal of Computer Vision 工程技术-计算机：人工智能

CiteScore

29.80

自引率

2.10%

发文量

163

审稿时长

6 months

期刊介绍： The International Journal of Computer Vision (IJCV) serves as a platform for sharing new research findings in the rapidly growing field of computer vision. It publishes 12 issues annually and presents high-quality, original contributions to the science and engineering of computer vision. The journal encompasses various types of articles to cater to different research outputs. Regular articles, which span up to 25 journal pages, focus on significant technical advancements that are of broad interest to the field. These articles showcase substantial progress in computer vision. Short articles, limited to 10 pages, offer a swift publication path for novel research outcomes. They provide a quicker means for sharing new findings with the computer vision community. Survey articles, comprising up to 30 pages, offer critical evaluations of the current state of the art in computer vision or offer tutorial presentations of relevant topics. These articles provide comprehensive and insightful overviews of specific subject areas. In addition to technical articles, the journal also includes book reviews, position papers, and editorials by prominent scientific figures. These contributions serve to complement the technical content and provide valuable perspectives. The journal encourages authors to include supplementary material online, such as images, video sequences, data sets, and software. This additional material enhances the understanding and reproducibility of the published research. Overall, the International Journal of Computer Vision is a comprehensive publication that caters to researchers in this rapidly growing field. It covers a range of article types, offers additional online resources, and facilitates the dissemination of impactful research.