Kongming Liang, Zijin Yin, Min Min, Yan Liu, Zhanyu Ma, Jun Guo
{"title":"学习动态原型以消除视觉图案杂质","authors":"Kongming Liang, Zijin Yin, Min Min, Yan Liu, Zhanyu Ma, Jun Guo","doi":"10.1007/s11263-023-01956-x","DOIUrl":null,"url":null,"abstract":"<p>Deep learning has achieved great success in academic benchmarks but fails to work effectively in the real world due to the potential dataset bias. The current learning methods are prone to inheriting or even amplifying the bias present in a training dataset and under-represent specific demographic groups. More recently, some dataset debiasing methods have been developed to address the above challenges based on the awareness of protected or sensitive attribute labels. However, the number of protected or sensitive attributes may be considerably large, making it laborious and costly to acquire sufficient manual annotation. To this end, we propose a prototype-based network to dynamically balance the learning of different subgroups for a given dataset. First, an object pattern embedding mechanism is presented to make the network focus on the foreground region. Then we design a prototype learning method to discover and extract the visual patterns from the training data in an unsupervised way. The number of prototypes is dynamic depending on the pattern structure of the feature space. We evaluate the proposed prototype-based network on three widely used polyp segmentation datasets with abundant qualitative and quantitative experiments. Experimental results show that our proposed method outperforms the CNN-based and transformer-based state-of-the-art methods in terms of both effectiveness and fairness metrics. Moreover, extensive ablation studies are conducted to show the effectiveness of each proposed component and various parameter values. Lastly, we analyze how the number of prototypes grows during the training process and visualize the associated subgroups for each learned prototype. The code and data will be released at https://github.com/zijinY/dynamic-prototype-debiasing.</p>","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"48 1","pages":""},"PeriodicalIF":11.6000,"publicationDate":"2023-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Learning Dynamic Prototypes for Visual Pattern Debiasing\",\"authors\":\"Kongming Liang, Zijin Yin, Min Min, Yan Liu, Zhanyu Ma, Jun Guo\",\"doi\":\"10.1007/s11263-023-01956-x\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Deep learning has achieved great success in academic benchmarks but fails to work effectively in the real world due to the potential dataset bias. The current learning methods are prone to inheriting or even amplifying the bias present in a training dataset and under-represent specific demographic groups. More recently, some dataset debiasing methods have been developed to address the above challenges based on the awareness of protected or sensitive attribute labels. However, the number of protected or sensitive attributes may be considerably large, making it laborious and costly to acquire sufficient manual annotation. To this end, we propose a prototype-based network to dynamically balance the learning of different subgroups for a given dataset. First, an object pattern embedding mechanism is presented to make the network focus on the foreground region. Then we design a prototype learning method to discover and extract the visual patterns from the training data in an unsupervised way. The number of prototypes is dynamic depending on the pattern structure of the feature space. We evaluate the proposed prototype-based network on three widely used polyp segmentation datasets with abundant qualitative and quantitative experiments. Experimental results show that our proposed method outperforms the CNN-based and transformer-based state-of-the-art methods in terms of both effectiveness and fairness metrics. Moreover, extensive ablation studies are conducted to show the effectiveness of each proposed component and various parameter values. Lastly, we analyze how the number of prototypes grows during the training process and visualize the associated subgroups for each learned prototype. The code and data will be released at https://github.com/zijinY/dynamic-prototype-debiasing.</p>\",\"PeriodicalId\":13752,\"journal\":{\"name\":\"International Journal of Computer Vision\",\"volume\":\"48 1\",\"pages\":\"\"},\"PeriodicalIF\":11.6000,\"publicationDate\":\"2023-12-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Computer Vision\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s11263-023-01956-x\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Computer Vision","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11263-023-01956-x","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Learning Dynamic Prototypes for Visual Pattern Debiasing
Deep learning has achieved great success in academic benchmarks but fails to work effectively in the real world due to the potential dataset bias. The current learning methods are prone to inheriting or even amplifying the bias present in a training dataset and under-represent specific demographic groups. More recently, some dataset debiasing methods have been developed to address the above challenges based on the awareness of protected or sensitive attribute labels. However, the number of protected or sensitive attributes may be considerably large, making it laborious and costly to acquire sufficient manual annotation. To this end, we propose a prototype-based network to dynamically balance the learning of different subgroups for a given dataset. First, an object pattern embedding mechanism is presented to make the network focus on the foreground region. Then we design a prototype learning method to discover and extract the visual patterns from the training data in an unsupervised way. The number of prototypes is dynamic depending on the pattern structure of the feature space. We evaluate the proposed prototype-based network on three widely used polyp segmentation datasets with abundant qualitative and quantitative experiments. Experimental results show that our proposed method outperforms the CNN-based and transformer-based state-of-the-art methods in terms of both effectiveness and fairness metrics. Moreover, extensive ablation studies are conducted to show the effectiveness of each proposed component and various parameter values. Lastly, we analyze how the number of prototypes grows during the training process and visualize the associated subgroups for each learned prototype. The code and data will be released at https://github.com/zijinY/dynamic-prototype-debiasing.
期刊介绍:
The International Journal of Computer Vision (IJCV) serves as a platform for sharing new research findings in the rapidly growing field of computer vision. It publishes 12 issues annually and presents high-quality, original contributions to the science and engineering of computer vision. The journal encompasses various types of articles to cater to different research outputs.
Regular articles, which span up to 25 journal pages, focus on significant technical advancements that are of broad interest to the field. These articles showcase substantial progress in computer vision.
Short articles, limited to 10 pages, offer a swift publication path for novel research outcomes. They provide a quicker means for sharing new findings with the computer vision community.
Survey articles, comprising up to 30 pages, offer critical evaluations of the current state of the art in computer vision or offer tutorial presentations of relevant topics. These articles provide comprehensive and insightful overviews of specific subject areas.
In addition to technical articles, the journal also includes book reviews, position papers, and editorials by prominent scientific figures. These contributions serve to complement the technical content and provide valuable perspectives.
The journal encourages authors to include supplementary material online, such as images, video sequences, data sets, and software. This additional material enhances the understanding and reproducibility of the published research.
Overall, the International Journal of Computer Vision is a comprehensive publication that caters to researchers in this rapidly growing field. It covers a range of article types, offers additional online resources, and facilitates the dissemination of impactful research.