Vartika Sengar, S. VivekB., Gaurab Bhattacharya, J. Gubbi, Arpan Pal, P. Balamuralidhar
Low-level Bias discovery and Mitigation for Image Classification
2022 IEEE International Conference on Signal Processing and Communications (SPCOM)
Published: 2022-07-11
DOI: 10.1109/SPCOM55316.2022.9840811 (https://doi.org/10.1109/SPCOM55316.2022.9840811)
Citations: 1
Abstract
Identifying and mitigating bias in a classifier is a fundamental sanity check for trustworthy AI systems. Many bias-mitigation methods in the literature assume that the bias is known a priori. In this work, we propose a system that can both detect low-level bias (e.g., color, texture) and mitigate it. To help identify the bias, we build a novel auto-encoder architecture that explains the predictions made by a deep neural network. The auto-encoder is trained to produce a generalized representation of the input image by decomposing it into a set of latent embeddings. These embeddings are learned by specializing groups of higher-dimensional feature maps to capture disentangled color and shape concepts. The shape embeddings are trained to reconstruct the discrete wavelet transform components of an image, and the color embeddings are trained to capture its color information. Feature specialization is achieved by reconstructing the RGB image from the shape embeddings modulated by the color embeddings. We show that these representations can be used to detect low-level bias in a classification task. After detecting the bias, we de-bias the classifier by training it with counterfactual images generated by manipulating the representations learned by the auto-encoder. Our proposed method of bias discovery and mitigation achieves state-of-the-art results on ColorMNIST and the newly proposed BiasedShape dataset.
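As a rough illustration of two ideas in the abstract, the sketch below computes discrete wavelet transform components of a grayscale image and builds a color counterfactual by modulating a shape map with a color vector. It is not the authors' architecture. The Haar wavelet, the function names `haar_dwt2` and `recolor`, and the pixel-space multiplicative modulation are all assumptions made for illustration; the abstract specifies neither the wavelet nor how the learned embeddings are combined.

```python
import numpy as np


def haar_dwt2(gray):
    """Single-level 2D Haar DWT of a grayscale image with even height and width.

    Assumption: the Haar wavelet is used only as the simplest DWT example.
    Returns four sub-bands, each half the input size in both dimensions.
    """
    x = np.asarray(gray, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right pixel
    c = x[1::2, 0::2]  # bottom-left pixel
    d = x[1::2, 1::2]  # bottom-right pixel
    approx = (a + b + c + d) / 2.0      # coarse, low-frequency content
    row_detail = (a + b - c - d) / 2.0  # top row minus bottom row
    col_detail = (a - b + c - d) / 2.0  # left column minus right column
    diag_detail = (a - b - c + d) / 2.0 # diagonal difference
    return approx, row_detail, col_detail, diag_detail


def recolor(shape_map, color):
    """Modulate a single-channel shape map by an RGB color vector.

    Hypothetical pixel-space stand-in for 'shape embeddings modulated by
    color embeddings': each pixel intensity is multiplied by the color.
    """
    shape_map = np.asarray(shape_map, dtype=np.float64)
    color = np.asarray(color, dtype=np.float64).reshape(1, 1, 3)
    return shape_map[..., None] * color  # result has shape (H, W, 3)


if __name__ == "__main__":
    # A toy 4x4 "digit" mask: the same shape rendered in two different colors.
    mask = np.array([[0, 1, 1, 0],
                     [0, 1, 1, 0],
                     [0, 1, 1, 0],
                     [0, 1, 1, 0]], dtype=np.float64)
    bands = haar_dwt2(mask)
    original = recolor(mask, [1.0, 0.0, 0.0])        # red rendering
    counterfactual = recolor(mask, [0.0, 0.0, 1.0])  # same shape, blue
    print([band.shape for band in bands], original.shape, counterfactual.shape)
```

In a setting like ColorMNIST, where color is spuriously correlated with the label, pairs such as `original` and `counterfactual` keep the shape fixed while changing only the color. That is the kind of training example the counterfactual de-biasing step relies on.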