Zero-Shot Demographically Unbiased Image Generation From an Existing Biased StyleGAN
Authors: Anubhav Jain; Rishit Dholakia; Nasir Memon; Julian Togelius
Journal: IEEE Transactions on Biometrics, Behavior, and Identity Science, vol. 6, no. 4, pp. 498-514
Published: 2024-06-18
DOI: 10.1109/TBIOM.2024.3416403
URL: https://ieeexplore.ieee.org/document/10561535/
Citation count: 0
Abstract
Face recognition systems have made significant strides thanks to data-heavy deep learning models, but these models rely on large, privacy-sensitive datasets. Recent work in facial analysis and recognition has therefore started to use synthetic datasets generated by GANs and diffusion-based generative models. These models, however, lack fairness in demographic representation and can introduce the same biases into downstream tasks trained on their output. This can have serious societal and security implications. To address this issue, we propose a methodology that uses an evolutionary algorithm to generate unbiased data from a biased generative model. We show results for a StyleGAN2 model trained on the Flickr-Faces-HQ (FFHQ) dataset, generating data for single demographic attributes and for combinations of them, such as Black and Woman. We generate a large, racially balanced dataset of 13.5 million images and show that it boosts the performance of facial recognition and analysis systems while reducing their biases. We have made our code base (https://github.com/anubhav1997/youneednodataset) public to allow researchers to reproduce our work.
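The abstract does not describe the evolutionary algorithm in detail, so the following is only a minimal sketch of the general idea: searching a generator's latent space for samples that score highly on a target attribute. It is not the authors' implementation. Everything named here is a hypothetical stand-in: `attribute_score` replaces an attribute classifier applied to a generated image, `direction` is an assumed latent direction correlated with the attribute, and `evolve_latents` is a plain elitist evolutionary loop chosen for illustration.

```python
import math
import random


def attribute_score(latent, direction):
    """Hypothetical stand-in for 'generate an image, then classify an attribute'.

    Returns a value in (0, 1): the logistic function of the dot product
    between the latent vector and an assumed attribute direction.
    """
    dot = sum(z * d for z, d in zip(latent, direction))
    return 1.0 / (1.0 + math.exp(-dot))


def evolve_latents(direction, dim=8, pop_size=32, elite=8,
                   generations=20, sigma=0.3, seed=0):
    """Elitist evolutionary search over latent vectors (illustrative only).

    Each generation keeps the `elite` best latents unchanged and refills the
    population with Gaussian-mutated copies of randomly chosen elites.
    Returns the final population and the mean elite score per generation.
    """
    rng = random.Random(seed)
    # Initial population: latents drawn from a standard normal prior.
    population = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population,
                        key=lambda z: attribute_score(z, direction),
                        reverse=True)
        parents = ranked[:elite]
        history.append(
            sum(attribute_score(z, direction) for z in parents) / elite)
        # Mutation: add small Gaussian noise to a randomly chosen parent.
        children = [[z_i + rng.gauss(0.0, sigma) for z_i in rng.choice(parents)]
                    for _ in range(pop_size - elite)]
        population = parents + children
    return population, history


if __name__ == "__main__":
    toy_direction = [1.0] + [0.0] * 7  # assumed attribute direction
    _, scores = evolve_latents(toy_direction)
    print("mean elite score, first vs. last generation:", scores[0], scores[-1])
```

Because the elites are carried over unchanged, the mean elite score cannot decrease between generations. A realistic setup would replace the toy score with a real generator and attribute classifier, and would likely need a term that keeps latents close to the prior so that image quality does not degrade.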