How Does Gender Balance In Training Data Affect Face Recognition Accuracy?

2020 IEEE International Joint Conference on Biometrics (IJCB). Pub Date: 2020-02-07. DOI: 10.1109/IJCB48548.2020.9304924

Vítor Albiero, Kai Zhang, K. Bowyer

Citations: 43

Abstract

Deep learning methods have greatly increased the accuracy of face recognition, but an old problem persists: accuracy is usually higher for men than for women. It is often speculated that the lower accuracy for women is caused by their under-representation in the training data. This work investigates whether female under-representation in the training data is truly the cause of lower accuracy for females on test data. Using a state-of-the-art deep CNN, three different loss functions, and two training datasets, we train each combination on seven subsets with different male/female ratios, for a total of forty-two trained models, each tested on three different datasets. Results show that (1) gender balance in the training data does not translate into gender balance in the test accuracy, (2) the “gender gap” in test accuracy is minimized not by a gender-balanced training set but by a training set with more male images than female images, and (3) training to minimize the accuracy gap does not result in the highest female, male, or average accuracy.
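The count of forty-two trainings follows from the design in the abstract: three loss functions times two training datasets times seven male/female subsets. The sketch below enumerates that grid and defines one possible reading of the "gender gap". The abstract does not name the losses, the datasets, or the ratios, so every label here is a hypothetical placeholder, and defining the gap as male accuracy minus female accuracy is an assumption.

```python
from itertools import product

# Placeholder labels only: the abstract gives the counts, not the names.
LOSSES = ["loss_1", "loss_2", "loss_3"]          # three loss functions
TRAIN_SETS = ["train_set_1", "train_set_2"]      # two training datasets
SUBSETS = [f"subset_{i}" for i in range(1, 8)]   # seven male/female ratios


def training_grid():
    """Return every (loss, training set, subset) combination."""
    return list(product(LOSSES, TRAIN_SETS, SUBSETS))


def gender_gap(male_accuracy, female_accuracy):
    """Assumed definition: male accuracy minus female accuracy."""
    return male_accuracy - female_accuracy


grid = training_grid()
print(len(grid))  # 42
```

Each of the forty-two entries corresponds to one trained model, which the abstract says is then evaluated on three test datasets.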
