Neither Private Nor Fair: Impact of Data Imbalance on Utility and Fairness in Differential Privacy

Tom Farrand, FatemehSadat Mireshghallah, Sahib Singh, Andrew Trask

Proceedings of the 2020 Workshop on Privacy-Preserving Machine Learning in Practice, September 10, 2020. DOI: https://doi.org/10.1145/3411501.3419419
Deployment of deep learning across fields and industries is growing rapidly, driven by its performance, which in turn relies on the availability of data and compute. Data is often crowd-sourced and contains sensitive information about its contributors, and this information can leak into models trained on it. To achieve rigorous privacy guarantees, differentially private training mechanisms are used. However, it has recently been shown that differential privacy can exacerbate existing biases in the data and have disparate impacts on the accuracy of different subgroups of the data. In this paper, we aim to study these effects within differentially private deep learning. Specifically, we study how different levels of imbalance in the data affect the accuracy and fairness of the model's decisions, given different levels of privacy. We demonstrate that even small imbalances and loose privacy guarantees can cause disparate impacts.
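To make the setup concrete, the following is a minimal, self-contained sketch (assuming PyTorch) of DP-SGD-style training, i.e. per-example gradient clipping plus Gaussian noise, with accuracy reported per subgroup to surface disparate impact. The synthetic imbalanced data, the linear model, and all hyperparameter values are illustrative assumptions, not the authors' experimental setup.

```python
# Illustrative sketch only: DP-SGD-style training (per-example clipping + Gaussian noise)
# on a synthetic imbalanced dataset, with per-subgroup accuracy to show disparate impact.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic imbalanced binary task: 90% of samples from group 0, 10% from group 1 (assumed split).
n_major, n_minor, dim = 900, 100, 20
X = torch.randn(n_major + n_minor, dim)
group = torch.cat([torch.zeros(n_major), torch.ones(n_minor)]).long()
y = (X[:, 0] + 0.5 * group.float() + 0.1 * torch.randn(len(group)) > 0).long()

model = nn.Linear(dim, 2)
loss_fn = nn.CrossEntropyLoss()
clip_norm, noise_multiplier, lr, batch_size = 1.0, 1.1, 0.1, 50  # toy values

for epoch in range(5):
    perm = torch.randperm(len(y))
    for start in range(0, len(y), batch_size):
        idx = perm[start:start + batch_size].tolist()
        # Compute each example's gradient, clip it to clip_norm, and accumulate the sum.
        summed = [torch.zeros_like(p) for p in model.parameters()]
        for i in idx:
            model.zero_grad()
            loss_fn(model(X[i:i + 1]), y[i:i + 1]).backward()
            grads = [p.grad.detach().clone() for p in model.parameters()]
            total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
            scale = torch.clamp(clip_norm / (total_norm + 1e-6), max=1.0)
            for s, g in zip(summed, grads):
                s += g * scale
        # Add Gaussian noise calibrated to the clipping norm, average, and take a step.
        with torch.no_grad():
            for p, s in zip(model.parameters(), summed):
                noise = torch.normal(0.0, noise_multiplier * clip_norm, size=s.shape)
                p -= lr * (s + noise) / len(idx)

# Per-subgroup accuracy: imbalance combined with clipping and noise typically
# costs the minority group more accuracy than the majority group.
with torch.no_grad():
    preds = model(X).argmax(dim=1)
    for g in (0, 1):
        mask = group == g
        print(f"group {g}: accuracy {(preds[mask] == y[mask]).float().mean():.3f}")
```

Rerunning such a sketch while varying the group ratio and the noise multiplier (i.e. the privacy level) is one simple way to observe the interaction between imbalance and privacy that the paper studies.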