Title: Fair Decision Making via Automated Repair of Decision Trees
Authors: Jiang Zhang, Ivan Beschastnikh, Sergey Mechtaev, Abhik Roychoudhury
DOI: https://doi.org/10.1145/3524491.3527306

Abstract: Data-driven decision-making allows more resource allocation tasks to be done by programs. Unfortunately, real-life training datasets may capture human biases, and the learned models can be unfair. To resolve this, one could either train a new, fair model from scratch or repair an existing unfair model. The former approach can introduce unbounded semantic difference and is therefore unsuitable for social or legislative decisions. Meanwhile, the scalability of state-of-the-art model repair techniques is unsatisfactory. In this paper, we aim to automatically repair unfair decision models by converting any decision tree or random forest into a fair one with respect to a specific dataset and sensitive attributes. We built the FairRepair tool, inspired by automated program repair techniques for traditional programs. It uses a MaxSMT solver to decide which paths in the decision tree could be flipped or refined, with both fairness and semantic difference as hard constraints. Our approach is sound and complete, and the output repair always satisfies the desired fairness and semantic difference requirements. FairRepair is able to repair an unfair decision tree on the well-known COMPAS dataset [2] in 1 minute on average, achieving 90.3% fairness and only 2.3% semantic difference. We compared FairRepair with 4 state-of-the-art fairness learning algorithms [10, 13, 16, 18]; while they achieve similar fairness by training new models, they incur 8.9% to 13.5% semantic difference. These results show that FairRepair is capable of repairing an unfair model while maintaining accuracy and incurring only a small semantic difference.

CCS Concepts: • Computing methodologies → Philosophical/theoretical foundations of artificial intelligence; • Social and professional topics → Race and ethnicity.
Title: Making Recruitment More Inclusive: Unfairness Monitoring With A Job Matching Machine-Learning Algorithm
Authors: Sebastien Delecraz, Loukman Eltarr, Martin Becuwe, Henri Bouxin, Nicolas Boutin, Olivier Oullier
DOI: https://doi.org/10.1145/3524491.3527309

Abstract: For decades, human resources management has relied on recruitment processes rooted in self-reports (such as surveys, questionnaires, and personality and cognitive tests) and interviews, most of which lacked scientific rigor and replicability. Here, we introduce an algorithm that matches job offers and workers, one that not only outperforms classic recruitment and job-matching methods but also embeds algorithmic safeguards to prevent, as much as possible, unfairness and discrimination. Our approach to algorithm development is guided by the constant goal of offering a solution at the cutting edge of technology while remaining as fair, inclusive and transparent as possible.
{"title":"Detecting Obstacles to Collaboration in an Online Participatory Democracy Platform: A Use-case Driven Analysis","authors":"William Aboucaya, Rafael Angarita, V. Issarny","doi":"10.1145/3524491.3527307","DOIUrl":"https://doi.org/10.1145/3524491.3527307","url":null,"abstract":"Massive online participatory platforms are an essential tool for involving citizens in public decision-making on a large scale, both in terms of the number of participating citizens and their geographical distribution. However, engaging a sufficiently large number of citizens, as well as collecting adequate contributions, require special attention in the functionalities implemented by the platform. This paper empirically analyzes the existing flaws in participatory platforms and their impact on citizen participation. We focus specifically on the citizen consultation “République Numérique” (Digital Republic) to identify issues arising from the interactions between users on the supporting platform. We chose this consultation because of the high number of contributors and contributions, and the various means of interaction it proposes. Through an analysis of the available data, we highlight that contributions tend to be concentrated around a small set of proposals and contributors. This leads us to formulate a number of recommendations for the design of participatory platforms regarding the management of contributions, from their organization to their presentation to users. CCS CONCEPTS • Applied computing → E-government; • Human-centered computing → Empirical studies in interaction design; Collaborative and social computing systems and tools; • General and reference → Empirical studies. ACM Reference Format: William Aboucaya, Rafael Angarita, and Valérie Issarny. 2022. Detecting Obstacles to Collaboration in an Online Participatory Democracy Platform: A Use-case Driven Analysis. In International Workshop on Equitable Data and Technology (FairWare ’22), May 9, 2022, Pittsburgh, PA, USA. ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/3524491.3527307","PeriodicalId":287874,"journal":{"name":"2022 IEEE/ACM International Workshop on Equitable Data & Technology (FairWare)","volume":"181 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116415668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Privileged and Unprivileged Groups: An Empirical Study on the Impact of the Age Attribute on Fairness","authors":"Max Hort, Federica Sarro","doi":"10.1145/3524491.3527308","DOIUrl":"https://doi.org/10.1145/3524491.3527308","url":null,"abstract":"Recent advances in software fairness investigate bias in the treatment of different population groups, which are devised based on attributes such as gender, race and age. Groups are divided into privileged groups (favourable treatment) and unprivileged groups (unfavourable treatment). To truthfully represent the real world and to measure the degree of bias according to age (young vs. old), one needs to pick a threshold to separate those groups. In this study we investigate two popular datasets (i.e., German and Bank) and the bias observed when using every possible age threshold in order to divide the population into “young” and “old” groups, in combination with three different Machine Learning models (i.e., Logistic Regression, Decision Tree, Support Vector Machine). Our results show that age thresholds do not only impact the intensity of bias in these datasets, but also the direction (i.e., which population group receives a favourable outcome). For the two investigated datasets, we present a selection of suitable age thresholds. We also found strong and very strong correlations between the dataset bias and the respective bias of trained classification models, in 83% of the cases studied. CCS CONCEPTS • Social and professional topics → User characteristics; • General and reference → Empirical studies. ACM Reference Format: Max Hort and Federica Sarro. 2022. Privileged and Unprivileged Groups: An Empirical Study on the Impact of the Age Attribute on Fairness. In International Workshop on Equirable Data and Technology (FairWare ’22), May 9, 2022, Pittsburgh, PA, USA. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/3524491.3527308","PeriodicalId":287874,"journal":{"name":"2022 IEEE/ACM International Workshop on Equitable Data & Technology (FairWare)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129527487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fair-SSL: Building fair ML Software with less data","authors":"Joymallya Chakraborty, Suvodeep Majumder, Huy Tu","doi":"10.1145/3524491.3527305","DOIUrl":"https://doi.org/10.1145/3524491.3527305","url":null,"abstract":"Ethical bias in machine learning models has become a matter of concern in the software engineering community. Most of the prior software engineering works concentrated on finding ethical bias in models rather than fixing it. After finding bias, the next step is mitigation. Prior researchers mainly tried to use supervised approaches to achieve fairness. However, in the real world, getting data with trustworthy ground truth is challenging and also ground truth can contain human bias. Semi-supervised learning is a technique where, incrementally, labeled data is used to generate pseudo-labels for the rest of data (and then all that data is used for model training). In this work, we apply four popular semi-supervised techniques as pseudo-labelers to create fair classification models. Our framework, Fair-SSL, takes a very small amount (10%) of labeled data as input and generates pseudo-labels for the unlabeled data. We then synthetically generate new data points to balance the training data based on class and protected attribute as proposed by Chakraborty et al. in FSE 2021. Finally, classification model is trained on the balanced pseudo-labeled data and validated on test data. After experimenting on ten datasets and three learners, we find that Fair-SSL achieves similar performance as three state-of-the-art bias mitigation algorithms. That said, the clear advantage of Fair-SSL is that it requires only 10% of the labeled training data. To the best of our knowledge, this is the first SE work where semi-supervised techniques are used to fight against ethical bias in SE ML models. To facilitate open science and replication, all our source code and datasets are publicly available at https://github.com/joymallyac/FairSSL. CCS CONCEPTS • Software and its engineering → Software creation and management; • Computing methodologies → Machine learning. ACM Reference Format: Joymallya Chakraborty, Suvodeep Majumder, and Huy Tu. 2022. Fair-SSL: Building fair ML Software with less data. In International Workshop on Equitable Data and Technology (FairWare ‘22), May 9, 2022, Pittsburgh, PA, USA. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/3524491.3527305","PeriodicalId":287874,"journal":{"name":"2022 IEEE/ACM International Workshop on Equitable Data & Technology (FairWare)","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132167958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}